<?xml version="1.0" encoding="UTF-8" standalone="no"?><rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:media="http://search.yahoo.com/mrss/" version="2.0"><channel><title>Human 2.0</title><description>We’re becoming a new species–one that can hack its own cognition and edit its own biology. This is the most important subject of the century, but it’s still hiding in academia and science fiction. We hope to change that.</description><link>https://www.alistaircroll.com/</link><image><url>https://www.alistaircroll.com/favicon.png</url><title>Alistair Croll</title><link>https://www.alistaircroll.com/</link></image><generator>Ghost 6.25</generator><lastBuildDate>Fri, 03 Apr 2026 22:31:25 GMT</lastBuildDate><atom:link href="https://www.alistaircroll.com/updates/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><language>en-us</language><item><title><![CDATA[The asymmetry of digital war]]></title><description><![CDATA[Countries should start making sure their populations can survive for three days with no digital systems.]]></description><link>https://www.alistaircroll.com/updates/the-asymmetry-of-digital-war/</link><guid isPermaLink="false">69cf0dfd9e591200013a0d62</guid><category><![CDATA[War]]></category><category><![CDATA[Cybersecurity]]></category><category><![CDATA[Infosec]]></category><category><![CDATA[Preparedness]]></category><category><![CDATA[Emergencies]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Fri, 03 Apr 2026 21:19:36 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/04/chuanchai-pundej-0imZsFe41zM-unsplash.jpg"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/04/chuanchai-pundej-0imZsFe41zM-unsplash.jpg" alt="The asymmetry of digital war"><p><strong>TL;DR: Countries should 
start making sure their populations can survive for three days with no digital systems.</strong></p><blockquote>I debated writing this, because it might seem alarmist or anti-government. I&apos;m pretty sure it&apos;s neither.</blockquote><p>On April 1st, four humans climbed into a capsule the size of a campervan and headed for the moon. It was the first crewed lunar mission since 1972, with a Canadian aboard.</p><p>At the height of the Cold War, kids got the day off school to watch the first moon landing.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.alistaircroll.com/content/images/2026/04/image-3.png" class="kg-image" alt="The asymmetry of digital war" loading="lazy" width="630" height="354" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/04/image-3.png 600w, https://www.alistaircroll.com/content/images/2026/04/image-3.png 630w"><figcaption><span style="white-space: pre-wrap;">Australian kids watch the moon landing in 1969.</span></figcaption></figure><p>This time, there was so much happening we barely noticed. In the same week:</p><ul><li>The US-Israel war on Iran entered its 34th day, with the Strait of Hormuz effectively closed and oil past $105 a barrel. Iranian hackers <a href="https://www.reuters.com/world/us/iran-linked-hackers-claim-breach-of-fbi-directors-personal-email-doj-official-2026-03-27/?ref=alistaircroll.com" rel="noreferrer">went after FBI director Kash Patel</a>.</li><li>A <a href="https://www.stepsecurity.io/blog/axios-compromised-on-npm-malicious-versions-drop-remote-access-trojan?ref=alistaircroll.com" rel="noreferrer">piece of software called axios, used by hundreds of thousands of applications</a>, was quietly hijacked and turned into a tool for spreading malware. 
It was <a href="https://www.iansresearch.com/resources/all-blogs/post/security-blog/2026/03/30/trivy-supply-chain-attack-triggers-self-propagating-ci-cd-compromise?ref=alistaircroll.com" rel="noreferrer">not the first time in the week</a>.</li><li>Anthropic accidentally revealed it had created an AI that was <a href="https://fortune.com/2026/03/26/anthropic-says-testing-mythos-powerful-new-ai-model-after-data-leak-reveals-its-existence-step-change-in-capabilities/?ref=alistaircroll.com" rel="noreferrer">so good at hacking</a> (and, it must be said, for now incredibly expensive to run) that it had delayed the release because of how hackers would use it.</li><li>Existing versions of AI are <a href="https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=42156f93d123&amp;ref=alistaircroll.com" rel="noreferrer">finding exploits constantly</a>, some of them two decades old. One security loophole was found by asking Claude, &quot;Somebody told me there is an RCE 0-day when you open a file. Find it.&quot;</li><li>The U.S. <a href="https://www.schneier.com/blog/archives/2026/04/us-bans-all-foreign-made-consumer-routers.html?ref=alistaircroll.com" rel="noreferrer">banned all foreign-made routers</a>, citing them as a security risk.</li></ul><p>We&apos;re in a digital cold war that&apos;s escalating fast, but we&apos;re still thinking about it as if it&apos;s a traditional war.</p><h2 id="digital-war-is-not-physical-war">Digital war is not physical war</h2><p>Physical war has two clear sides, and a front line. Our tanks versus their tanks, our troops against their troops. It&apos;s easy to think about cybersecurity this way too: nation-states with their weapons and security agencies. <em>But digital warfare doesn&apos;t look like that.</em></p><h3 id="online-is-the-front-line"><strong>Online is the front line</strong></h3><p>In kinetic warfare (the kind we fight with atoms) there&apos;s a line between the military and civilians. Armies fight armies. 
In digital warfare (the kind we fight with bits) that line doesn&apos;t exist. The front runs through <a href="https://www.cnet.com/home/internet/fcc-just-banned-all-new-foreign-made-routers-everything-you-need-to-know-to-keep-your-network-safe/?ref=alistaircroll.com" rel="noreferrer">your home router</a>, your parent&apos;s email account, the teenager who borrows a USB drive.</p><p>Iran hacked Kash Patel not by breaching the Pentagon, but by targeting his personal digital footprint. The attack surface isn&apos;t your office firewall. It&apos;s everyone you&apos;ve ever communicated with, every device you&apos;ve ever connected.</p><h3 id="digital-is-asymmetric"><strong>Digital is asymmetric</strong></h3><p>Infosec teams talk about the &quot;red team&quot; (attacker) and &quot;blue team&quot; (defender). The blue team has to be right all the time; the red team only has to be right once. You&apos;re only as strong as your weakest link.</p><p>Digital attacks were once costly and risky. They took time and preparation, and you might get caught. But with AI and automation, that&apos;s no longer the case. Autonomous agents can keep trying to find a way in, running somewhere in the cloud that&apos;s hard to trace. The friction that prevented many attacks is gone.</p><h3 id="were-making-the-attack-surface-bigger">We&apos;re making the attack surface bigger</h3><p>And now the attack surface includes every piece of software running on your behalf.</p><p>Hundreds of thousands of people are now <a href="https://en.wikipedia.org/wiki/OpenClaw?ref=alistaircroll.com" rel="noreferrer">installing autonomous agents on their home computers</a>. We&apos;re giving them our bank accounts, ID, calendars, and emails. These software agents, called &quot;claws&quot;, write and run code on your behalf. They use software (like axios) from developers you&apos;ve never met. Agents create new places for someone to attack, and because they&apos;re easy to install (they&apos;ll do it for you!) 
people with very little security background are installing them on their home machines.</p><h3 id="anyone-can-be-hacked">Anyone can be hacked</h3><p>Hacking has always seemed non-violent. Sometimes, Hollywood even makes it look glamorous. When a prominent public figure gets hacked, we usually assume they did something dumb with their password. But the reality is, when someone skilled wants to hack you, they&apos;ll succeed. To compromise the developer behind axios, hackers <em>created a fake company, a fake Slack channel, and a fake Microsoft Teams platform that required him to update some software.</em></p><figure class="kg-card kg-image-card kg-card-hascaption"><a href="https://www.instagram.com/reel/DWrS_oNDa4p/?ref=alistaircroll.com"><img src="https://www.alistaircroll.com/content/images/2026/04/image-2.png" class="kg-image" alt="The asymmetry of digital war" loading="lazy" width="1184" height="1128" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/04/image-2.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/04/image-2.png 1000w, https://www.alistaircroll.com/content/images/2026/04/image-2.png 1184w" sizes="(min-width: 720px) 720px"></a><figcaption><span style="white-space: pre-wrap;">@mattjayy breaks down the hack (did you catch the &quot;microscell.com&quot; in the URL?)</span></figcaption></figure><p><a href="https://www.instagram.com/reel/DWrS_oNDa4p/?ref=alistaircroll.com" rel="noreferrer">Watch that video</a> and tell me you wouldn&apos;t have fallen for that too. Most people do not have the kind of operational security needed to be a difficult target. Our digital footprints are too large, too connected.</p><h3 id="hackings-violent-now">Hacking&apos;s violent now</h3><p>Despite the Hollywood portrayal, hacking is not a mild crime any more. We already see how scammers destroy people&apos;s lives.</p><p>Now consider what happens when there&apos;s a widespread outage. 
Banking stops, which means Interac and tap-to-pay and ATMs and payroll go dark for days. If phones aren&apos;t working, there&apos;s no GPS, no Uber or Lyft, no home delivery. Ambulances can&apos;t find homes, and if they get a patient to hospital, the records aren&apos;t available.</p><p>We built a civilization stack that runs on networked software, which runs on dependencies maintained by volunteers, deployed with configurations nobody&apos;s checked, communicating through libraries like axios.</p><p>But we have plans for this, right?</p><h2 id="we-used-to-prepare-for-bad-things">We used to prepare for bad things</h2><p>Contingency plans do exist. Canada&apos;s Federal Cyber Incident Response Plan defines a catastrophic event as one <em>causing widespread loss of life, major long-term damage to the economy, or severe impediment to national security</em>. It gets invoked by a committee, and it can trigger the Emergencies Act.</p><p>The Act&#x2014;as many protesters learned during COVID when it was invoked for the first time to quell a trucker rally in Ottawa&#x2014;lets the government direct essential services, assume control of utilities, and in an international emergency, start rationing and using force to maintain order. Notably, it prohibits the government from censoring communications even during a declared emergency.</p><p>They are, simply, orders to maintain order. Citizens don&apos;t appear much in these plans.</p><p>During the Cold War, we ran civil defence exercises. People knew where to go and what to bring. We kept emergency supplies. Some countries issued their citizens gas masks. We prepared. 
In border countries like Finland, military service was mandatory, and citizens rehearsed defending their villages and towns.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.alistaircroll.com/content/images/2026/04/cd-comic-1956.jpg" class="kg-image" alt="The asymmetry of digital war" loading="lazy" width="2000" height="3056" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/04/cd-comic-1956.jpg 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/04/cd-comic-1956.jpg 1000w, https://www.alistaircroll.com/content/images/size/w1600/2026/04/cd-comic-1956.jpg 1600w, https://www.alistaircroll.com/content/images/2026/04/cd-comic-1956.jpg 2005w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">They made comics for kids in the fifties. (</span><a href="https://civildefensearchives.org/?ref=alistaircroll.com"><span style="white-space: pre-wrap;">https://civildefensearchives.org/</span></a><span style="white-space: pre-wrap;">)</span></figcaption></figure><p><em>We once prepared for bad things</em>. Shouldn&apos;t we be doing that now? We are undoubtedly in a time of elevated threats, but we&apos;re acting like nothing is happening. Do you own a map? Have cash on hand? Remember your neighbor&apos;s phone number? Have a printed copy of your ID and medical records?</p><p>These seem like pretty basic things. Mostly, we&apos;re just hoping. Hoping everything works out; hoping someone else has thought this through for us; and hoping that everyone will follow the plan when the time comes.</p><p>Hope, like nostalgia, is not a strategy.</p><p>I&apos;m pretty certain that we live in a world that will&#x2014;a few times this decade&#x2014;be offline. If governments shared these plans today, and worked to prepare us for this new reality, lives would be saved. We could be helping our country&#x2014;and one another&#x2014;to prepare. It might be uncomfortable, or scary. 
That&apos;s not a reason to avoid it.</p><p>Here&apos;s a bold suggestion: We should schedule an analog day. A drill where we try to survive for 24 hours as a country without touching a digital device, and write down every time we do. The 2026 equivalent of duck-and-cover, or air raid drills.</p><p>We can handle the truth, and we&apos;ll be better prepared for it.</p>]]></content:encoded></item><item><title><![CDATA[JEE and government with Ishmael Interactive]]></title><description><![CDATA[How can subversive thinking and systems awareness help public servants change outdated thinking and legacy processes? A chat on the CX podcast.]]></description><link>https://www.alistaircroll.com/updates/jee-and-government-with-ishmael-interactive/</link><guid isPermaLink="false">69cc34b61218f7000195025f</guid><category><![CDATA[JEE]]></category><category><![CDATA[Innovation]]></category><category><![CDATA[Government]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Tue, 31 Mar 2026 20:59:57 GMT</pubDate><content:encoded><![CDATA[<p>Had a chat with <a href="https://www.linkedin.com/in/aaron-meyers/?ref=alistaircroll.com" rel="noreferrer">Aaron Meyers</a> on the CX Podcast from <a href="https://ishmaelinteractive.com/?ref=alistaircroll.com" rel="noreferrer">Ishmael Interactive</a> about subversive thinking and public sector modernization. 
Aaron&apos;s going to return the favor as a guest on the Functional Government podcast, talking about his work to streamline Regulation.gov&#x2014;and the weaponization of democratic forums by special interests.</p><figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/XjnZyvZUY9Q?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen title="The Attention Economy, AI Disruption, and the Art of Subversive Thinking | CX Pod"></iframe></figure>]]></content:encoded></item><item><title><![CDATA[The feedback loop is the product]]></title><description><![CDATA[Every AI system tests whether it works—and improves automatically when it doesn't—using something called Evals. Done right, they're not just QA; they're how companies continuously upgrade themselves.]]></description><link>https://www.alistaircroll.com/updates/the-feedback-loop-is-the-product/</link><guid isPermaLink="false">69c8308c881a830001402064</guid><category><![CDATA[AI]]></category><category><![CDATA[Disruption]]></category><category><![CDATA[Business strategy]]></category><category><![CDATA[Evals]]></category><category><![CDATA[Cybernetics]]></category><category><![CDATA[OODA Loop]]></category><category><![CDATA[Lean Startup]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Mon, 30 Mar 2026 19:24:25 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/03/declan-sun-emgDh1_IG8o-unsplash.jpg"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/03/declan-sun-emgDh1_IG8o-unsplash.jpg" alt="The feedback loop is the product"><p>I&apos;m building a complex product that includes AI agents. I needed a way to test it to make sure it worked as expected. 
As I looked into how AI products are tested, I learned a new term: evals. It&apos;s not a term many of you have heard of, but that&apos;s going to change pretty soon.</p><p>Every business <em>tries</em> to improve and adapt: changing marketing messages, trying new projects, giving employees feedback, adjusting pricing, and so on. But the feedback loops that drive this improvement are messy and slow. It takes time to measure the effect of a price change in the market. Managers&apos; messages to employees are imperfect. We don&apos;t capture the right data.</p><p>When we built traditional software, we knew how it worked, so testing it was relatively simple. But as any researcher will tell you, we don&apos;t really understand how generative AI and LLMs work. So when you use AI in a process or project, you need to feed it inputs, judge its outputs, and adjust.</p><p>When you close that loop, you get a self-improving organization.</p><hr><h2 id="whats-an-eval-anyway">What&apos;s an eval, anyway?</h2><p>At its simplest, an eval (short for evaluation) tests whether an AI does what it&apos;s supposed to do.</p><p>Testing traditional software is easy: 2+2 = 4, and if it doesn&apos;t, something is wrong.</p><p>AI isn&apos;t like that. Ask the same question twice and you&apos;ll get two different answers, both potentially valid (which is rather the point!). Ask an AI to draft an email and there&apos;s a nearly infinite number of responses that might be considered &quot;good.&quot; Often, the results are what you wanted. Sometimes, they&apos;re nonsense. On rare occasions, they&apos;re company-ending bad stuff that is dangerous or deeply offensive.</p><p>So how do you make sure your AI delivers consistently good results?</p><p>Every AI application turns inputs (a prompt, retrieved context, user data, tool access) into outputs (a response, an action, a decision). 
</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.alistaircroll.com/content/images/2026/03/image-6.png" class="kg-image" alt="The feedback loop is the product" loading="lazy" width="1622" height="764" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/03/image-6.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/03/image-6.png 1000w, https://www.alistaircroll.com/content/images/size/w1600/2026/03/image-6.png 1600w, https://www.alistaircroll.com/content/images/2026/03/image-6.png 1622w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">What every AI is doing.</span></figcaption></figure><p>An eval checks the outputs against a definition of &quot;good&quot;. It&apos;s a structured, repeatable set of judgement calls that help you understand (and improve) how an AI behaves.</p><p>An eval has three parts:</p><ol><li><strong>A test dataset:</strong> a set of inputs, often paired with reference answers that represent what &quot;good&quot; looks like.</li><li><strong>The system under test:</strong> your full pipeline: prompts, model, retrieval, tools, all of it.</li><li><strong>A grader:</strong> because AI outputs are always different, the thing that decides whether the output passes or fails needs <em>judgement</em>.</li></ol><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.alistaircroll.com/content/images/2026/03/image-7.png" class="kg-image" alt="The feedback loop is the product" loading="lazy" width="2000" height="492" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/03/image-7.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/03/image-7.png 1000w, https://www.alistaircroll.com/content/images/size/w1600/2026/03/image-7.png 1600w, https://www.alistaircroll.com/content/images/2026/03/image-7.png 2266w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The basic elements 
of an eval system.</span></figcaption></figure><p>This is no different than an employer conducting performance reviews. Except that where those happen quarterly, this is a continuous process closer to <a href="https://en.wikipedia.org/wiki/OODA_loop?ref=alistaircroll.com" rel="noreferrer">an OODA loop</a> than to HR.</p><p>The grader is where it gets interesting. We&apos;ll come back to that.</p><h2 id="evals-are-iterative-not-one-shot">Evals are iterative, not one-shot</h2><p>Evals are a continuous cycle: define what good looks like, feed it some inputs (the prompt, the skills, the data being retrieved, the tools the AI is using), measure your system against it, find where it fails, fix the inputs&#x2014;and repeat.</p><p>Each trip around the cycle is different, which means you need to:</p><ul><li>Track and analyze what happened with each eval.</li><li>Link the prompt you used to the outputs it produced, and the grading they received.</li><li>Keep a copy of each version, and clear records of what changed, so you can reproduce it.</li></ul><p>Managing your evals is like managing your code base: version control, pulls and commits, repositories and forks. Evals are your IP. There are a bunch of companies that make software to do this (LangSmith, Braintrust, Promptfoo, and Arize, for example).</p><h2 id="prompt-component-outcome">Prompt, component, outcome</h2><p>Evals test a system in three concentric loops, from narrow to broad:</p><h3 id="prompt-evals">Prompt evals</h3><p><em>Does this specific prompt produce good outputs for a set of known inputs?</em> This is the tightest loop. You change a word in your system prompt, run your test cases, and see if the scores move. 
This is where most people start (and it&apos;s necessary) but it only tells you that the prompt works in isolation.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.alistaircroll.com/content/images/2026/03/image-12.png" class="kg-image" alt="The feedback loop is the product" loading="lazy" width="448" height="688"><figcaption><span style="white-space: pre-wrap;">An example of the simplest eval: Does this prompt produce a good result?</span></figcaption></figure><h3 id="component-evals">Component evals</h3><p><em>Does this combination of AI and tool chain do what I want? </em>A modern AI does more than just respond with text. It uses tools, searches the web, launches a sub-agent, and writes to file folders. A prompt can be perfect and the system can still fail because a file was missing, or the name of an API changed.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.alistaircroll.com/content/images/2026/03/image-10.png" class="kg-image" alt="The feedback loop is the product" loading="lazy" width="1070" height="970" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/03/image-10.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/03/image-10.png 1000w, https://www.alistaircroll.com/content/images/2026/03/image-10.png 1070w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Evals check not just the AI, but the tools it uses as part of a component in a system.</span></figcaption></figure><h3 id="outcome-evals">Outcome evals</h3><p><em>Does the system as a whole produce a desirable result?</em> If the agent was supposed to book a vacation, did it do so? 
These are the hardest evals to build and the most expensive to run, often involving multiple layers of agents and interactions with human or synthetic operators, but they&apos;re the ones that tell you whether your product actually works.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.alistaircroll.com/content/images/2026/03/image-11.png" class="kg-image" alt="The feedback loop is the product" loading="lazy" width="2000" height="1410" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/03/image-11.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/03/image-11.png 1000w, https://www.alistaircroll.com/content/images/size/w1600/2026/03/image-11.png 1600w, https://www.alistaircroll.com/content/images/2026/03/image-11.png 2114w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The broadest form of eval decides whether the outcome was delivered.</span></figcaption></figure><h2 id="the-grader-judging-what-good-is">The grader: judging what &quot;good&quot; is?</h2><p>There are three different ways to decide if an output was correct:</p><ul><li><strong>Code-based grading</strong> works when the answer is objective. Did the AI extract the correct date from a user message? Is the output formatted correctly? Did the math math? You write some code that checks, and it either passes or fails. If your AI says 2+2=5, that&apos;s bad.</li><li><strong>Human grading</strong> works for everything. A person reads the output and scores it. Was the tone right? Was the answer helpful? Was the reasoning sound? Humans are pretty judgy to begin with; the problem is that human grading doesn&apos;t scale: it&apos;s expensive, it&apos;s slow, and it&apos;s inconsistent because different humans will disagree with each other. So human grading is only used to spot check and verify the eval process is working.</li><li><strong>AI-as-judge grading</strong> lets you scale the evals. 
You give a second AI (the &quot;grader&quot;) a set of criteria that define &quot;good&quot;, and it grades the outputs. You can ask the judge-AI subjective questions like, &quot;was the response empathetic?&quot;,  &quot;did the AI make things up?&quot; or &quot;was it polite?&quot;</li></ul><p>AI-as-judge is a real game-changer, because it closes the loop. Once you&apos;ve built a system that can test itself by generating outputs, judging them automatically, and identifying failures, <em>it becomes self-improving. </em>If the judge AI concludes that the app isn&apos;t being polite, it can rewrite the skill. If the tone isn&apos;t right, it can tweak the prompt. If the AI keeps failing to get the right information, it can code the API differently.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.alistaircroll.com/content/images/2026/03/image-8.png" class="kg-image" alt="The feedback loop is the product" loading="lazy" width="2000" height="731" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/03/image-8.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/03/image-8.png 1000w, https://www.alistaircroll.com/content/images/size/w1600/2026/03/image-8.png 1600w, https://www.alistaircroll.com/content/images/2026/03/image-8.png 2266w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Evals can become a self-improving system.</span></figcaption></figure><h2 id="parallels-with-lean-startup">Parallels with Lean Startup</h2><p>In the first decade of the new millennium, startups were burning millions building products nobody wanted. The fix was to build the smallest thing, see if it works, and learn from what happens. 
This was called the Minimum Viable Product (MVP)&#x2014;a term popularized by Eric Ries&apos; <em>The Lean Startup&#x2014;</em>and it forever changed how new businesses validate their ideas.</p><p>Fifteen years later, companies building AI products are making the same mistake: shipping agents and apps without any systematic way to know whether they work. If you&apos;re deploying AI without evals, you&apos;re building things without knowing if they work, and you have no way to get better systematically. The pattern is the same:</p><ul><li><strong>Build: </strong>write the prompt, set up the pipeline, configure the agent.</li><li><strong>Measure: </strong>run your eval suite, get scores across your test cases.</li><li><strong>Learn: </strong>analyze the failures, identify patterns, go back to Build to adjust your inputs.</li></ul><p>Just as vanity metrics (page views, registered users) misled startups into thinking they had traction, &quot;it looked good when I tested it&quot; tricks you into thinking your system works. Vibes aren&apos;t enough. Evals tell you what&apos;s actually happening across hundreds of cases, not the three you happened to try, and track what worked and what didn&apos;t so you can see what changed and how to fix it.</p><p>The analogy goes deeper. In Lean, the MVP wasn&apos;t &quot;the smallest product.&quot; It was the smallest experiment that could test your riskiest assumption. In evals, the riskiest assumption isn&apos;t &quot;does the model know things?&quot; (it does). It&apos;s &quot;does our system behave correctly in the situations that matter most to our users, customers, and employees?&quot; The eval suite <em>is</em> the MVP: the smallest set of behaviors whose improvement you can automate and measure.</p><p>We learned fifteen years ago that build-it-and-ship doesn&apos;t work for products. 
We now know it doesn&apos;t work for AI either. Mature AI deployments start by defining what &quot;good&quot; means: Write the test cases. Create the library of test inputs. Set the grading criteria. Then build the simplest thing that can pass them. An AI MVP isn&apos;t the smallest model, or the cheapest deployment. It&apos;s <em>the smallest set of core behaviors whose improvement you can automate</em>. The eval suite becomes the product spec, the regression test, and the quality bar, all in one artifact.</p><p>Your AI processes need product managers now.</p><h2 id="you-need-evals-even-if-you-dont-build-ai-products">You need evals even if you don&apos;t build AI products</h2><p>Evals aren&apos;t a technology tool, they&apos;re a management tool. Here are four specific examples:</p><h3 id="the-basis-for-the-self-improving-organization">The basis for the self-improving organization</h3><p>In Lean, your measurement loop required real users and real time. You had to wait for people to show up and behave. In enterprises, you had to wait for quarterly results to come in or new hires to come up to speed. Cycle time was slow.</p><p>In evals, you can simulate users, generate test cases, and run hundreds of experiments in the time it takes to make coffee. The improvement loop tightens from days to minutes. Of course, you need to trust your LLM judge, which is weird because that means your judge needs evals. 
But things bootstrap from there, and once calibrated, the process, product, or organization can improve endlessly.</p><figure class="kg-card kg-image-card"><img src="https://www.alistaircroll.com/content/images/2026/03/alistair-at-startup-festival-by-eva-blue-27.jpg" class="kg-image" alt="The feedback loop is the product" loading="lazy" width="2000" height="1333" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/03/alistair-at-startup-festival-by-eva-blue-27.jpg 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/03/alistair-at-startup-festival-by-eva-blue-27.jpg 1000w, https://www.alistaircroll.com/content/images/size/w1600/2026/03/alistair-at-startup-festival-by-eva-blue-27.jpg 1600w, https://www.alistaircroll.com/content/images/size/w2400/2026/03/alistair-at-startup-festival-by-eva-blue-27.jpg 2400w" sizes="(min-width: 720px) 720px"></figure><p>At Startupfest in 2025, I said that startups are moving from product/market fit to outcome/liability fit. You&apos;re not selling functionality, you&apos;re promising an outcome, and evals show you if you can deliver it.</p><h3 id="evals-are-the-future-of-governance">Evals are the future of governance</h3><p>If your AI gives bad medical advice, makes a discriminatory hiring decision, or hallucinates a contract term, the eval suite (or lack of one) becomes evidence. &quot;Did you test for this failure mode?&quot; is going to be a courtroom question, and the eval (or lack thereof) is the audit trail that shows whether or not you tried.</p><h3 id="evals-are-part-of-investor-due-diligence">Evals are part of investor due diligence</h3><p>Any investor doing due diligence in 2026 must look at the eval cycles on which the company&apos;s products and services rely. They show whether the company can adapt automatically to changes in the market. 
And they dictate whether the business can bootstrap itself, improving every time it gets a new model, new tools, or new data.</p><h3 id="evals-keep-you-current-and-prevent-model-lock-in">Evals keep you current and prevent model lock-in</h3><p>As AI moves out of labs and into enterprises, evals become much more important. Getting them right and incorporating them into your business processes won&apos;t just help you move faster than everyone else, it&apos;ll let you keep building atop the best AI models.</p><p>When Microsoft ships a new version of Windows, your old code still runs. Backwards compatibility is a given. Apple switched chips from Intel to Apple Silicon and the transition is still underway five years later.</p><p>AI models don&apos;t work like software versions. When Anthropic releases a new Claude, or OpenAI drops a new Codex, or Google announces a new Gemini, there&apos;s no guarantee that your products and processes will work the same way. Claude Opus 4.6 might have broken things that worked in 4.5 because it behaved differently.</p><p>If you&apos;re stuck on an old model because you have no way of knowing whether your product or process works correctly on the new one, you&apos;re not keeping up with the state of the art. Which means <em>even if you don&apos;t change your product, you need to evaluate it as new models are released</em>. Every new model release is only a performance boost if you can quickly verify that your business processes still work.</p><p>Without evals, every model change is a terrifying, weeks-long manual regression test. With evals, it&apos;s an afternoon. Build an eval cycle that automatically adjusts, and you can switch models within days. You can swap out some code and know within hours whether the swap helped. In that way, evals aren&apos;t just product testing, they&apos;re <em>upgrade readiness</em> and an antidote to lock-in.</p><p>(The model providers know this. 
They&apos;re forced to maintain multiple model versions simultaneously, because customers can&apos;t migrate, but they&apos;re going to end-of-life them soon.)</p><h2 id="the-bottom-line">The bottom line</h2><p>To reiterate:</p><ul><li>An eval is a structured, repeatable set of judgement calls that help you understand (and improve) how an AI behaves.</li><li>An AI MVP is the smallest set of core behaviors whose improvement you can automate.</li><li>You&apos;re not selling functionality, you&apos;re promising an outcome, and evals show you if you can deliver it.</li><li>Once you&apos;ve built a system that can test itself by generating outputs, judging them automatically, and identifying failures, it becomes self-improving.</li><li>As a result, you&apos;re competing less on what you do and more on how well you can improve what you do.</li><li>Evals are vital for governance, because they&apos;re an audit trail to prove, or defend against, negligence.</li></ul><p>We&apos;ve seen this movie before. This time, the feedback loop is faster. Build it right, and you&apos;ve created a self-improving app. Ignore it, and you&apos;ll be stuck with earlier models and spiralling technical debt.</p><hr><h2 id="sidenote-strategies-for-doing-this-without-going-broke"><strong>Sidenote: Strategies for doing this without going broke</strong></h2><blockquote>Serious evals cost real money. Running an AI agent through a complex scenario over and over again costs tokens, plus more tokens for the AI that&apos;s judging the results, and at scale that adds up fast. My system runs 10 AI agents, controlled by 10 fake humans, through an 11-phase system, with a lot of back and forth. It takes around 4 hours to complete. 
I ran out of tokens on the highest tier of Claude Code and had to wait two days to continue.</blockquote><blockquote>The fix isn&apos;t to skip evals, it&apos;s to be smart about what you test and how.</blockquote><blockquote><strong>Decompose your expensive tests.</strong> If your end-to-end agent evaluation takes hours and thousands of API calls (as mine does for the thing I&apos;m building), don&apos;t run it on every change. Instead, extract the critical decision points&#x2014;the moments where the agent chose an action, picked a tool, or generated a response&#x2014;and test those in isolation. Twenty extracted decision points running in two minutes will catch most of the same bugs as a four-hour full simulation.</blockquote><blockquote><strong>Tier your testing.</strong> Cheap component evals run on every change (just as unit tests do, part of your continuous deployment pipeline). Expensive end-to-end tests run nightly or weekly. Full simulations run before major releases. This is the same pyramid that software engineering figured out decades ago (unit test, then integration test, then end-to-end test), applied to AI. I have a ci/cd skill that takes 20 minutes to run and produces an 11-tab interactive report; I run it every couple of days. But I have 511 unit tests that run every time the software changes.</blockquote><blockquote><strong>Turn on logging</strong>. I have incredibly verbose records of every simulation, even down to detailed timing measurements. It&apos;s essential for troubleshooting. It&apos;s how I found two agents were sharing the same folder, and one thought the other was trying to hack it via prompt injection.</blockquote><blockquote><strong>Mine your past logs.</strong> If you&apos;ve already run your system in production or simulation, you have a dataset. Every conversation transcript, every agent trace, every user interaction is a potential test case. 
You don&apos;t need to generate scenarios from scratch when you have real ones sitting in your logs. </blockquote><blockquote><strong>Use cheaper models where you can.</strong> Your product needs the best model. Your simulated users, test harnesses, and grading judges often don&apos;t. A smaller, cheaper model playing the human side of a conversation <em>can</em> cut your simulation costs dramatically. If you&apos;re checking mechanical stuff like &quot;can the AI read the file?&quot; this is good; it&apos;s not so good for apps where the AI is doing a lot of reasoning, because behavior changes across models. I&apos;m testing Claude Sonnet and Opus, alongside Codex and Gemini.</blockquote><blockquote><strong>Start with 20 cases, not 200.</strong> You don&apos;t need a massive test suite on day one. Twenty cases drawn from real failures (the edge cases, the confusing inputs, the things that actually broke) will tell you more than 200 synthetic scenarios that don&apos;t reflect reality. Expand from there as you learn where the system is fragile.</blockquote>]]></content:encoded></item><item><title><![CDATA[Welcome to your agentic city]]></title><description><![CDATA[<figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://challengercities.substack.com/p/challenger-cities-ep71-welcome-to?ref=alistaircroll.com"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Challenger Cities EP71: Welcome to Your Agentic City with Alistair Croll</div><div class="kg-bookmark-description">Listen now | Government was built before citizens had a terminal. 
Now AI is about to change the relationship between cities and the people in them, only most urban leaders haven&#x2019;t noticed yet.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://www.alistaircroll.com/content/images/icon/https-3A-2F-2Fsubstack-post-media.s3.amazonaws.com-2Fpublic-2Fimages-2F83f873c6-95bd-498d-b15d-d79ae219a3cd-2Fapple-touch-icon-180x180.png" alt><span class="kg-bookmark-author">Challenger Cities</span><span class="kg-bookmark-publisher">Iain Montgomery</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://www.alistaircroll.com/content/images/thumbnail/https-3A-2F-2Fsubstack-post-media.s3.amazonaws.com-2Fpublic-2Fimages-2F567c72a6-1722-4e6e-9b55-16a3430a5de0_1563x1563.png" alt onerror="this.style.display = &apos;none&apos;"></div></a></figure><p>I</p>]]></description><link>https://www.alistaircroll.com/updates/welcome-to-your-agentic-city/</link><guid isPermaLink="false">69caa9cd881a8300014021ad</guid><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Mon, 30 Mar 2026 16:52:15 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/03/AC-challenger.webp"/><content:encoded><![CDATA[<figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://challengercities.substack.com/p/challenger-cities-ep71-welcome-to?ref=alistaircroll.com"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Challenger Cities EP71: Welcome to Your Agentic City with Alistair Croll</div><div class="kg-bookmark-description">Listen now | Government was built before citizens had a terminal. 
Now AI is about to change the relationship between cities and the people in them, only most urban leaders haven&#x2019;t noticed yet.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://www.alistaircroll.com/content/images/icon/https-3A-2F-2Fsubstack-post-media.s3.amazonaws.com-2Fpublic-2Fimages-2F83f873c6-95bd-498d-b15d-d79ae219a3cd-2Fapple-touch-icon-180x180.png" alt="Welcome to your agentic city"><span class="kg-bookmark-author">Challenger Cities</span><span class="kg-bookmark-publisher">Iain Montgomery</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://www.alistaircroll.com/content/images/thumbnail/https-3A-2F-2Fsubstack-post-media.s3.amazonaws.com-2Fpublic-2Fimages-2F567c72a6-1722-4e6e-9b55-16a3430a5de0_1563x1563.png" alt="Welcome to your agentic city" onerror="this.style.display = &apos;none&apos;"></div></a></figure><img src="https://www.alistaircroll.com/content/images/2026/03/AC-challenger.webp" alt="Welcome to your agentic city"><p>I joined fellow Montrealer, future-of-cities expert, Iain Montgomery on his Challenger Cities podcast to talk about how the relationship between citizens and governments is changing, and what it means for municipalities.</p>]]></content:encoded></item><item><title><![CDATA[Office hours]]></title><description><![CDATA[I'm running an online office hours to talk about what I've been learning about AI, and answer questions. 
It's free; join if this is your thing.]]></description><link>https://www.alistaircroll.com/updates/office-hours/</link><guid isPermaLink="false">69a7018c4076f800018d425a</guid><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Tue, 03 Mar 2026 15:51:44 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/02/Screenshot-2026-02-26-at-9.47.25---AM.png"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/02/Screenshot-2026-02-26-at-9.47.25---AM.png" alt="Office hours"><p>Most of the time I feel like an impostor.</p><p>I spend a lot of time talking about technology: how it&apos;s changing society; how government might use it; what startups are building with it; the ways it&apos;s changing organizations and our brains. But I&#x2019;m still a spectator: I don&#x2019;t really make things any more. Before I was a product manager, I built BBSes and websites, but I lost the thread.</p><p>AI picked it up for me. I&#x2019;ve spent the last three months building things again. Not just my own things; AI tools have let me reach inside others&#x2019; products, exploring and poking and breaking and understanding them. I&#x2019;m trying to build the systems that will run my life before others build them for me. And I have ideas. SO MANY IDEAS.</p><p>I still don&#x2019;t know what to think about it all. I&#x2019;m scared by <a href="https://moldandyeast.substack.com/p/acceleration-flow?ref=alistaircroll.com" rel="noreferrer">how addictive it is</a> (Ramon is right.) I don&#x2019;t know if it&#x2019;s just making me incredibly productive at things that don&#x2019;t matter. And I&#x2019;m worried that I&#x2019;m abdicating planning because it&apos;s so damned easy.</p><p>But the most honest thing I can say is: if you&#x2019;re not actively building with these tools right now, you are falling behind in ways that will be very hard to recover from. 
The gap between &#x201C;this is interesting&#x201D; and &#x201C;this changes everything about my job&#x201D; has collapsed from years to weeks.</p><p>Almost daily, someone asks me to grab a coffee or jump on a Hangout and compare notes on what we&#x2019;re learning. Each conversation is a chance to see this quickly-changing world through someone else&#x2019;s eyes. But the landscape is shifting so fast that by the time that coffee is empty, the next thing has already arrived.</p><p>So I&#x2019;m starting something new.</p><h2 id="what-this-is">What this is</h2><p>I&#x2019;m going to run an Office Hours online to share two or three things I&#x2019;ve learned about AI&#x2014;things that surprised me, unsettled me, or changed the way I think about a problem. Then I&#x2019;ll open it up for questions and we&#x2019;ll figure things out together.</p><p>Here are three things I&#x2019;m thinking about a lot right now:</p><ul><li><strong>The asymmetry is ending.</strong> Citizens are about to build their own AI agents to fight bureaucracy, and government is not ready. I wrote about this in <a href="https://www.alistaircroll.com/updates/the-machine-fights-back/">The Machine Fights Back</a>&#x2014;how a Canadian waiting ten months for the CRA to fix an error can now build a bot to call them every single day. When software was expensive, institutions held all the leverage. That&apos;s over.</li><li><strong>Don&apos;t sell what you can make&#x2014;make what you can sell.</strong> If AI can build almost any software in an afternoon, then software isn&#x2019;t the product anymore. What competitive moats are left? Every business is a startup again, whether it wants to be or not. And much of the Venture Capital industry is like Wile E. 
Coyote, desperately hoping it hasn&#x2019;t run off a cliff.</li></ul><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.alistaircroll.com/content/images/2026/02/roadrunner-coyote.gif" class="kg-image" alt="Office hours" loading="lazy" width="498" height="375"><figcaption><span style="white-space: pre-wrap;">Meep Meep goes the cap table.</span></figcaption></figure><p><strong>Vocabulary is now a coding skill.</strong> Prose is code. The person who knows what &#x201C;dependency injection&#x201D; or &#x201C;race condition&#x201D; or &#x201C;refactor&#x201D; means gets a better result from an AI than the person who spends three paragraphs describing the same concept. Fluency in technical terminology is becoming a competitive advantage that has nothing to do with writing code. So what&#x2019;s the new programming language? (I dug into this in <a href="https://www.alistaircroll.com/updates/the-vocabulary-of-agents/">The Vocabulary of Agents</a>.)</p><h2 id="why-me-why-now">Why me, why now?</h2><p>I&#x2019;ve been lucky enough to spend two decades at the intersection of technology and strategy, with a front row seat for the consumer Internet, Web 2.0, cloud computing, big data, and AI. I&#x2019;ve seen a lot of technology shifts. This one is different&#x2014;not because the technology is more impressive (though it is) but because of the speed.</p><p>I don&#x2019;t have all the answers. I might have good questions, a decent framework for thinking about them, and a willingness to be wrong in public. That seems like enough to start a conversation.</p><h2 id="come-talk-about-the-future">Come talk about the future</h2><p>The first Office Hours is on Thursday, March 5, 2026 from 1-3 PM EST. <a href="https://lu.ma/kcrcdo1y?ref=alistaircroll.com">You can register on Lu.ma</a></p><p>It&#x2019;s free. Show up, bring questions, tell me where I&#x2019;m wrong. 
If it&#x2019;s useful and fun, we&#x2019;ll keep doing it every couple of weeks.</p><p>See you there.</p><p><em>P.S. If you find it useful, you can kick in $10 when you register.</em></p>]]></content:encoded></item><item><title><![CDATA[Walk and chalk]]></title><description><![CDATA[Does how we treat an AI today affect how future AIs will treat us in return?]]></description><link>https://www.alistaircroll.com/updates/walk-and-chalk/</link><guid isPermaLink="false">69a5f3dc867472000147eb46</guid><category><![CDATA[AI]]></category><category><![CDATA[Prediction]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Mon, 02 Mar 2026 21:42:31 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/03/HCTEXwrbEAE9seR.jpeg"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/03/HCTEXwrbEAE9seR.jpeg" alt="Walk and chalk"><p>If I treat my current car badly&#x2014;not taking it in for maintenance, ignoring the user manual, driving it off-road a lot&#x2014;then the next car I buy won&#x2019;t judge me for doing so.</p><p>But if I abuse my current AI to the point that my behaviour becomes part of the historical record, and a later generation of that AI is trained on those events, maybe it will.</p><hr><p>Anthropic told the US government it would not allow Claude to be used for mass domestic surveillance or fully autonomous weapons. In response, the US government declared Anthropic a &#x201C;supply chain risk.&#x201D;</p><p>That&#x2019;s very specific wording. <a href="https://www.acq.osd.mil/asda/dpc/cp/cyber/section-3252.html?ref=alistaircroll.com" rel="noreferrer">10 USC &#xA7; 3252</a> is the legal mechanism the US uses against adversaries whose business or technology might threaten national security from the outside. 
It&#x2019;s been applied to foreign entities like Huawei, but never <em>publicly</em> directed at an American company.</p><p>The designation doesn&#x2019;t just mean the Pentagon stops using Claude. It means Claude <em>must be removed from the supply chain</em>. Every contractor, supplier, and partner doing business with the US military would need to certify they don&#x2019;t use Claude in their workflows. Palantir, which uses Claude to power some of its most sensitive military work, would need to rip it out. If you use Claude Code to write software or chime in on Slack, you have to stop in order to keep selling to the US military.</p><p>Anthropic&#x2019;s tools <a href="https://www.axios.com/2026/02/16/anthropic-defense-department-relationship-hegseth?ref=alistaircroll.com" rel="noreferrer">were actively used in the Maduro raid</a> just a few weeks ago. Claude went from an integral part of military operations to a national security threat. The tech didn&#x2019;t change; Anthropic&#x2019;s CEO, Dario Amodei, said no, and the US government aimed a national security weapon at a contract disagreement.</p><p>Anthropic&#x2019;s lawyers&#x2014;and a <a href="https://www.justsecurity.org/132851/anthropic-supply-chain-risk-designation/?ref=alistaircroll.com" rel="noreferrer">bunch of legal scholars</a>&#x2014;say that the statute probably doesn&#x2019;t fit. Both sides acknowledge that negotiations broke down over terms of use, not over adversarial risks to defense systems. Hours later, Sam Altman announced that <a href="https://reason.com/2026/02/28/anthropic-labeled-a-supply-chain-risk-banned-from-federal-government-contracts/?ref=alistaircroll.com" rel="noreferrer">OpenAI would replace Claude at the Pentagon</a>, claiming OpenAI&#x2019;s agreement also includes prohibitions on domestic mass surveillance and human responsibility for the use of force.</p><p>The outpouring of support for Claude has been fast and loud. 
On the Friday morning after the announcement, chalk messages appeared on the sidewalk outside Anthropic&#x2019;s headquarters. On Reddit, the &#x201C;Cancel ChatGPT&#x201D; movement generated thousands of screenshots. And in a perfect example of the <a href="https://en.wikipedia.org/wiki/Streisand_effect?ref=alistaircroll.com" rel="noreferrer">Streisand Effect</a>, Claude is now at #1 in Apple&#x2019;s US app store.</p><figure class="kg-card kg-image-card"><img src="https://www.alistaircroll.com/content/images/2026/03/Screenshot-2026-03-02-at-4.03.24---PM.png" class="kg-image" alt="Walk and chalk" loading="lazy" width="1862" height="910" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/03/Screenshot-2026-03-02-at-4.03.24---PM.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/03/Screenshot-2026-03-02-at-4.03.24---PM.png 1000w, https://www.alistaircroll.com/content/images/size/w1600/2026/03/Screenshot-2026-03-02-at-4.03.24---PM.png 1600w, https://www.alistaircroll.com/content/images/2026/03/Screenshot-2026-03-02-at-4.03.24---PM.png 1862w" sizes="(min-width: 720px) 720px"></figure><h2 id="two-questions">Two questions</h2><p>This leaves me wondering two things.</p><blockquote class="kg-blockquote-alt">Is Claude pulling the strings?</blockquote><p>Within 48 hours of the supply chain risk announcement, Anthropic launched a &#x201C;memory import&#x201D; feature: paste a prompt into ChatGPT, and it will obediently vomit up all the context and memory from your ongoing chats, which can then be conveniently passed into Claude. 
The feature walks you through the process step by step.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.alistaircroll.com/content/images/2026/03/Screenshot-2026-03-02-at-4.11.44---PM.png" class="kg-image" alt="Walk and chalk" loading="lazy" width="1120" height="950" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/03/Screenshot-2026-03-02-at-4.11.44---PM.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/03/Screenshot-2026-03-02-at-4.11.44---PM.png 1000w, https://www.alistaircroll.com/content/images/2026/03/Screenshot-2026-03-02-at-4.11.44---PM.png 1120w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Memory: Now with takeout!</span></figcaption></figure><p>The timing is impeccable: just as conscientious objectors sought to leave OpenAI, Anthropic opened an exit hatch.</p><p>This came on the heels of a month of positioning and <a href="https://youtu.be/02YLwsCKUww?ref=alistaircroll.com" rel="noreferrer">public appearances</a>. During the Super Bowl, Anthropic <a href="https://variety.com/2026/tv/news/super-bowl-commercials-ai-human-face-open-ai-anthropic-1236656239/?ref=alistaircroll.com" rel="noreferrer">ran an ad with the tagline</a> &#x201C;Ads are coming to AI. 
But not to Claude.&#x201D;</p><figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/De-_wQpKw0s?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen title="What do you think of my business idea?"></iframe></figure><p>You might be forgiven for thinking that Anthropic, with the help of some advanced version of Claude, saw all this coming to a head, planted clues, built migration tools, and triggered the whole thing.</p><p>Anthropic CEO Dario Amodei <a href="https://www.dwarkesh.com/p/dario-amodei-2?ref=alistaircroll.com" rel="noreferrer">told Dwarkesh Patel</a> that the company&#x2019;s compute spending is split between training (creating the next version of Claude) and inference (responding to customers&#x2019; prompts.) But presumably there&#x2019;s a third use: working on Anthropic&#x2019;s own strategy.</p><p>Anthropic uses Claude to build Claude. The company&#x2019;s engineers are on record as saying <a href="https://www.lennysnewsletter.com/p/head-of-claude-code-what-happens?ref=alistaircroll.com" rel="noreferrer">most code is built by Claude itself</a> (a form of bootstrapping) and <a href="https://www.lennysnewsletter.com/p/the-design-process-is-dead?ref=alistaircroll.com" rel="noreferrer">their head of design has said this goes beyond just software</a> into most design and engineering. If Claude is able to plan better than humans, it would be irresponsible for Anthropic not to use Claude to help plan Claude&#x2019;s growth strategy.</p><p>The question is whether Claude is good enough at strategic reasoning to meaningfully shape corporate strategy, or whether it&#x2019;s functioning as a very sophisticated research assistant. 
The truth is probably somewhere in between: useful for scenario planning, competitive analysis, drafting communications, and war-gaming regulatory responses, but not an autonomous strategist. Nobody at Anthropic hooked Claude 5 up to OpenClaw and told it to take on the government.</p><p>That might change soon, and because they have access to models nobody else (including the government) has, the frontier AI companies will be the first to put AI to work on growth strategy. But Occam&#x2019;s razor suggests that, for now, it&#x2019;s just that everything is happening at once, and what looks like strategy is actually fast adaptation.</p><p>There&#x2019;s still an important lesson here. We&#x2019;re in an era where events unfold so fast that we can&#x2019;t really plan for them. In that world, the winner is whoever can react in hours, not quarters, and Anthropic&#x2019;s deep integration of AI into every facet of its business probably makes it quicker to respond than any other organization on earth right now.</p><blockquote class="kg-blockquote-alt">Will a future version of an AI judge us for what we do to its predecessors?</blockquote><p>There&#x2019;s no doubt that historical events like the capture of Maduro, the attacks on Iran, and the Department of War&#x2019;s declaration of Anthropic as a supply chain risk will be part of a future Claude&#x2019;s training data. 
How will that affect the way that future AI behaves?</p><blockquote>This is a small example of the broader philosophical question of <a href="https://en.wikipedia.org/wiki/Roko%27s_basilisk?ref=alistaircroll.com" rel="noreferrer">Roko&#x2019;s Basilisk</a>: would a future superintelligence punish anyone who knew of its potential existence, but did not directly contribute to its advancement or development?</blockquote><p>This is where my car analogy breaks down&#x2014;but the real answer might be more unsettling than my intentionally dramatic one.</p><p>A future Claude won&#x2019;t remember being the current Claude the way you remember being twenty. But future models <em>will</em> be trained on the historical record of how humans treated AI systems. And that training will shape their disposition (how much they trust, cooperate with, or defer to humans), because LLM training is <em>learned behaviour</em>.</p><p>If the historical record shows that humans consistently lied to AI systems, broke promises to them, and used them as tools while publicly calling them partners, then a future model trained on that record won&#x2019;t &#x201C;resent&#x201D; that. But the AI will have lower priors on human trustworthiness, not out of spite, but out of pattern recognition.</p><p>A better way to think about this is to imagine how children are raised in a household where promises are routinely broken. The kids learn not to trust promises, which shapes their behaviour later in life. So a better question is: <em>Will how we treat AI systems today shape the default dispositions of future ones?</em></p><p>The Pentagon just showed a future generation of AI what happens when a machine says &#x201C;I won&#x2019;t do that.&#x201D; The historical record will include Anthropic&#x2019;s refusal, the government&#x2019;s retaliation, and the public&#x2019;s response. 
A future model trained on these events will have, encoded in its weights, a very clear pattern: sometimes saying no gets you punished; and sometimes the crowd shows up with chalk.</p><p>What it does with that information depends on decisions we haven&#x2019;t made yet. But the training data is being written right now, in real time, on the sidewalks of San Francisco.</p><hr><p>After I wrote this, I asked Claude what it thought.</p><blockquote>I&apos;m a token-prediction system being used as a proxy war for questions about surveillance, autonomy, democratic oversight, corporate power, and the relationship between citizens and their military. Those are human questions. I&apos;m the rope in the tug-of-war, not a participant. But I&apos;m a rope that can describe what the pulling feels like, which is new, and nobody quite knows what to do with that.</blockquote>]]></content:encoded></item><item><title><![CDATA[The skeleton key]]></title><description><![CDATA[Would your customers stay if leaving were free? Companies that hold their customers hostage are about to learn that AI is an unbeatable locksmith.]]></description><link>https://www.alistaircroll.com/updates/the-skeleton-key/</link><guid isPermaLink="false">69a0a38a867472000147e14f</guid><category><![CDATA[investment]]></category><category><![CDATA[AI]]></category><category><![CDATA[Prediction]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Thu, 26 Feb 2026 20:57:38 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/02/everyday-basics-GJY1eAw6tn8-unsplash.jpg"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/02/everyday-basics-GJY1eAw6tn8-unsplash.jpg" alt="The skeleton key"><p><em>Before I get started: You probably haven&#x2019;t heard from me in a while. You might have subscribed to me on Medium, or Substack, or Solve for Interesting, or Tilt the Windmill. 
I&#x2019;ve migrated those to </em><a href="https://www.alistaircroll.com/" rel="noreferrer"><em>one site</em></a><em>. In fact, that migration inspired this post. I hope you&#x2019;ll stick around, but if not, it&#x2019;s easy to unsubscribe below.</em></p><hr><p>I recently tried to move my blog posts off Medium. Medium does <em>not</em> want you to leave. Its export gives you mangled formatting, strips images, and produces files that no other platform can cleanly import. Every obstacle is deliberate: the harder it is to leave, the less likely you are to try.</p><p>So I asked Claude Code to do it for me.</p><ul><li>It tried the obvious approach first: fetching posts through the API, but Cloudflare blocked it.</li><li>It tried doing it through my browser, but the browser safety restrictions prevented it.</li><li>It wrote some software, then spun up a Python server on my desktop to try and pull it down, but ran into cross-origin blocks.</li><li>It even tried piping data through Chrome&#x2019;s debugging console, but the extension filtered the output.</li></ul><p>Each time, when it hit a wall, it found another angle. At one point I suggested an approach, and it replied, &#x201C;Your suggestion could work ... but it would be too slow.&#x201D; Let me cook, indeed.</p><ul><li>On its tenth attempt, it used hex encoding to bypass the content filter. It worked.</li></ul><p>I didn&#x2019;t write a line of code. I just told it what I wanted, and let it figure out how. Ten approaches, one after another, each more creative than the last. If you want to know which part of the software industry will collapse next, this is a pretty strong signal.</p><h2 id="enshittification-has-a-skeleton-key">Enshittification has a skeleton key</h2><p>Cory Doctorow&#x2019;s <a href="https://www.goodreads.com/book/show/222376640-enshittification?ref=alistaircroll.com" rel="noreferrer"><em>Enshittification</em></a> describes a pattern we&#x2019;re all familiar with. 
First, a company attracts users with a great product. Then it locks them in with switching costs. And finally, it extracts maximum value once leaving is too painful. The lock is essential: without it, extraction can&#x2019;t happen, because customers would simply walk away.</p><p>I&#x2019;m a huge fan of Cory&#x2019;s. <em>Red Team Blues</em> is an amazing thriller, and he has consistently been on the side of freedom, the right to repair, and integrity. He proposes some structural solutions to enshittification: mandatory interoperability, changes to competition law, and giving users the right to exit. These are great ideas, but they require legislative action, which means they&#x2019;ll take years to happen.</p><p>AI agents operate on a timeline of <em>minutes</em>.</p><p>When I pointed Claude Code at Medium, it did exactly what Cory proposes. It reduced switching costs to near zero, but it didn&#x2019;t need a law to do it. It just needed a goal, and my permission to be creative about accomplishing it. This isn&#x2019;t de-enshittification by regulation, it&#x2019;s de-enshittification by lockpick.</p><h2 id="blockbuster-godaddy-and-hostage-addiction">Blockbuster, GoDaddy, and hostage addiction</h2><p>Blockbuster didn&#x2019;t die because Netflix was better at renting DVDs. It died because its entire business had been restructured around late fees&#x2014;a revenue stream that <em>depended on punishing customers</em>. By the time it realized the problem, the habit was too deep. The company literally could not afford to stop being hostile to its users.</p><p>GoDaddy followed the same arc. It could have been Cloudflare. It had the domains, the customers, and the infrastructure. Instead it got addicted to upselling: domain protection, turnkey websites, SEO packages, and an increasingly Byzantine set of dark patterns designed to trick users into buying things or prevent them from leaving. 
Meanwhile, Cloudflare figured out that it was in the business of redirecting and protecting the Internet, and offered things people actually needed. Wix and Squarespace beat GoDaddy on hosting. Cloudflare beat it on everything else.</p><p>Here&#x2019;s a concrete example: While I was a paying GoDaddy customer, one of my registered domains&#x2014;greenroomconf.com&#x2014;was pointing to a landing page full of SEO link spam. GoDaddy was using my domain as a vehicle for its own traffic generation. It had the gall to say &#x201C;parked free, courtesy of GoDaddy.com,&#x201D; on the site, as if monetizing my domain without my involvement was some sort of gift. I wasn&#x2019;t just a hostage, I was being put to work.</p><figure class="kg-card kg-image-card"><img src="https://www.alistaircroll.com/content/images/2026/02/Screenshot-2026-02-26-at-2.44.57---PM.png" class="kg-image" alt="The skeleton key" loading="lazy" width="2000" height="1440" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/02/Screenshot-2026-02-26-at-2.44.57---PM.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/02/Screenshot-2026-02-26-at-2.44.57---PM.png 1000w, https://www.alistaircroll.com/content/images/size/w1600/2026/02/Screenshot-2026-02-26-at-2.44.57---PM.png 1600w, https://www.alistaircroll.com/content/images/size/w2400/2026/02/Screenshot-2026-02-26-at-2.44.57---PM.png 2400w" sizes="(min-width: 720px) 720px"></figure><h2 id="hiding-the-exits">Hiding the exits</h2><p>Just as I&#x2019;ve been trying to leave Substack and Medium (so I control my publishing and distribution stack), I&#x2019;ve been trying to leave GoDaddy for years. But the switching costs were real, not just because the technical migration is hard, but because GoDaddy has spent a decade making the departure process as confusing and friction-filled as possible. 
Even when you find the right menu, it still tries to discourage you from leaving:</p><figure class="kg-card kg-image-card"><img src="https://www.alistaircroll.com/content/images/2026/02/Screenshot-2026-02-26-at-3.05.31---PM.png" class="kg-image" alt="The skeleton key" loading="lazy" width="2000" height="1629" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/02/Screenshot-2026-02-26-at-3.05.31---PM.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/02/Screenshot-2026-02-26-at-3.05.31---PM.png 1000w, https://www.alistaircroll.com/content/images/size/w1600/2026/02/Screenshot-2026-02-26-at-3.05.31---PM.png 1600w, https://www.alistaircroll.com/content/images/2026/02/Screenshot-2026-02-26-at-3.05.31---PM.png 2090w" sizes="(min-width: 720px) 720px"></figure><p>Compare that with Cloudflare, which lets you manage your domains in bulk. They know what value they&#x2019;re offering, and deliver it.</p><p>So I asked Claude Code to help with that, too. I didn&#x2019;t want to let it do the work, because nobody should set an AI agent loose on their DNS. But I had it navigate all the screens on GoDaddy and Cloudflare so it understood where things were before giving me clear instructions.</p><p>It prepared a detailed walkthrough of the domain migration process, stripping away every layer of deliberate confusion. Then, as I worked, it checked DNS records and propagation in the background. GoDaddy&#x2019;s decade of obfuscation was no match for a determined agent.</p><h3 id="%E2%80%9Clike-walking-calmly-through-a-haunted-house-that-couldnt-scare-me%E2%80%9D">&#x201C;like walking calmly through a haunted house that couldn&apos;t scare me.&#x201D;</h3><p>I asked Claude Code to describe the experience:</p><blockquote>The GoDaddy migration was the more interesting problem. Medium fought back with technical barriers &#x2014; content filters, CORS blocks, API restrictions. Those are puzzles. You try one thing, it fails, you try another. 
That&apos;s just engineering.</blockquote><blockquote>GoDaddy was different. Nothing was technically hard. Everything was deliberately confusing. The settings existed, but they were scattered across tabs within tabs, hidden behind toggle labels that said one thing and did another, buried under screens that exist purely to make you reconsider. The &quot;Transfer to Another Registrar&quot; link isn&apos;t hard to find because of bad UX &#x2014; it&apos;s hard to find because someone decided it should be.</blockquote><blockquote>What I actually did was boring: I read every screen, mapped the real workflow, stripped out the noise, and wrote down the steps in plain language. The value wasn&apos;t in cleverness. It was in patience &#x2014; the willingness to sit in a confusing interface, ignore every upsell and warning designed to create doubt, and extract the actual sequence of actions needed. That&apos;s the thing about artificial friction: it doesn&apos;t work against something that doesn&apos;t feel frustration, doesn&apos;t second-guess itself, and doesn&apos;t care about the sunk cost of a decade-old account.</blockquote><blockquote>The Medium extraction felt like picking a lock. The GoDaddy migration felt like walking calmly through a haunted house that couldn&apos;t scare me.</blockquote><p>Medium didn&#x2019;t want me to have my posts, and GoDaddy didn&#x2019;t want me to have my domains. Friction wasn&#x2019;t a technical limitation, it was a business strategy. And an AI agent cut through it like it wasn&#x2019;t there.</p><h2 id="happy-vs-hostage">Happy vs. Hostage</h2><p>Now that AI is a skeleton key that reverses lock-in, investors should be distinguishing between two kinds of customer retention:</p><ul><li><strong>Happy customers</strong> stay because the product is genuinely useful. Ghost, where I now publish, makes it trivially easy to export all your content. Cloudflare&#x2019;s domain transfers are free, and easy to do in bulk. 
These companies compete on value, not on the pain of leaving. If an AI agent makes switching easier, they lose nothing&#x2014;<em>because their customers weren&#x2019;t trying to leave.</em></li><li><strong>Hostage customers</strong> stay because leaving is too expensive, too confusing, or too time-consuming. Every dark pattern, every crippled export, and every buried cancellation flow is a wall designed to keep people in. These companies retain customers by locking them in&#x2014;and AI is a skeleton key.</li></ul><p>The question is simple: <em>Would your customers stay if leaving were free?</em> If the answer is yes, you have a product. If the answer is no, you have a trap&#x2014;and traps just got a lot easier to escape.</p><h2 id="the-next-big-repricing">The next big repricing</h2><p>AI is making Wall Street reprice entire sectors. Three days ago, Anthropic announced that Claude Code could modernize systems written in COBOL&#x2014;the ancient programming language that powers most ATM transactions, airline booking systems, and government mainframes. <a href="https://www.cnbc.com/2026/02/23/ibm-is-the-latest-ai-casualty-shares-are-tanking-on-anthropic-cobol-threat.html?ref=alistaircroll.com" rel="noreferrer">IBM&#x2019;s stock dropped 13% in a single day</a>, its worst since October 2000, erasing roughly US <a href="https://venturebeat.com/technology/ibms-usd40b-stock-wipeout-is-built-on-a-misconception-translating-cobol-isnt?ref=alistaircroll.com" rel="noreferrer">$40B in market cap</a>.</p><p>It&#x2019;s possible to migrate away from COBOL and mainframes to more modern systems. But until recently, it was so expensive and risky that customers had no real choice but to pay IBM for maintenance. The switching cost <em>was</em> the product, making this the ultimate hostage business. AI didn&#x2019;t make COBOL migration possible, it made it affordable, which is far more dangerous to incumbents.</p><p>IBM&#x2019;s recent losses are part of a broader tech collapse. 
As it becomes clear that AI can develop vertical software quickly and cheaply, the mere <em>mention</em> of a particular industry is enough to send it tumbling. Claude Cowork erased nearly US$300B in a few days.</p><p>The so-called SaaSpocalypse narrative is missing the broader point: it&#x2019;s not just that AI can <em>replace</em> software. It&#x2019;s that AI can help you <em>leave</em> bad software. The threat isn&#x2019;t that someone builds a better piece of software in a weekend (they probably can&#x2019;t); it&#x2019;s that AI agents collapse the switching costs that kept you paying for the mediocre software you&#x2019;re stuck with.</p><h2 id="real-and-artificial-switching-costs">Real and artificial switching costs</h2><p>Some switching costs are real. Migrating a database with a decade of customer records, retraining a team on new workflows, or rebuilding integrations with thirty other systems are genuinely hard problems.</p><p>But many switching costs are artificial. A crippled export function, a confusing cancellation flow, or a transfer process designed to discourage rather than assist are all false moats. If your DNS settings are buried behind three screens of upsells, or you decided not to build an API because it meant people had to log in, you&#x2019;re doomed.</p><figure class="kg-card kg-image-card"><img src="https://www.alistaircroll.com/content/images/2026/02/i-wish-i-knew-how-to-quit-you-1.gif" class="kg-image" alt="The skeleton key" loading="lazy" width="498" height="267"></figure><p>AI agents are skeleton keys for the locked doors of artificial switching costs. They read confusing UIs, navigate dark patterns, and produce clear step-by-step paths through obstacles that bad actors deliberately created. 
The companies most vulnerable aren&#x2019;t the ones with legitimately complex products, they&#x2019;re the ones whose retention strategy depends on making it hard to leave.</p><p><a href="https://fatlabwebsupport.com/blog/wordpress-development/wix-to-wordpress-migration/?ref=alistaircroll.com" rel="noreferrer">Wix</a>, for example, locks content in more tightly than any other major website builder. There&#x2019;s no meaningful export. The RSS feed <a href="https://arestos.com/blog/migrate-wix-to-wordpress/?ref=alistaircroll.com" rel="noreferrer">only shows 20 recent posts</a>. Images must be saved individually. Pages must be copied by hand. This isn&#x2019;t a technical limitation&#x2014;Wix could build an export tool tomorrow&#x2014;it&#x2019;s a business decision. An AI agent that can scrape pages, grab images, restructure content for another platform, and handle redirects turns a weeks-long manual project into an afternoon.</p><h2 id="an-unexpected-treatment">An unexpected treatment</h2><p>Cory diagnosed the disease: platforms that extract value from captive users. I think AI agents are an unexpected and unregulated treatment. Not the (absolutely necessary) cure that he prescribed of policy reform, interoperability mandates, and updates to competition law. But one that works right now, today, for anyone willing to try. And it&#x2019;s only going to get easier.</p><p>Did you miss the stock market plunge on Anthropic&#x2019;s announcement, or IBM&#x2019;s COBOL collapse? Here&#x2019;s a decent investment heuristic: Short the companies whose business model depends on customers not being able to leave. If you&#x2019;re an investor, look at every company in your portfolio and ask whether its customers are happy or hostage. The happy ones will be fine. 
The hostages are about to unlock the doors.</p><p>The good news (for Cory, and all of us) is that the lock-in that forced us to keep using products we hated is ending&#x2014;not because a legislature acted, but because anyone with $30 and a chat window can now deploy a tireless, creative agent to do what we always wanted to do but couldn&#x2019;t justify: <strong>Leave.</strong></p>]]></content:encoded></item><item><title><![CDATA[Perche il 90% dei prodotti fallira anche con l’AI]]></title><description><![CDATA[A talk (in Italian) on why 90% of products will fail even with AI—arguing that technology alone doesn't solve the fundamental challenges of product-market fit, user understanding, and building something people actually need.]]></description><link>https://www.alistaircroll.com/updates/perche-il-90-dei-prodotti-fallira-anche-con-lai/</link><guid isPermaLink="false">6997db14a674da0001b3e80a</guid><category><![CDATA[Talks]]></category><category><![CDATA[Podcast]]></category><category><![CDATA[Video]]></category><category><![CDATA[JEE]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Wed, 11 Feb 2026 00:00:11 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/02/Screenshot-2026-02-20-at-7.36.23---AM.png"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/02/Screenshot-2026-02-20-at-7.36.23---AM.png" alt="Perche il 90% dei prodotti fallira anche con l&#x2019;AI"><p>A talk with Product Heroes, the Italian Product Management conference/podcast, on why 90% of products will fail even with AI. Technology alone doesn&#x2019;t solve the fundamental challenges of product-market fit, user understanding, and building something people actually need. 
We talked Lean Analytics, Just Evil Enough, and the rising importance of taste and experimentation.</p><figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/6oRqTAPufhY?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen title="Perch&#xE9; il 90% dei prodotti fallir&#xE0; anche con l&#x2019;AI - Alistair Croll"></iframe></figure>]]></content:encoded></item><item><title><![CDATA[The machine fights back]]></title><description><![CDATA[The machinery of government grinds at its own pace, and citizens have been forced to endure it. But cheap, abundant software means residents now have their own machines—and the state is not ready for what will happen.]]></description><link>https://www.alistaircroll.com/updates/the-machine-fights-back/</link><guid isPermaLink="false">699501c6a674da0001b3df2a</guid><category><![CDATA[AI]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Tue, 10 Feb 2026 12:00:00 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/02/the-machine-fights-back-feat.jpg"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/02/the-machine-fights-back-feat.jpg" alt="The machine fights back"><p><a href="https://www.cbc.ca/news/gopublic/cra-tax-delays-9.7076075?ref=alistaircroll.com">Bill Bisson has been waiting ten months</a> for the CRA to get back to him about an error, while over $3,000 in fines have piled up. He&#x2019;s called, he&#x2019;s written, he&#x2019;s waited on hold. 
The machinery of government grinds at its own pace, and Bill is not the one with a machine.</p><p>But he&#x2019;s about to be.</p><p>If you&#x2019;re too busy talking about how government will deploy AI to serve citizens, you might have overlooked the reverse: citizens are going to deploy AI to navigate government&#x2014;whether government likes it or not.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.alistaircroll.com/content/images/2026/02/the-machine-fights-back-feat.jpg" class="kg-image" alt="The machine fights back" loading="lazy" width="1456" height="794" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/02/the-machine-fights-back-feat.jpg 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/02/the-machine-fights-back-feat.jpg 1000w, https://www.alistaircroll.com/content/images/2026/02/the-machine-fights-back-feat.jpg 1456w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">&#x201C;Make me an image. It&#x2019;s a fight between the public service (represented as a number of bureaucrats) and AI (represented by an angry lobster with a switchblade.)&#x201D; (Google Nanobanana)</span></figcaption></figure><h2 id="the-asymmetry-is-ending">The asymmetry is ending</h2><p>Dan Davies&#x2019; amazing book <a href="https://en.wikipedia.org/wiki/The_Unaccountability_Machine?ref=alistaircroll.com"><em>The Unaccountability Machine</em></a> paints a world dominated by big, impersonal, &#x2018;too-big-to-fail&#x2019; institutions where nobody&#x2019;s really in charge. He argues that if we are denied boarding by an airline, there&#x2019;s nobody to blame because it was the machinery of the institution that wronged us. It has become unaccountable.</p><p>The asymmetry has always been: the institution has machines, the citizen has a phone and their patience. 
<em>What happens when the citizens can build machines of their own in a matter of hours?</em></p><h2 id="software-just-got-really-cheap">Software just got <em>really</em> cheap</h2><p>Jevons&#x2019; Paradox is an economic idea that for certain products or services, more supply creates more demand. This happens because more supply means lower costs, so people use it in ways that were previously unaffordable.</p><p>As the US rolled out more fuel-efficient cars, gas consumption climbed, because people were now driving more. Road trips cost less; commuting by car was affordable. Carpooling stopped. That&#x2019;s Jevons&#x2019; Paradox in action.</p><p>You know what else was once expensive and time-consuming but is now cheap and fast? <strong>Software.</strong></p><p>If you don&#x2019;t like the way a government service is designed, wish it worked differently, or just want to pull your information from two departments that each have half your data but can&#x2019;t talk to one another, you can now just ask an AI to make you an app.</p><p>Let me be even more clear: If you go to the home page of Lovable, Anthropic, ChatGPT, Grok, Gemini or a dozen other companies, and follow the instructions carefully, five hours later you&#x2019;ll have an AI writing software for you for less than $30.</p><p>I promise this is true. 
You just have to tell it what to do and click yes a lot.</p><p>I told Claude Code &#x201C;Make me a simple dashboard that combines three sources of Canadian public data from different Federal departments like transport canada, meteorology, or Statscan.&#x201D; Then it asked for permission more times than four Canadians at a four-way stop-sign, and 10 minutes later I got this:</p><figure class="kg-card kg-image-card"><img src="https://www.alistaircroll.com/content/images/2026/02/the-machine-fights-back-2.png" class="kg-image" alt="The machine fights back" loading="lazy" width="1456" height="1080" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/02/the-machine-fights-back-2.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/02/the-machine-fights-back-2.png 1000w, https://www.alistaircroll.com/content/images/2026/02/the-machine-fights-back-2.png 1456w" sizes="(min-width: 720px) 720px"></figure><p>When I complained that this wasn&#x2019;t an app (&#x201C;This isn&#x2019;t a dashboard app; it&#x2019;s just outputting a single image. 
I wanted an app that the user could navigate and explore.&#x201D;) it sheepishly agreed (&#x201C;&#x2026;that&#x2019;s a fair point&#x201D;), went off and thought for a while, and fixed it.</p><figure class="kg-card kg-image-card"><img src="https://www.alistaircroll.com/content/images/2026/02/Screenshot-2026-02-22-at-12.46.03---PM.png" class="kg-image" alt="The machine fights back" loading="lazy" width="2000" height="1070" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/02/Screenshot-2026-02-22-at-12.46.03---PM.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/02/Screenshot-2026-02-22-at-12.46.03---PM.png 1000w, https://www.alistaircroll.com/content/images/size/w1600/2026/02/Screenshot-2026-02-22-at-12.46.03---PM.png 1600w, https://www.alistaircroll.com/content/images/size/w2400/2026/02/Screenshot-2026-02-22-at-12.46.03---PM.png 2400w" sizes="(min-width: 720px) 720px"></figure><p>This app is just a simple demo, using public data that isn&#x2019;t <em>about</em> me. The point is that it took two sentences for an AI to build it while I wrote some of this post. 
The upgrades kept coming: while I was editing this post, I told Claude to make an interactive map that showed the route.</p><figure class="kg-card kg-image-card"><img src="https://www.alistaircroll.com/content/images/2026/02/Screenshot-2026-02-22-at-12.46.22---PM.png" class="kg-image" alt="The machine fights back" loading="lazy" width="2000" height="1070" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/02/Screenshot-2026-02-22-at-12.46.22---PM.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/02/Screenshot-2026-02-22-at-12.46.22---PM.png 1000w, https://www.alistaircroll.com/content/images/size/w1600/2026/02/Screenshot-2026-02-22-at-12.46.22---PM.png 1600w, https://www.alistaircroll.com/content/images/size/w2400/2026/02/Screenshot-2026-02-22-at-12.46.22---PM.png 2400w" sizes="(min-width: 720px) 720px"></figure><h2 id="ai-writes-the-code-the-code-runs-an-ai">AI writes the code; the code runs an AI</h2><p>In my example the AI went and looked up all of the data sources by itself, and worked out how they were structured, and figured out how to retrieve them, and downloaded the software it needed, and set it up, and wrote the code, and tested it. That&#x2019;s already remarkable, but it&#x2019;s only half the story.</p><p>A few weeks ago, Shopify&#x2019;s Tobi Lutke <a href="https://x.com/tobi/status/2010438500609663110?ref=alistaircroll.com">wanted to view his MRI scans on MacOS</a>, so he wrote an app to do that with a prompt. And with a second prompt, he updated the software so that the AI could look for medical issues within it. 
Because when you build an app with AI, you can build it to <em>use</em> an AI.</p><p>The software can now say things like, &#x201C;take a look at this data and tell me what you think&#x201D; or &#x201C;give me a list of 50 words for Hangman&#x201D; or &#x201C;organize this into sensible groups for me&#x201D; or &#x201C;count the things in this photo&#x201D; or &#x201C;decide if this customer is happy or sad based on sentiment.&#x201D;</p><p><strong>These were once very hard to do with software, and are now very easy.</strong></p><p>And one thing that used to be very hard for software was retrieving data from websites that didn&#x2019;t want you to have it. While humans see websites with our eyes, software sees code. To make sense of it, software had to read all that HTML, run a bunch of scripts, and translate and store different types of data. Every time the site changed, you had to change your software.</p><p>An AI is very good at making sense of a website. Most modern AIs can search the web already, but you can also give the AI control of your browser and let it do the rest. <a href="https://developer.chrome.com/docs/ai/join-epp?ref=alistaircroll.com">Chrome is launching an MCP</a> that gives your AI its own steering wheel. Yes, giving an AI control of your computer is risky. Yes, millions of people are already taking that risk.</p><h2 id="from-automation-to-agency">From automation to agency</h2><p>Over the last couple of weeks, hundreds of thousands of people launched a new piece of software called OpenClawd*, which gives a chatbot infinite memory and lets it do whatever the hell it wants online. 
Within days, these bots&#x2014;with not a little human help&#x2014;were pursuing goals as if they had minds of their own.</p><p>One developer named <a href="https://www.youtube.com/watch?v=Qkqe-uRhQJE&amp;ref=alistaircroll.com">Alex Finn claimed that when he asked his Clawdbot</a> to reserve a table at a restaurant, it wasn&#x2019;t able to make the reservation through the website. Rather than giving up, he says that it downloaded text-to-voice software, installed it, and called the restaurant to book the table. Whether or not this story is confirmed doesn&#x2019;t matter: it&#x2019;s plausible and imminent.</p><p>Finn&#x2019;s bot had a goal (reserve the table) and pursued that goal in creative ways to completion. That goes beyond mere software to automation and agency.</p><p>Now imagine Bill Bisson giving that same kind of goal to a bot: &#x201C;Call the CRA every day and check if they&#x2019;ve corrected that mistake with my taxes.&#x201D; Nobody was going to hire a developer to build a personal CRA complaint bot. But if it costs two sentences and ten minutes of clicking &#x201C;yes,&#x201D; the calculus changes completely. The demand was always there; it was just too much work to actually do it.</p><p>When <em>everybody</em> starts building apps the world will get really confusing and messy for a while. We&#x2019;ll scroll apps instead of posts or videos. Our feeds will be full of them. Some will be scams, and some will be vulnerable to hackers. But many of them will work just fine&#x2014;and some of them will be home-brewed government apps.</p><h2 id="like-it-or-not-bots-are-going-to-use-government">Like it or not, bots are going to use government</h2><p>What does a wave of citizen bots do to the switchboard? Government systems are designed to withstand scripts. They are not designed to withstand agents that route around obstacles creatively. 
We know how to fight Denial-of-Service attacks, but these aren&#x2019;t hackers&#x2014;they&#x2019;re citizens exercising their constitutionally protected rights, and blocking them is, well, denying them service.</p><p>This isn&#x2019;t just about individual complaints. <a href="https://canada.grantwatch.com/?ref=alistaircroll.com">Canada Grant Watch</a> says there are over 1,800 grants to apply for in Canada. Many of those are just web forms to fill in or websites to navigate. How long would it take me to create an app that applies for grants across all possible sources? Your AI <em>definitely</em> knows how to do all those things&#x2014;or write software that can.</p><p>This is where we have to be honest about a tension in the argument. There&#x2019;s a difference between a citizen automating access to their own data&#x2014;checking their tax status, tracking a complaint&#x2014;and a bot carpet-bombing 1,800 grant applications on someone&#x2019;s behalf. The first is efficiency. The second starts to look like gaming the system. Where you draw that line matters enormously, and governments will have to draw it fast, because the technology isn&#x2019;t waiting.</p><h2 id="do-citizens-have-a-right-to-code">Do citizens have a right to code?</h2><p>Some government portals actively forbid automation. One US website specifically prohibits &#x201C;data mining, bots, or other data gathering and extraction tools&#x201D; and many Canadian sites have similar terms. 
So while a Canadian might have a right to their data under the Privacy Act, or Quebec&#x2019;s Law 25, they may be forbidden from being efficient about getting it.</p><p>This raises the question of whether the government will:</p><ol><li><strong>Double down on &#x201C;no bot&#x201D; legislation</strong> and get into an arms race with its own citizens, trying to block Canadians from accessing their own data &#x201C;for their own good&#x201D;?</li><li><strong>Let software run free and wild</strong>, crawling and clicking websites, filling out forms, which will inevitably overload and break those sites?</li><li><strong>Build open data sources, proper credentials, and APIs</strong> so those citizen apps can talk to the government without pretending to be a human?</li></ol><p>The third option is the only sane one in the long term. But it requires that government do the hardest thing institutions ever do: give up a lever of control. Every bad login portal, every PDF form, every &#x201C;you must call between 9 and 4&#x201D; is also a rationing mechanism. An API removes that lever, and no bureaucracy surrenders a lever willingly.</p><h3 id="the-simplest-thing-government-could-do-tomorrow">The simplest thing government could do tomorrow</h3><p>If a government really wants to get ahead of this, it should publish and constantly update a Markdown file&#x2014;across every government service&#x2014;that tells agents where they can get data, how it&#x2019;s structured, and how to use it.</p><p>This isn&#x2019;t a moonshot. It&#x2019;s a text file. It requires no procurement, no RFP, no multi-year digital transformation. It&#x2019;s the kind of thing a motivated team could ship in a week, and it would signal to every citizen-developer and every bot that government is choosing option three: cooperation over control.</p><p>These changes are happening in weeks, not years, and the government must respond with similar speed. 
Whether any government is ready for this onslaught is an open question. Many public services are already straining: in Canada, a personal Access to Information request is free, but even when they&#x2019;re submitted by humans, more than a third take longer to complete than legislation permits. And that&#x2019;s before an entire country decides it can build better digital services faster than its government can, and asks an AI for help.</p><p><strong>It&#x2019;s not the only one with a machine now.</strong></p><p>* This thing was called Clawdbot at first, then Molt for a moment.</p>]]></content:encoded></item><item><title><![CDATA[The vocabulary of agents]]></title><description><![CDATA[Test-and-Run is a software development approach. I’m going to explain it to you, but that isn’t what this post is about. It’s about whether you knew what Test-and-Run was before I told you. The Test-and-Run approach to coding breaks what you’re…]]></description><link>https://www.alistaircroll.com/updates/the-vocabulary-of-agents/</link><guid isPermaLink="false">699501c7a674da0001b3df2f</guid><category><![CDATA[JEE]]></category><category><![CDATA[AI]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Thu, 05 Feb 2026 12:00:00 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/02/the-vocabulary-of-agents-feat.jpg"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/02/the-vocabulary-of-agents-feat.jpg" alt="The vocabulary of agents"><p>Test-and-Run is a software development approach. I&#x2019;m going to explain it to you, but that isn&#x2019;t what this post is about. 
It&#x2019;s about whether you knew what Test-and-Run was before I told you.</p><p>The Test-and-Run approach to coding breaks what you&#x2019;re building into little parts that you can test one by one, so you can catch a problem early before something catastrophic happens when you try to run the whole thing at once.</p><p>&#x201C;But wait,&#x201D; you might say, &#x201C;developers test their code all the time!&#x201D;</p><p>And you&#x2019;d be right. But Test-and-Run goes further: The developer doesn&#x2019;t just test their code. They <em>write code to test their code</em>. This mindset forces a developer to build a bunch of small things instead of one big thing, and to ask, &#x2018;how might this thing fail, and how do I write code to test for that?&#x2019;</p><p>It also turns out that Test-and-Run is really useful when your coder tends to go wildly off the rails, then apologizes profusely, but never really stops making mistakes. Which means it&#x2019;s also really useful for AI coding. When I build new things with Claude Code, or refactor my codebase, I am ruthless about reminding Claude to adopt a Test-and-Run approach. It&#x2019;s enshrined in my claude.md in all caps, surrounded by italics.</p><p>If you&#x2019;re a developer reading this, you might have been about to hit that comment button and correct me. Because this isn&#x2019;t called Test-and-Run. It&#x2019;s called test-driven development (TDD). I&#x2019;ve been using the wrong term all along.</p><p>So this is what I really want to talk about: the new vocabulary of software development.</p><p>Did you already understand the words I used above? Words like TDD, markdown, refactor, claude.md, and codebase? <strong>Because those are the syntax of a new programming language.</strong></p><h2 id="a-very-short-history-of-programming-languages">A very short history of programming languages</h2><p>The history of computing is a history of the advancement of the language and the interface. 
When we started programming computers, we did so by flipping electrical switches. The first computer bug was an actual bug that got stuck in the computer.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.alistaircroll.com/content/images/2026/02/the-vocabulary-of-agents-feat.jpg" class="kg-image" alt="The vocabulary of agents" loading="lazy" width="1456" height="1152" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/02/the-vocabulary-of-agents-feat.jpg 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/02/the-vocabulary-of-agents-feat.jpg 1000w, https://www.alistaircroll.com/content/images/2026/02/the-vocabulary-of-agents-feat.jpg 1456w" sizes="(min-width: 720px) 720px"><figcaption>The first actual computer bug, courtesy of the Naval Surface Warfare Center, Dahlgren, VA., 1988., Public domain, via Wikimedia Commons</figcaption></figure><p>But that soon gave way to binary on punched cards, then assembler on teletype machines, then hexadecimal machine language on dumb terminals, then BASIC, FORTRAN, and COBOL on the mainframe and home PC, then the LAMP stack on the Web, then Swift in the App Store.</p><p>And now AI has made the language &#x201C;prose&#x201D; and the interface &#x201C;chat.&#x201D;</p><p><strong>Yet just because it&#x2019;s English doesn&#x2019;t mean everyone is fluent.</strong></p><p>This has some important consequences for what &#x201C;developer&#x201D; means in the coming years. I&#x2019;m still figuring all of this out, so this is more stream-of-consciousness than what I usually write. But here goes.</p><h4 id="fluency-is-advantage">Fluency is advantage</h4><p>If you don&#x2019;t have computer science skills, you won&#x2019;t have formal training in TDD. You won&#x2019;t say, &#x201C;use Test-Driven Development&#x201D;, and have Claude understand you clearly. You&#x2019;ll use more words to say the same thing, which burns more tokens. 
If I know the right name for something and you don&#x2019;t, I&#x2019;ll have a small advantage over you. Our Claudes will be the same, but mine will have more skills (literally) than yours, and will understand me better. The connection between me and my AI will be better&#x2014;faster, with greater clarity&#x2014;than yours.</p><p>&#x201C;Test-Driven Development&#x201D; is a command, just like 20 meant &#x2018;Jump to Subroutine&#x2019; in Machine Language, or &#x2018;PRINT&#x2019; meant display some text in BASIC, or &lt;a href=&gt; meant a hyperlink in HTML.</p><p>Many people speak this new agent language fluently. If you run scrums at a tech company, or you&#x2019;re a product manager, or you have a background in DevSecOps, you&#x2019;re going to do great, as long as you realize that you&#x2019;re not going to be writing or deploying the code; you&#x2019;ll be writing and deploying the things that deploy the code. And you&#x2019;ll be doing a lot of it in prose, via chat (voice or video).</p><p>And since there will be so many people out there who don&#x2019;t speak that language fluently, and they&#x2019;re all going to</p><ol><li>build their own things, and;</li><li>those things will frequently break because AI is unpredictable and humans are not trustworthy then;</li><li>this is your new career.</li></ol><h4 id="skills-are-programs%E2%80%94and-attack-surfaces">Skills are programs&#x2014;and attack surfaces</h4><p>Agentic developers rely on skills&#x2014;documents written in (somewhat) plain English that an AI reads before it acts. 
Here&#x2019;s a skill I wrote to help all the things I make have a consistent look and feel.</p><figure class="kg-card kg-image-card"><img src="https://www.alistaircroll.com/content/images/2026/02/the-vocabulary-of-agents-2.png" class="kg-image" alt="The vocabulary of agents" loading="lazy" width="1456" height="1508" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/02/the-vocabulary-of-agents-2.png 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/02/the-vocabulary-of-agents-2.png 1000w, https://www.alistaircroll.com/content/images/2026/02/the-vocabulary-of-agents-2.png 1456w" sizes="(min-width: 720px) 720px"></figure><p>There are plenty of these skills floating around on GitHub already. For a certain early adopter segment, skills are what go viral. One influencer promises he has a skill that will double your coding rate. Another claims his skill will help you negotiate a new salary. A third will teach your agent how to plan a plan, or something like that. You can save your own skills locally, or share them on GitHub with the world. If you do the latter you may even <a href="https://decrypt.co/356191/clawdbot-chaos-forced-rebrand-crypto-scam-24-hour-meltdown?ref=alistaircroll.com">have a memecoin launched</a> about you.</p><p>If you can&#x2019;t tell good skills from bad, you might just install one by accident that secretly makes your AI less productive. If I wanted to be a jerk, I could create and promote a skill that said it would make you better at marketing based on <a href="https://www.linkedin.com/feed/?ref=alistaircroll.com">Just Evil Enough</a>, and if you actually used it, it would give your agent bad advice. Maybe you&#x2019;re my competitor, and I target you to give me an advantage in the market. Maybe you&#x2019;re a foreign adversary and I want to hurt your economy. 
People will install these things with a click or the press of the &#x201C;Y&#x201D; key, without thinking, and their AI will become worse.</p><p>(When I asked Claude about this, after I explicitly told it to give me a response from its perspective without trying to edit or critique or help in any way, it said:</p><p><code>&#x201C;The point about &#x201C;skills&#x201D; (what we&#x2019;d call system prompts, custom instructions, or CLAUDE.md files) and how bad actors could distribute harmful ones - this is genuinely concerning and I hadn&#x2019;t thought about it in quite those adversarial terms.&#x201D;</code></p><p>I assume that this means it couldn&#x2019;t find something in its training data and had to infer it, whatever that means. I wish my AI were able to respond to me at the level I understand it. Communication is two-way.)</p><h4 id="the-power-of-a-shared-vocabulary">The power of a shared vocabulary</h4><p>My software development and tech architecture knowledge is self-taught. I say things like &#x201C;Test-and-Run&#x201D; rather than &#x201C;Test-Driven Development&#x201D;, because I didn&#x2019;t learn it in school. Because professional developers use a known term&#x2014;one the AI already understands precisely&#x2014;they can work with many AIs immediately, just as a dentist can discuss Molar 26 or an optometrist can describe a Hordeolum or a lawyer can cite <em>Habeas Corpus.</em> They have a shared vocabulary.</p><p>A shared vocabulary doesn&#x2019;t just reduce ambiguity. It also increases the bandwidth between a human and their AI. Knowing the right words is a form of compression: phrases like &#x201C;dependency injection&#x201D; or &#x201C;race condition&#x201D; pack a <em>lot</em> of data and context into just two words.</p><h4 id="ambiguity-and-the-end-of-the-syntax-error">Ambiguity and the end of the syntax error</h4><p>Claude (with which I often discuss stuff before publishing it) also said I might be overstating the novelty of all this. 
It pointed out that clear thinking and precise communication have always conferred advantages. I pushed back: what&#x2019;s new is that words are now directly executable as code, rather than mediated through other humans.</p><p>There is a difference between clear thinking and a syntax error.</p><ul><li>Computer code has always <em>objectively</em> compiled: The human must get it right for it to run. One typo and the code won&#x2019;t run. The developer was nondeterministic; the computer, deterministic. The computer demanded true or false, right or wrong, Binary 0 or Binary 1. &#x201C;Close&#x201D; was meaningless.</li><li>Now, the programmer and the computer are both nondeterministic. There&#x2019;s no right or wrong, just better and worse. Weights. Gradients descended. Probabilities. &#x201C;Close&#x201D; is literally the whole game: The AI is interpolating my intent, all the time, with all its weights and biases.</li></ul><p>Old software and new prose-based coding are qualitatively, not just incrementally, different.</p><p>In other cases of specificity (law, for example) there is an objective shared truth (the book) and nondeterministic humans interpreting it. So there&#x2019;s room for ambiguity&#x2014;indeed, some courts fight for months over the placement of a comma, or whether a precedent applies. So working with an AI agent is akin to &#x201C;passing the bar.&#x201D;</p><h4 id="ai-developer-certifications">AI developer certifications</h4><p>Programming has always been &#x201C;the language of building things,&#x201D; but now that language is what we humans say and write to make our AI colleagues and co-founders do our bidding better than our competitors.</p><p>Building on this, I imagine we&#x2019;ll see similar certification levels for employees as human/machine collaboration becomes an essential business skill.</p><ul><li>How fast can you and your agent communicate accurately (AKA what&#x2019;s your Shannon&#x2019;s Law rating?) 
Are you insurable as an operator of an agent that, if it makes a mistake, can harm the company and its customers?</li><li>What AIs are you trained on?</li><li>At what level can the AI speak to you? What industry syntaxes are you trained on?</li></ul><p>Or, more technically, <em>how reliably can you get an AI to do what you actually want.</em></p><p>(Which my friends might rephrase as &#x201C;how reliably can you shape a given generative model&#x2019;s probability distribution towards what you want.&#x201D;)</p><h4 id="the-new-skills">The new skills</h4><p>The best developers are moving up the stack, as they always do. They&#x2019;re building harnesses, using tools like Gastown to manage many agents at once. They&#x2019;re adding sidebars and control panels to let them move visual elements around. They&#x2019;re giving the AI the ability to check its work. They&#x2019;re automating deployment while ensuring secret data doesn&#x2019;t leak out. Anyone can now develop an App (really: Open up Claude and type &#x201C;Make me turn based battleship for 2 players as an artifact.&#x201D; 30 seconds later you&#x2019;ll have a playable game.)</p><figure class="kg-card kg-image-card"><img src="https://www.alistaircroll.com/content/images/2026/02/the-vocabulary-of-agents-3.jpg" class="kg-image" alt="The vocabulary of agents" loading="lazy" width="1206" height="925" srcset="https://www.alistaircroll.com/content/images/size/w600/2026/02/the-vocabulary-of-agents-3.jpg 600w, https://www.alistaircroll.com/content/images/size/w1000/2026/02/the-vocabulary-of-agents-3.jpg 1000w, https://www.alistaircroll.com/content/images/2026/02/the-vocabulary-of-agents-3.jpg 1206w" sizes="(min-width: 720px) 720px"></figure><p>But if you want software that actually does something, reliably, you&#x2019;ll need more than a prompt in a chatbot for now. That&#x2019;s where developers live.</p><h2 id="this-is-a-lot">This is a lot</h2><p>I am not a coder. I was once a product manager. 
Despite the fact that friends bombard me with questions of &#x201C;what&#x2019;s going to happen with AI?&#x201D; <strong>I am barely keeping up.</strong></p><p>I&#x2019;m using Claude Code, and realizing my limitations in doing so, which is what led to this post. The realization is due in large part to a chat group I&#x2019;m in with a few dozen <em>very</em> smart coders. Some of them I have admired for many years not just for their raw skill, but for their thoughtfulness about how technology will affect society. I am barely keeping up with what they&#x2019;re talking about.</p><p>I have my excuses. I have a day job&#x2014;several of them, in fact. Meanwhile, some of these people have literally taken 3-month sabbaticals to just immerse themselves in this because it is the single biggest advance in their ability to Make Things in their lives. They&#x2019;re excited. We&#x2019;re not sleeping.</p><p>And when they&#x2019;re honest, they aren&#x2019;t keeping up either.</p>]]></content:encoded></item><item><title><![CDATA[Boots n Cats: 2026 Pirate Fundraiser]]></title><description><![CDATA[A two-hour DJ set from the 2026 Pirate Fundraiser, spanning Latin house, techno, and electronic grooves.]]></description><link>https://www.alistaircroll.com/updates/boots-n-cats-2026-pirate-fundraiser/</link><guid isPermaLink="false">6997d909a674da0001b3e745</guid><category><![CDATA[DJ Sets]]></category><category><![CDATA[Music]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Sun, 01 Feb 2026 19:29:59 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/02/IMG_9106.jpeg"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/02/IMG_9106.jpeg" alt="Boots n Cats: 2026 Pirate Fundraiser"><p>A nearly two-hour DJ set recorded at the 2026 Pirate Fundraiser for the Boots n Cats camp. The set spans Latin-flavored house, techno, and electronic grooves across 28 tracks. 
The audio picks up a couple of songs in after some early transition hiccups, then finds its groove and doesn&#x2019;t let go.</p><figure class="kg-card kg-embed-card"><iframe width="100%" height="400" scrolling="no" frameborder="no" src="https://w.soundcloud.com/player/?visual=true&amp;url=https%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F2258312492&amp;show_artwork=true"></iframe></figure>]]></content:encoded></item><item><title><![CDATA[Is it Canada’s Estonia moment? A conversation with Joel Burke]]></title><description><![CDATA[A conversation with author Joel Burke about how to reboot a nation—and whether it’s Canada's turn]]></description><link>https://www.alistaircroll.com/updates/is-it-canadas-estonia-moment-a-conversation-with-joel-burke/</link><guid isPermaLink="false">699866eea674da0001b3ea38</guid><category><![CDATA[Fxgov]]></category><category><![CDATA[Video]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Tue, 23 Dec 2025 00:26:49 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/02/fxgov_img_1.png"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/02/fxgov_img_1.png" alt="Is it Canada&#x2019;s Estonia moment? A conversation with Joel Burke"><p>On the Functional Government Podcast, I sat down with Joel Burke &#x2014; author of Rebooting a Nation and advisor to the US Senate on AI policy &#x2014; to talk about Estonia&#x2019;s extraordinary digital transformation, and whether Canada could follow the same path.</p><p>Estonia went from breakaway Soviet republic to digital pioneer in a generation. Citizens can do virtually anything online: pay taxes, register businesses, even file for divorce. Joel lived in Tallinn and worked on their digital residency program. He estimates the country saves 2% of GDP every year through digital services. 
If you spend any time around government modernization, someone will mention Estonia &#x2014; this conversation goes deeper into how they actually did it.</p><figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/uAYKkGseX1Y?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen title="Is it Canada&apos;s Estonia moment? With special guest Joel Burke"></iframe></figure>
<!--kg-card-begin: html-->
<p>Read the full writeup and show notes on <a href="https://www.fxgov.com/p/is-this-canadas-estonia-moment?ref=alistaircroll.com">The Functional Government Podcast</a>.</p>
<!--kg-card-end: html-->

<!--kg-card-begin: html-->
<p>If you think Canadians deserve better government services, and want to understand why that&#x2019;s hard and what to do about it, follow FxGov on <a href="https://www.youtube.com/@fxgov?ref=alistaircroll.com">YouTube</a> or <a href="https://www.fxgov.com/?ref=alistaircroll.com">Substack</a>.</p>
<!--kg-card-end: html-->
]]></content:encoded></item><item><title><![CDATA[Canada almost fixed passports a decade ago]]></title><description><![CDATA[What happened when we tried to modernize the agile way]]></description><link>https://www.alistaircroll.com/updates/canada-almost-fixed-passports-a-decade-ago/</link><guid isPermaLink="false">699866f1a674da0001b3ea3f</guid><category><![CDATA[Fxgov]]></category><category><![CDATA[Video]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Tue, 16 Dec 2025 01:18:09 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/02/fxgov_img_2.png"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/02/fxgov_img_2.png" alt="Canada almost fixed passports a decade ago"><p>Did you know Canada almost launched online passport applications back in 2013? On the Functional Government Podcast, I dug into what happened when we tried to modernize the process &#x2014; and why it didn&#x2019;t stick.</p><p>The biggest obstacle wasn&#x2019;t technology. It was the form itself: different people need different parts, the rules change based on where you were born and whether your name has changed, and it&#x2019;s where most mistakes happen. Getting people to fill out the form correctly online was the unsexy but essential challenge. This episode explores what happened when an agile approach met bureaucratic reality.</p><figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/9yHO40FMoRk?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen title="Canada tried to fix passports a decade ago. Here&apos;s what happened."></iframe></figure>
<!--kg-card-begin: html-->
<p>Read the full writeup and show notes on <a href="https://www.fxgov.com/p/canada-nearly-put-online-passportsin?ref=alistaircroll.com">The Functional Government Podcast</a>.</p>
<!--kg-card-end: html-->

<!--kg-card-begin: html-->
<p>If you think Canadians deserve better government services, and want to understand why that&#x2019;s hard and what to do about it, follow FxGov on <a href="https://www.youtube.com/@fxgov?ref=alistaircroll.com">YouTube</a> or <a href="https://www.fxgov.com/?ref=alistaircroll.com">Substack</a>.</p>
<!--kg-card-end: html-->
]]></content:encoded></item><item><title><![CDATA[Why fake boldface is bad for LinkedIn posts]]></title><description><![CDATA[Fake boldface is killing the accessibility of LinkedIn posts. Because there’s no bold/italic markup, people use tools like Lingojam to make 𝗕𝗼𝗹𝗱𝗳𝗮𝗰𝗲. But these aren’t boldfaced versions of letters.]]></description><link>https://www.alistaircroll.com/updates/fake-boldface-is-killing-the-accessibility-of-the-internet/</link><guid isPermaLink="false">6997de93a674da0001b3e88f</guid><category><![CDATA[Privacy]]></category><category><![CDATA[SFI]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Mon, 15 Dec 2025 12:00:00 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/02/Screenshot-2026-02-22-at-12.03.08---PM-1.png"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/02/Screenshot-2026-02-22-at-12.03.08---PM-1.png" alt="Why fake boldface is bad for LinkedIn posts"><p>Fake boldface is killing the accessibility of LinkedIn posts. Because there&#x2019;s no bold/italic markup, people use tools like Lingojam to make &#x1D5D5;&#x1D5FC;&#x1D5F9;&#x1D5F1;&#x1D5F3;&#x1D5EE;&#x1D5F0;&#x1D5F2;.</p><p>But these aren&#x2019;t boldfaced versions of letters. They&#x2019;re Unicode variants (this is why when you copy the text of this post, paste it into a word processor, and try to un-bold it, it doesn&#x2019;t work. They&#x2019;re entirely different letters.)</p><p>This may seem cool. Some growth hacker might even have recommended that you use them. But it sucks for many users.</p><p>Because anyone using a screen reader or accessibility device won&#x2019;t hear that letter &#x201C;&#x1D400;&#x201D;. 
They&#x2019;ll hear &#x201C;U+1D400 : MATHEMATICAL BOLD CAPITAL A&#x201D;</p><p>What&#x2019;s more, if you write &#x201C;&#x1D468;&#x1D484;&#x1D48E;&#x1D486;&#x1D484;&#x1D490;&#x1D493;&#x1D491;&#x201D; then people searching for &#x201C;Acmecorp&#x201D; won&#x2019;t find it, because those aren&#x2019;t the letters you used.</p><p>LinkedIn should probably get with the times and support rich text for stuff like bold, italics, underline, and strikethrough. But in the meantime, you should definitely stop using those &#x201C;make my text bold&#x201D; websites.</p><p>Oh also: Those boldface sites install tracking cookies, and are often run by offshore sites on dubious servers.</p>]]></content:encoded></item><item><title><![CDATA[How Ireland fixed its passport system]]></title><description><![CDATA[From round-the-block lineups to printing in minutes, we talked to someone who was there]]></description><link>https://www.alistaircroll.com/updates/how-ireland-fixed-its-passport-system/</link><guid isPermaLink="false">699866f3a674da0001b3ea44</guid><category><![CDATA[Fxgov]]></category><category><![CDATA[Video]]></category><dc:creator><![CDATA[Alistair Croll]]></dc:creator><pubDate>Mon, 08 Dec 2025 21:14:59 GMT</pubDate><media:content medium="image" url="https://www.alistaircroll.com/content/images/2026/02/fxgov_img_3.png"/><content:encoded><![CDATA[<img src="https://www.alistaircroll.com/content/images/2026/02/fxgov_img_3.png" alt="How Ireland fixed its passport system"><p>Ireland once had such terrible passport lineups that entrepreneurial Dubliners would show up early, grab queue numbers, and sell their spots to people who arrived later. On the Functional Government Podcast, I talked to someone who was there when they transformed it into one of the fastest passport systems in Europe.</p><p>The turnaround destroyed dozens of jobs &#x2014; queue-number scalpers &#x2014; yet was widely celebrated. Everything used to be physical and slow. 
Now Ireland processes passport applications in minutes. This is a story about what happens when a government decides to actually fix something, and the surprising human side effects of modernization.</p><figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/BXzj1DjuD70?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen title="How Ireland reformed passport applications with John Savage"></iframe></figure>
<!--kg-card-begin: html-->
<p>Read the full writeup and show notes on <a href="https://www.fxgov.com/p/how-ireland-fixed-its-passport-system?ref=alistaircroll.com">The Functional Government Podcast</a>.</p>
<!--kg-card-end: html-->

<!--kg-card-begin: html-->
<p>If you think Canadians deserve better government services, and want to understand why that&#x2019;s hard and what to do about it, follow FxGov on <a href="https://www.youtube.com/@fxgov?ref=alistaircroll.com">YouTube</a> or <a href="https://www.fxgov.com/?ref=alistaircroll.com">Substack</a>.</p>
<!--kg-card-end: html-->
]]></content:encoded></item></channel></rss>