<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xml" href="https://jonnyzzz.com/feed.xslt.xml"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <link href="https://jonnyzzz.com/feed-internal.xml" rel="self" type="application/atom+xml" />
  <link href="https://jonnyzzz.com/" rel="alternate" type="text/html" />
  <updated>2026-04-18T15:13:57+00:00</updated>
  <id>/</id>

  
  <title type="html">Eugene Petrenko</title>
  

  
  <subtitle>Founding Engineering Leader | Agentic AI DevTools &amp; Experience</subtitle>
  

  

  
  
  <entry>
    <title type="html">IntelliJ Plugin Hot-Reload: HTTP 413</title>
    <link href="https://jonnyzzz.com/blog/2026/04/17/intellij-plugin-hot-reload-413/" rel="alternate" type="text/html" title="IntelliJ Plugin Hot-Reload: HTTP 413" />
    <published>2026-04-17T00:00:00+00:00</published>
    <updated>2026-04-17T00:00:00+00:00</updated>
    <id>/blog/2026/04/17/intellij-plugin-hot-reload-413</id>
    <content type="html" xml:base="https://jonnyzzz.com/blog/2026/04/17/intellij-plugin-hot-reload-413/">&lt;p&gt;185 MB. That’s where my IntelliJ plugin stopped hot-reloading.&lt;/p&gt;

&lt;p&gt;I’ve been working on &lt;a href=&quot;https://github.com/jonnyzzz/mcp-steroid&quot;&gt;MCP Steroid&lt;/a&gt;, an IntelliJ plugin
that exposes IDE APIs to LLM agents via an MCP server. It bundles a Kotlin compiler, an OCR
engine, and enough other dependencies to reach &lt;strong&gt;185 MB&lt;/strong&gt; as a ZIP.&lt;/p&gt;

&lt;p&gt;That size mattered because MCP Steroid changes fast. An agent tries a skill, hits an edge case, I
fix the handler, redeploy, and the agent tries again. Restarting the IDE for every build kills
that feedback loop. That is why I built
&lt;a href=&quot;https://github.com/jonnyzzz/intellij-plugin-hot-reload&quot;&gt;intellij-plugin-hot-reload&lt;/a&gt;
(&lt;a href=&quot;/blog/2026/01/05/intellij-plugin-hot-reload/&quot;&gt;background post&lt;/a&gt;): an HTTP endpoint
on IntelliJ’s built-in server on port 63342 that deploys plugins without a restart. My Gradle
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deployPlugin&lt;/code&gt; task finds running IDEs via marker files, POSTs the ZIP, and the IDE reloads
the plugin on the fly.&lt;/p&gt;

&lt;p&gt;Then I got this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;→ http://localhost:63342/api/plugin-hot-reload
  ✗ HTTP 413
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;the-investigation&quot;&gt;The Investigation&lt;/h2&gt;

&lt;p&gt;I first assumed the bug was in my handler. It wasn’t. IntelliJ returned 413 before my code ever
ran. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;curl -v&lt;/code&gt; showed the same thing:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;curl &lt;span class=&quot;nt&quot;&gt;-v&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-X&lt;/span&gt; POST &lt;span class=&quot;s2&quot;&gt;&quot;http://localhost:63342/api/plugin-hot-reload&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;-H&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Authorization: Bearer &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$TOKEN&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;-H&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Content-Type: application/octet-stream&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;nt&quot;&gt;--data-binary&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;@mcp-steroid-185mb.zip&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;gt; Content-Length: 194059789
&amp;gt; Expect: 100-continue
&amp;gt;
&amp;lt; HTTP/1.1 413 Request Entity Too Large
&amp;lt; content-length: 0
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;An empty body. No hint. Just 413.&lt;/p&gt;

&lt;p&gt;IntelliJ’s built-in HTTP server uses Netty. The
&lt;a href=&quot;https://github.com/JetBrains/intellij-community/blob/master/platform/platform-util-netty/src/org/jetbrains/ide/HttpRequestHandler.kt#L80-L87&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HttpRequestHandler&lt;/code&gt;&lt;/a&gt;
extension point receives a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FullHttpRequest&lt;/code&gt;. That word matters. Netty’s
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HttpObjectAggregator&lt;/code&gt; builds the full request body before the handler runs, and it returns 413
when the request exceeds its limit.&lt;/p&gt;

&lt;p&gt;The aggregator is added in
&lt;a href=&quot;https://github.com/JetBrains/intellij-community/blob/master/platform/platform-util-netty/src/org/jetbrains/io/NettyUtil.java#L107-L109&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NettyUtil.addHttpServerCodec()&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;pipeline&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addLast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;httpObjectAggregator&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;HttpObjectAggregator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;MAX_CONTENT_LENGTH&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MAX_CONTENT_LENGTH&lt;/code&gt; comes from
&lt;a href=&quot;https://github.com/JetBrains/intellij-community/blob/master/platform/platform-util-netty/src/org/jetbrains/io/NettyUtil.java#L34-L43&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NettyUtil.java&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;final&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MAX_CONTENT_LENGTH&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maxContentLength&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;180&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;maxContentLength&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;parseInt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;
      &lt;span class=&quot;nc&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;getProperty&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;ide.netty.max.frame.size.in.mb&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;180&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1024&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1024&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;catch&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;NumberFormatException&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ignore&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;MAX_CONTENT_LENGTH&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maxContentLength&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;180 MB. My ZIP is 185 MB.&lt;/strong&gt;&lt;/p&gt;
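&lt;p&gt;To make the gap concrete, here is the same arithmetic as a small standalone Java sketch. The
class and method names are mine, not IntelliJ’s, and I use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;long&lt;/code&gt; where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NettyUtil&lt;/code&gt; uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;int&lt;/code&gt;.
The content length is the one from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;curl&lt;/code&gt; trace above.&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;public class FrameLimit {
  // The property value is in megabytes; the aggregator limit is in bytes.
  public static long limitBytes(int megabytes) {
    return megabytes * 1024L * 1024L;
  }

  public static void main(String[] args) {
    long limit = limitBytes(180);        // default of ide.netty.max.frame.size.in.mb
    long contentLength = 194_059_789L;   // Content-Length from the curl trace
    System.out.println(&quot;limit     = &quot; + limit);
    System.out.println(&quot;overshoot = &quot; + (contentLength - limit));
  }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The limit works out to 188743680 bytes, so the request was 5316109 bytes over, roughly 5 MB.&lt;/p&gt;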

&lt;p&gt;The pipeline is wired in
&lt;a href=&quot;https://github.com/JetBrains/intellij-community/blob/master/platform/built-in-server/src/org/jetbrains/io/PortUnificationServerHandler.java#L100-L102&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PortUnificationServerHandler&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;isHttp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;magic1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;magic2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;nc&quot;&gt;NettyUtil&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addHttpServerCodec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pipeline&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;pipeline&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;addLast&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;delegatingHttpHandler&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delegatingHttpRequestHandler&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Every request on IntelliJ’s built-in server goes through that path. There is no per-handler
override. Once I saw that, the shape of the problem changed: fight the limit, or stop uploading
the body.&lt;/p&gt;

&lt;h2 id=&quot;first-idea-another-http-server&quot;&gt;First Idea: Another HTTP Server&lt;/h2&gt;

&lt;p&gt;My first workaround was the obvious one: spin up a separate Netty or Ktor server with a larger body limit.
&lt;strong&gt;My AI agent suggested it as the straightforward solution, and I rejected it. Too much
work for such a simple use case.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It would work, but it would also mean:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Managing a second server lifecycle, plus port allocation and conflicts&lt;/li&gt;
  &lt;li&gt;Teaching every client a second endpoint convention&lt;/li&gt;
  &lt;li&gt;Debugging one more moving part when reload fails&lt;/li&gt;
  &lt;li&gt;Reimplementing authentication, TLS, and shutdown concerns the built-in server already handles&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I rejected building it. Then I asked for a better solution.&lt;/p&gt;

&lt;h2 id=&quot;better-solution-pass-a-local-path&quot;&gt;Better Solution: Pass a Local Path&lt;/h2&gt;

&lt;p&gt;The hot-reload plugin and the Gradle build run on the same machine. The ZIP already exists on
disk. I was uploading 185 MB over loopback so the server could write it to a temp file and hand
that path to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PluginInstaller.installAndLoadDynamicPlugin()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The file was already local. I only needed to pass the path.&lt;/p&gt;

&lt;p&gt;On the server side, the handler checks a query parameter before touching the request body:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;localDiskFile&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urlDecoder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;parameters&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;local-disk-file&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;?.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;firstOrNull&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;localDiskFile&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;localDiskFile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(!&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;sendError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;request&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;nc&quot;&gt;HttpResponseStatus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;BAD_REQUEST&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;&quot;File not found: $localDiskFile&quot;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;executeReload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;context&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reloadService&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;progress&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;reloadService&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;reloadPluginFromZipFile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;progress&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reloadPluginFromZipFile&lt;/code&gt; reads the ZIP from disk, extracts the plugin ID, and passes the path to
IntelliJ’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;installAndLoadDynamicPlugin()&lt;/code&gt;. My code never creates a 185 MB byte array.&lt;/p&gt;

&lt;p&gt;On the client side, the Gradle task got smaller too:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;encodedPath&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;URLEncoder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;encode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;absolutePath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Charsets&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;UTF_8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;conn&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  &lt;span class=&quot;nc&quot;&gt;URI&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;$url?local-disk-file=$encodedPath&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;toURL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;openConnection&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;HttpURLConnection&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;apply&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;requestMethod&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;POST&quot;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;doOutput&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;false&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;doInput&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;true&lt;/span&gt;
  &lt;span class=&quot;nf&quot;&gt;setRequestProperty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Authorization&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;connectTimeout&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5_000&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;readTimeout&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;300_000&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;nf&quot;&gt;check&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;responseCode&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;200&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;299&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;HTTP ${conn.responseCode}&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;conn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;inputStream&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;bufferedReader&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;lineSequence&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;forEach&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;println&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;  $it&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;No body. No &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Content-Type&lt;/code&gt;. Just a short authenticated POST with a local path.&lt;/p&gt;
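&lt;p&gt;The only subtle part on the client is the encoding of the path. Here is a minimal plain-Java sketch of
that step, assuming the server decodes the query string with the usual form-style rules. The
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ReloadUrl&lt;/code&gt; class and its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build&lt;/code&gt; method are hypothetical names for illustration, not part of the plugin.&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ReloadUrl {
  // Appends the URL-encoded ZIP path as the local-disk-file query parameter.
  public static URI build(String baseUrl, String zipPath) {
    String encoded = URLEncoder.encode(zipPath, StandardCharsets.UTF_8);
    return URI.create(baseUrl + &quot;?local-disk-file=&quot; + encoded);
  }

  public static void main(String[] args) {
    System.out.println(build(&quot;http://localhost:63342/api/plugin-hot-reload&quot;, &quot;/path/to/plugin.zip&quot;));
  }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Every &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/&lt;/code&gt; in the path becomes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;%2F&lt;/code&gt;, so a path cannot be mistaken for part of the URL itself.&lt;/p&gt;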

&lt;h2 id=&quot;why-this-is-better&quot;&gt;Why This Is Better&lt;/h2&gt;

&lt;p&gt;This wasn’t just a workaround. It was the design I should’ve used from the start.&lt;/p&gt;

&lt;p&gt;The body-upload path forces Netty to aggregate the entire ZIP before my handler runs. At 185 MB,
that’s a large allocation even on the happy path. If someone gets the Bearer token, it is also an
easy memory amplifier.&lt;/p&gt;

&lt;p&gt;The file-path flow avoids that:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;No 185 MB buffer. The HTTP layer sees an empty body and hands a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Path&lt;/code&gt; to the plugin installer.&lt;/li&gt;
  &lt;li&gt;The ZIP stays on disk instead of moving through a socket and an in-memory aggregator.&lt;/li&gt;
  &lt;li&gt;The handler can verify existence, size, and, if needed, a checksum before the reload starts.&lt;/li&gt;
  &lt;li&gt;More secure: an attacker who can reach the localhost-bound web server and somehow knows the token still needs a plugin file already on the disk.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a dramatic security boundary change. The caller is already authenticated and trusted
to supply a local path. But it replaces “accept arbitrary bytes over the network” with
“read this specific file.” That’s a better default.&lt;/p&gt;
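&lt;p&gt;The existence, size, and checksum checks could look roughly like this in plain Java. It is a sketch
under my own assumptions, not the code in the plugin: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ZipPrecheck&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;accept&lt;/code&gt;, and the
optional expected checksum are all hypothetical. The file is streamed through the digest, so memory
use does not grow with the ZIP size.&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ZipPrecheck {
  // Streams the file through SHA-256 using a 64 KB buffer and returns lowercase hex.
  public static String sha256(Path file) throws IOException, NoSuchAlgorithmException {
    MessageDigest digest = MessageDigest.getInstance(&quot;SHA-256&quot;);
    byte[] buffer = new byte[64 * 1024];
    try (InputStream in = Files.newInputStream(file)) {
      int read;
      while ((read = in.read(buffer)) != -1) {
        digest.update(buffer, 0, read);
      }
    }
    return String.format(&quot;%064x&quot;, new BigInteger(1, digest.digest()));
  }

  // Hypothetical pre-reload check: a non-empty regular file, plus an optional checksum match.
  public static boolean accept(Path file, String expectedSha256)
      throws IOException, NoSuchAlgorithmException {
    if (!Files.isRegularFile(file) || Files.size(file) == 0) {
      return false;
    }
    return expectedSha256 == null || expectedSha256.equalsIgnoreCase(sha256(file));
  }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Passing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;null&lt;/code&gt; as the expected checksum skips the hash and keeps only the cheap file checks.&lt;/p&gt;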

&lt;p&gt;I kept the body-upload path for plugins under 180 MB and for the rare case where client and
server do not share a filesystem. Backward compatibility was just an early &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;if&lt;/code&gt; in the handler.&lt;/p&gt;

&lt;h2 id=&quot;the-result&quot;&gt;The Result&lt;/h2&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;→ http://localhost:63342/api/plugin-hot-reload?local-disk-file=%2Fpath%2Fto%2Fplugin.zip
  Authorization: Bearer $TOKEN
  Starting plugin hot reload, zip size: 193,995,401 bytes
  Plugin ID: com.jonnyzzz.mcp-steroid
  Unloading existing plugin: MCP Steroid
  Plugin unloaded successfully
  Installing and loading plugin: MCP Steroid (0.92.0)
  Plugin MCP Steroid reloaded successfully
  SUCCESS
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;185 MB plugin. Zero bytes in the request body. Full hot reload in under 10 seconds. Better security.&lt;/p&gt;

&lt;h2 id=&quot;takeaways&quot;&gt;Takeaways&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;When an error is opaque, read the source.&lt;/strong&gt; The 413 had zero diagnostic information. Reading
IntelliJ’s
&lt;a href=&quot;https://github.com/JetBrains/intellij-community/blob/master/platform/platform-util-netty/src/org/jetbrains/io/NettyUtil.java#L34-L43&quot;&gt;NettyUtil.java&lt;/a&gt;
made the limit obvious.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Question the transport, not just the limit.&lt;/strong&gt; My first instinct was “make the limit bigger.”
The better question was “why am I transferring this data at all?”&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Localhost is still a network hop.&lt;/strong&gt; Even loopback HTTP goes through TCP, Netty codecs, and
aggregation. A local path skips that entire stack.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Global system properties are not a great API.&lt;/strong&gt;
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ide.netty.max.frame.size.in.mb&lt;/code&gt; is global and resolved at class load time. Not something I
want users editing in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;idea.vmoptions&lt;/code&gt;.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The hot-reload plugin with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;?local-disk-file=&lt;/code&gt; support is at
&lt;a href=&quot;https://github.com/jonnyzzz/intellij-plugin-hot-reload&quot;&gt;jonnyzzz/intellij-plugin-hot-reload&lt;/a&gt;
(v1.0.0). If your IntelliJ plugin is creeping toward 180 MB — or you just want a tighter
plugin development loop — give it a try.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;I’m looking forward to seeing this functionality supported natively in IntelliJ and the IntelliJ SDK.
That would make IntelliJ plugin development much easier. Share this post, and I’ll be happy to
contribute the fixes.&lt;/p&gt;

&lt;p&gt;Related posts: &lt;a href=&quot;/blog/2026/03/24/agentic-experience-and-tools/&quot;&gt;Agentic Experience and Tools&lt;/a&gt;,
&lt;a href=&quot;/blog/2026/04/07/mcp-steroid-open-source/&quot;&gt;MCP Steroid Is Now Open Source&lt;/a&gt;, and
&lt;a href=&quot;/blog/2026/04/08/mcp-steroid-skill-factory/&quot;&gt;IntelliJ as a Skill Factory&lt;/a&gt;.&lt;/p&gt;</content>

  
  
  
  
  

    <author>
      <name>Eugene Petrenko</name>
    </author>

  
    <category term="intellij" />
  
    <category term="plugins" />
  
    <category term="netty" />
  
    <category term="hot-reload" />
  
    <category term="mcp-steroid" />
  
    <summary type="html">A 185 MB IntelliJ plugin hit IntelliJ&apos;s 180 MB Netty request-body limit and broke hot reload. The fix was not to raise the limit, but to stop uploading the ZIP and pass a local file path instead.</summary>
  
  </entry>
  
  <entry>
    <title type="html">Building an IntelliJ Plugin That Works Across Multiple IDE Versions: 7 Approaches</title>
    <link href="https://jonnyzzz.com/blog/2026/04/15/intellij-plugin-multi-version-kotlin/" rel="alternate" type="text/html" title="Building an IntelliJ Plugin That Works Across Multiple IDE Versions: 7 Approaches" />
    <published>2026-04-15T00:00:00+00:00</published>
    <updated>2026-04-15T00:00:00+00:00</updated>
    <id>/blog/2026/04/15/intellij-plugin-multi-version-kotlin</id>
    <content type="html" xml:base="https://jonnyzzz.com/blog/2026/04/15/intellij-plugin-multi-version-kotlin/">&lt;p&gt;I have been building &lt;a href=&quot;https://github.com/JetBrains/mcp-steroid&quot;&gt;MCP Steroid&lt;/a&gt; — an IntelliJ plugin
that gives AI Agents access to the full IntelliJ Platform API at runtime. The plugin needs to work
across IntelliJ 2025.3 (build 253), 2026.1 (build 261), and the upcoming 2026.2 EAP.&lt;/p&gt;

&lt;p&gt;Right now the plugin uses a workaround: pin a specific Kotlin compiler version that happens to be
binary-compatible with one IJ generation before and one after the target. It works, but it becomes more
fragile with every Kotlin release, and with the K2.2 → K2.3 break between IJ 253 and 261,
it is reaching its limit.&lt;/p&gt;

&lt;p&gt;I also cannot just drop support for IJ 253. That version family covers Android Studio forks.
Dropping it and then needing it back later is expensive — users do not return easily once they are
forced to an alternative. So 253 stays as the floor, and I need to support everything up through
the current EAP trunk.&lt;/p&gt;

&lt;p&gt;That sounds like a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sinceBuild&lt;/code&gt; / &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;untilBuild&lt;/code&gt; problem. It is not.&lt;/p&gt;

&lt;p&gt;When I started investigating, I found three separate, deeply entangled problems — and one of them
has no elegant solution in the current Gradle toolchain. This post is the story of those problems
and the seven approaches I explored. The PoC for approaches 1 + 5 is at
&lt;a href=&quot;https://github.com/jonnyzzz/ij-multi-version&quot;&gt;github.com/jonnyzzz/ij-multi-version&lt;/a&gt;.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;table-of-contents&quot;&gt;Table of Contents&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;#the-problem-in-concrete-terms&quot;&gt;The Problem in Concrete Terms&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-seven-approaches&quot;&gt;The Seven Approaches&lt;/a&gt;
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#option-1-symlinks--multiple-gradle-subprojects&quot;&gt;Option 1: Symlinks + Multiple Gradle Subprojects&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#option-2-branch-per-ij-version&quot;&gt;Option 2: Branch per IJ Version&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#option-3-reflection&quot;&gt;Option 3: Reflection&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#option-4-shims--typed-adapters&quot;&gt;Option 4: Shims / Typed Adapters&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#option-5-template-build-scripts&quot;&gt;Option 5: Template Build Scripts&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#option-6-simplify-intellijs-api-surface&quot;&gt;Option 6: Simplify IntelliJ’s API Surface&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#option-7-docker-based-build-compatibility-tests&quot;&gt;Option 7: Docker-based Build Compatibility Tests&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-single-invocation-approach-what-jetbrains-uses-internally&quot;&gt;The Single-Invocation Approach: What JetBrains Uses Internally&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#build-time-verification-classfile-reference-checking&quot;&gt;Build-time Verification: ClassFile Reference Checking&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#build-issues-i-hit-along-the-way&quot;&gt;Build Issues I Hit Along the Way&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#scoring-summary&quot;&gt;Scoring Summary&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#what-jetbrains-could-do-better&quot;&gt;What JetBrains Could Do Better&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#where-to-start&quot;&gt;Where to Start&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#what-i-am-doing-for-mcp-steroid&quot;&gt;What I Am Doing for MCP Steroid&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;the-problem-in-concrete-terms&quot;&gt;The Problem in Concrete Terms&lt;/h2&gt;

&lt;h3 id=&quot;the-kotlin-version-coupling&quot;&gt;The Kotlin Version Coupling&lt;/h3&gt;

&lt;p&gt;IntelliJ is built with Kotlin and bundles Kotlin libraries. When your plugin runs inside
IJ, it uses the IDE’s bundled Kotlin runtime — not the one you compiled against. A plugin
should do its best to avoid bundling Kotlin libraries that are already on the IDE classpath.
JetBrains documents the version mapping officially on the
&lt;a href=&quot;https://plugins.jetbrains.com/docs/intellij/using-kotlin.html&quot;&gt;Configuring Kotlin Support&lt;/a&gt; page:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;IJ Version&lt;/th&gt;
      &lt;th&gt;Bundled Kotlin stdlib&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;2026.1 (261)&lt;/td&gt;
      &lt;td&gt;2.3.20&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2025.3 (253)&lt;/td&gt;
      &lt;td&gt;2.2.20&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2025.2 (252)&lt;/td&gt;
      &lt;td&gt;2.1.20&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2025.1 (251)&lt;/td&gt;
      &lt;td&gt;2.1.10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2024.3 (243)&lt;/td&gt;
      &lt;td&gt;2.0.21&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2024.2 (242)&lt;/td&gt;
      &lt;td&gt;1.9.24&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The 2.2.20 → 2.3.20 jump between 253 and 261 is not a routine version bump. JetBrains confirmed
directly on the platform forum:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;em&gt;“unfortunately it’s indeed impossible to use kotlin 2.2.20 anymore for IJ 261.
2.3.0/2.3.10 should be used instead.”&lt;/em&gt;
— Anna Kozlova (JetBrains), &lt;a href=&quot;https://platform.jetbrains.com/t/kotlin-plugin-compiled-with-later-kotlin-version-in-261-21525-39-eap-snapshot/3734&quot;&gt;IJ Platform Forum, Feb 2026&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This matters specifically if your plugin uses Kotlin compiler APIs — PSI trees, the Analysis API,
anything from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.jetbrains.kotlin.compiler.client.impl&lt;/code&gt;. Binary metadata changed between K2.2
and K2.3 in a way that is not backward-compatible for compiler uses.&lt;/p&gt;

&lt;p&gt;This adds a Kotlin version dimension: not only must the plugin be compiled against the right
Kotlin, but the Kotlin compiler it invokes at runtime must also match the version bundled in the
running IDE. The Kotlin libraries the plugin uses (such as Kotlin Coroutines) must be compatible as well.&lt;/p&gt;

&lt;p&gt;For &lt;a href=&quot;https://mcp-steroid.jonnyzzz.com&quot;&gt;MCP Steroid&lt;/a&gt;, which executes
Kotlin code snippets programmatically, we use the dedicated &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlinc&lt;/code&gt; compiler binary.
The compiler version is chosen per target: 2.2 for 253, and 2.4.0-beta for the more recent
versions of IntelliJ.
Binary backward/forward compatibility makes this possible, and it
simplifies the plugin’s setup a lot. We do not use the embedded Kotlin compiler
because it is not included in all IntelliJ-based IDEs, e.g., Rider and PyCharm.&lt;/p&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PlatformKotlinVersions&lt;/code&gt; map in the IJ Platform Gradle Plugin initially lacked a 261 entry,
silently defaulting to K2.2.20. This caused cryptic compile failures with no diagnostic. Fixed in
&lt;a href=&quot;https://github.com/JetBrains/intellij-platform-gradle-plugin/blob/main/CHANGELOG.md&quot;&gt;IJPGP v2.12.0&lt;/a&gt;
(released 2026-03-06), which mapped 261 → 2.3.10. The final IJ 2026.1 GA ships 2.3.20, as Kotlin
2.3.20 was released ten days after the IJPGP fix landed.&lt;/p&gt;

&lt;p&gt;If your plugin does not use compiler APIs — only pure Kotlin language features — the situation is
less severe. The &lt;a href=&quot;https://github.com/intellij-rust/intellij-rust&quot;&gt;intellij-rust&lt;/a&gt; plugin handles
this by pinning &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;apiVersion.set(KotlinVersion.KOTLIN_1_8)&lt;/code&gt; even when the IDE bundles 2.x,
trading access to newer language features for version-range freedom. A valid strategy if your
plugin does not need Kotlin 1.9+ features.&lt;/p&gt;
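
&lt;p&gt;A minimal sketch of that kind of pinning, assuming KGP 2.0 or newer, where the project-level
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compilerOptions&lt;/code&gt; block is available. It is an illustration, not a copy of the intellij-rust
build script, and which &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KotlinVersion&lt;/code&gt; constants are accepted depends on the KGP version you run:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// build.gradle.kts (sketch; assumes KGP 2.0+)
import org.jetbrains.kotlin.gradle.dsl.KotlinVersion

kotlin {
    compilerOptions {
        // Only allow stdlib declarations that already exist in the 1.8 API,
        // so the compiled classes do not reference newer stdlib members.
        apiVersion.set(KotlinVersion.KOTLIN_1_8)
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;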

&lt;h3 id=&quot;you-cannot-parametrize-the-kotlin-compiler-in-gradle&quot;&gt;You Cannot Parametrize the Kotlin Compiler in Gradle&lt;/h3&gt;

&lt;p&gt;This is the root cause of why all the “just add a flag” solutions fail.&lt;/p&gt;

&lt;p&gt;In a standard Gradle build, the Kotlin compiler version is a global property — it comes from
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;plugins { kotlin(&quot;jvm&quot;) version &quot;X.Y.Z&quot; }&lt;/code&gt; in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build.gradle.kts&lt;/code&gt;. There is no built-in
mechanism to compile one subproject with KGP 2.2.x and another with KGP 2.3.x in the same build
invocation. The IJ Platform Gradle Plugin resolves the target SDK classpath per subproject, but
the &lt;strong&gt;Kotlin language and API version&lt;/strong&gt; is a property of the KGP loaded into Gradle’s script
classloader — not of the SDK.&lt;/p&gt;

&lt;p&gt;One Gradle invocation, one KGP version, period. This is the constraint that drives the entire
multi-subproject architecture. Any solution to the multi-version problem that requires different
Kotlin compiler versions &lt;em&gt;must&lt;/em&gt; use separate Gradle invocations: separate subprojects,
separate CI jobs, or Docker containers.&lt;/p&gt;

&lt;p&gt;Alternative: use different Kotlin plugin versions in different sibling projects; the
parent Gradle projects must then not have KGP on the classpath. That makes the setup even more complicated.&lt;/p&gt;

&lt;h3 id=&quot;bundled-libraries-change-between-ij-versions&quot;&gt;Bundled Libraries Change Between IJ Versions&lt;/h3&gt;

&lt;p&gt;It is not just Kotlin. IntelliJ also bundles:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlinx.coroutines&lt;/code&gt;&lt;/strong&gt;: Since IJ 2024.2, JetBrains ships a patched fork at
&lt;a href=&quot;https://github.com/JetBrains/intellij-deps-kotlinx.coroutines&quot;&gt;github.com/JetBrains/intellij-deps-kotlinx.coroutines&lt;/a&gt;
under the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;org.jetbrains.intellij.deps.kotlinx&lt;/code&gt; group. If your plugin bundles its own coroutines
version, you will hit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ClassCastException&lt;/code&gt; at runtime when the bundled and shipped versions load
the same class name through different classloaders. The fix: declare coroutines as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compileOnly&lt;/code&gt;
and set &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlin.stdlib.default.dependency=false&lt;/code&gt;. But that means compiling against the oldest
bundled version you support.&lt;/p&gt;
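
&lt;p&gt;As a rough sketch, the two settings could look like this. The upstream coordinate and the
version are assumptions chosen for illustration; pick the version that corresponds to the oldest
IDE you support:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// build.gradle.kts (sketch; the version below is an assumption)
dependencies {
    // Visible to the compiler, but never packaged into the plugin ZIP.
    // At runtime the IDE supplies its own patched coroutines build.
    compileOnly(&quot;org.jetbrains.kotlinx:kotlinx-coroutines-core:1.10.1&quot;)
}

// gradle.properties
// kotlin.stdlib.default.dependency=false
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;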

&lt;p&gt;Pro-Tip: Set up an assertion that checks which libraries end up bundled in your plugin. Verify
that you are not re-packaging libraries the IDE already provides into your plugin’s lib folder.&lt;/p&gt;
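
&lt;p&gt;One possible shape for such an assertion is a Gradle task that inspects the built plugin ZIP.
This is a sketch under assumptions: the task name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;checkNoBundledIdeLibraries&lt;/code&gt; and the list of jar-name
prefixes are hypothetical, and it relies on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;buildPlugin&lt;/code&gt; being a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Zip&lt;/code&gt; task. It also calls
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;zipTree&lt;/code&gt; at execution time, so it is not written for the configuration cache:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// build.gradle.kts (sketch; task name and prefixes are hypothetical)
val forbiddenJarPrefixes = listOf(&quot;kotlin-stdlib&quot;, &quot;kotlinx-coroutines&quot;, &quot;kotlinx-serialization&quot;)

val checkNoBundledIdeLibraries by tasks.registering {
    // The ZIP produced by buildPlugin; wiring it as an input also adds the task dependency.
    val pluginZip = tasks.named&lt;Zip&gt;(&quot;buildPlugin&quot;).flatMap { it.archiveFile }
    inputs.file(pluginZip)

    doLast {
        // Collect every jar in the ZIP whose name starts with a forbidden prefix.
        val offenders = zipTree(pluginZip.get().asFile).files
            .map { it.name }
            .filter { name -&gt; name.endsWith(&quot;.jar&quot;) &amp;&amp; forbiddenJarPrefixes.any { name.startsWith(it) } }

        check(offenders.isEmpty()) { &quot;Plugin ZIP repackages IDE-provided libraries: $offenders&quot; }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Hooking the task into &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;check&lt;/code&gt; would make every CI build fail as soon as a transitive dependency
drags one of these jars back in.&lt;/p&gt;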

&lt;p&gt;Approximate bundled versions: IJ 252 → &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlinx-coroutines 1.10.1-intellij-4&lt;/code&gt;; IJ 261 →
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1.10.2-intellij-1&lt;/code&gt; (&lt;a href=&quot;https://github.com/aws/aws-toolkit-jetbrains/pull/6331&quot;&gt;aws-toolkit-jetbrains PR #6331&lt;/a&gt;).
The IJ 253 version is not in the official docs table but follows the same pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kotlin serialization&lt;/strong&gt; and other bundled libraries follow the same pattern.&lt;/p&gt;

&lt;h3 id=&quot;ij-platform-apis-change-without-a-stability-sla&quot;&gt;IJ Platform APIs Change Without a Stability SLA&lt;/h3&gt;

&lt;p&gt;JetBrains does not publish a formal API stability SLA. Per-API signals:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@ApiStatus.ScheduledForRemoval(inVersion=&quot;...&quot;)&lt;/code&gt; — will be removed; the version attribute tells
you when to act&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@ApiStatus.Internal&lt;/code&gt; — no compatibility promise; may break in a patch release&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@ApiStatus.Experimental&lt;/code&gt; — subject to change without notice&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;a href=&quot;https://plugins.jetbrains.com/docs/intellij/api-changes-list-2026.html&quot;&gt;API changes list for 2026&lt;/a&gt;
documents what broke. Deprecated APIs are typically removed after 2–4 major versions. Plugin
Verifier warnings on EAP builds give 1–3 months of lead time — but only if you are running
Plugin Verifier against those EAP builds.&lt;/p&gt;

&lt;p&gt;One concrete 2025.3 change: the unified IJ distribution merged Community and Ultimate, deprecating
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;intellijIdeaCommunity()&lt;/code&gt; in the Gradle plugin in favour of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;intellijIdea()&lt;/code&gt;. The EAP SNAPSHOT
artifacts still accept the old function call for now, but new plugin projects should use
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;intellijIdea()&lt;/code&gt; from the start.&lt;/p&gt;
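
&lt;p&gt;For a new project, the dependency declaration might look roughly like the sketch below. The
plugin and IDE versions are placeholders picked for illustration, and the Kotlin plugin and the
rest of the build are omitted:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// build.gradle.kts (sketch; versions are placeholders)
plugins {
    id(&quot;org.jetbrains.intellij.platform&quot;) version &quot;2.12.0&quot;
}

repositories {
    mavenCentral()
    intellijPlatform {
        defaultRepositories()
    }
}

dependencies {
    intellijPlatform {
        // Unified distribution; use this instead of intellijIdeaCommunity(...)
        intellijIdea(&quot;2025.3&quot;)
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;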

&lt;hr /&gt;

&lt;h2 id=&quot;the-seven-approaches&quot;&gt;The Seven Approaches&lt;/h2&gt;

&lt;h3 id=&quot;option-1-symlinks--multiple-gradle-subprojects&quot;&gt;Option 1: Symlinks + Multiple Gradle Subprojects&lt;/h3&gt;

&lt;p&gt;Create one Gradle subproject per target IJ version. Each has its own &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build.gradle.kts&lt;/code&gt; with the
correct &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlinLang&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ijSdk&lt;/code&gt; constants. Shared plugin sources live in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src-includes/&lt;/code&gt;. Each
subproject includes the shared sources via &lt;strong&gt;filesystem symlinks&lt;/strong&gt; into &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src-symlink/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Why symlinks? IntelliJ IDEA requires that each source root has a unique physical path. Two modules
pointing to the same directory confuse IDEA’s indexing. Symlinks give each module a unique path
that points to the same files.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;em&gt;“It creates quite a horrible setup on the IntelliJ side… because there is no standard way
to mount the same sources for multiple modules.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
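
&lt;p&gt;A minimal sketch of how such links could be created, using only the JDK. The function name and
parameters are hypothetical and not taken from the PoC; on Windows, creating symlinks may require
extra privileges:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import java.nio.file.Files
import java.nio.file.Path

// Hypothetical helper: links each shared folder into a module-specific directory,
// so every module gets its own physical path that points to the same files.
fun linkSharedSources(sharedRoot: Path, moduleLinkRoot: Path, folders: List&lt;String&gt;) {
    Files.createDirectories(moduleLinkRoot)
    for (name in folders) {
        val link = moduleLinkRoot.resolve(name)
        val target = sharedRoot.resolve(name).toAbsolutePath()

        // Keep an existing link that already points to the right place.
        if (Files.isSymbolicLink(link) &amp;&amp; Files.readSymbolicLink(link) == target) continue

        // Replace a stale link; this fails on a non-empty real directory by design.
        Files.deleteIfExists(link)
        Files.createSymbolicLink(link, target)
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;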

&lt;p&gt;The source directory naming convention:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;src-includes/main/
├── kotlin/          ← always compiled for all versions
├── java/            ← always compiled
├── resources/       ← always compiled
├── src-253/         ← only when ijMajor == 253
├── src-261/         ← only when ijMajor == 261
├── src-253-since/   ← when ijMajor &amp;gt;= 253
└── src-261-until/   ← when ijMajor &amp;lt;= 261
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Each subproject resolves which directories to link via a regex:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;folderMatchesMarker&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;folderName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Boolean&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;folderName&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;setOf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;java&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;kotlin&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;resources&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;true&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Regex&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;src-(\d{3})(?:-(since|until))?&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;matchEntire&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;folderName&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;?:&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Unrecognized directory &apos;$folderName&apos; in src-includes/&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;nnn&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;groupValues&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;toInt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;groupValues&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;since&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ijMajor&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nnn&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;until&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ijMajor&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nnn&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;    &lt;span class=&quot;p&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ijMajor&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nnn&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The PoC also includes a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;checkVersionCoverage&lt;/code&gt; task that validates there are no gaps in the
supported version range. It uses the IJ build-number formula — &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(year − 2000) × 10 + quarter&lt;/code&gt;,
with valid quarters 1–3 — to detect if, say, you support 253 and 261 but skipped 262 entirely
when 262 was required to be listed. Coverage gaps are caught at configuration time.&lt;/p&gt;
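
&lt;p&gt;The formula and the gap check can be sketched in plain Kotlin. The function names are
hypothetical and this is not the PoC’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;checkVersionCoverage&lt;/code&gt; implementation, only an illustration of
the arithmetic:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Baseline for a release, e.g. 2025.3 gives (2025 - 2000) * 10 + 3 = 253.
fun ijBaseline(year: Int, quarter: Int): Int {
    require(quarter in 1..3) { &quot;Valid quarters are 1..3, got $quarter&quot; }
    return (year - 2000) * 10 + quarter
}

// The baseline that follows the given one: 252, then 253, then 261.
fun nextBaseline(baseline: Int): Int =
    if (baseline % 10 &lt; 3) baseline + 1 else (baseline / 10 + 1) * 10 + 1

// Baselines between the lowest and highest supported version that are not listed.
fun coverageGaps(supported: Set&lt;Int&gt;): List&lt;Int&gt; {
    require(supported.all { it % 10 in 1..3 }) { &quot;Invalid baseline in $supported&quot; }
    if (supported.isEmpty()) return emptyList()

    val gaps = mutableListOf&lt;Int&gt;()
    var current = supported.min()
    val last = supported.max()
    while (current &lt; last) {
        current = nextBaseline(current)
        if (current !in supported) gaps += current
    }
    return gaps
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With this sketch, a supported set of 252 and 261 reports 253 as a gap, while 253 and 261 are
adjacent and report none.&lt;/p&gt;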

&lt;p&gt;Adding a new version: create &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ij-plugin/ij-NNN-kXY/build.gradle.kts&lt;/code&gt; (copy from existing, change
three constants at the top), add to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gradle/ij-builds.properties&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-properties highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# gradle/ij-builds.properties
&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;combinations&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;ij-253-k22,ij-261-k23&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Production precedent&lt;/strong&gt;: This approach is used in production with the IDE Services plugin, supporting
two years of IJ versions. It works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The advantage&lt;/strong&gt;: One &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./gradlew build&lt;/code&gt; compiles all versions. API breakages surface as compiler
errors immediately across the full version matrix. And right-clicking a version’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;runIde&lt;/code&gt;
task starts the debugger against that IDE version in one click.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The cost&lt;/strong&gt;: The symlink layer is unfamiliar to most developers. IDEA’s module tree grows with
the version count. Building all versions locally is slower than building one. Dependency
resolution takes longer, since more heavy IDE packages have to be downloaded.
The versioned source folders tend to accumulate code, so any new code added to folders like
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src-251&lt;/code&gt; must be carefully designed and approved.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict&lt;/strong&gt;: Best compile-time safety. Highest build complexity. Proven in production.&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;option-2-branch-per-ij-version&quot;&gt;Option 2: Branch per IJ Version&lt;/h3&gt;

&lt;p&gt;Maintain a separate Git branch per major IJ version. Build and deploy from the correct branch.&lt;/p&gt;

&lt;p&gt;Every bug fix must be cherry-picked to all active branches. If you have three branches — &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt;,
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;253&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;252&lt;/code&gt; — a single commit becomes three cherry-picks. Cherry-picks diverge. After six months
you have three codebases that are supposed to be the same plugin but differ in subtle ways you
cannot easily diff.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;em&gt;“The changes must be cherry-picked to all branches to make it work. And it’s definitely
quite easy to fail it. We did that before.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;JetBrains’ SDK documentation acknowledges branches as a last resort: “In certain scenarios where
fundamental incompatibilities cannot be resolved through conditional logic, maintaining separate
branches for each major IDE version may be necessary.” Last resort is the right framing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict&lt;/strong&gt;: Simple per-branch build. Terrible long-term maintenance. Hard no.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agentic&lt;/strong&gt;: This is easy to implement with an AI agent and clear instructions to follow.&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;option-3-reflection&quot;&gt;Option 3: Reflection&lt;/h3&gt;

&lt;p&gt;Call version-dependent APIs via Java reflection. Detect the IJ version at startup and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Method.invoke()&lt;/code&gt; the right implementation.&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;// Do not do this&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;method&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nc&quot;&gt;WindowManager&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;java&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getMethod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;getAllProjectFrames&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;catch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;NoSuchMethodException&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nc&quot;&gt;WindowManager&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;java&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getMethod&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;getProjectFrameHelpers&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// renamed in IJ 261&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Reflection turns build-time errors into runtime errors. If JetBrains removes the API entirely,
you will not know until a user reports a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NoSuchMethodException&lt;/code&gt; in production. The IDE cannot
help you — no autocomplete, no “find usages”, no refactoring support.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;em&gt;“The reflection will hide all the potential problems in the future and will create a much more
problematic approach.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Verdict&lt;/strong&gt;: Rejected. The ease of implementation does not justify losing compile-time safety.&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;option-4-shims--typed-adapters&quot;&gt;Option 4: Shims / Typed Adapters&lt;/h3&gt;

&lt;p&gt;Create a thin adapter layer — small modules, one per version-incompatible API surface. Each shim
implements a shared interface. A selector class picks the right implementation at runtime.&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;// Shared interface&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;interface&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;WindowLister&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;listProjectWindows&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;IdeFrame&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// shim-253 — compiled against IJ 253 SDK&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;WindowLister253&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;WindowLister&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;override&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;listProjectWindows&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;
        &lt;span class=&quot;nc&quot;&gt;WindowManager&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getInstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getAllProjectFrames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;toList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// shim-261 — compiled against IJ 261 SDK&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;WindowLister261&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;WindowLister&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;override&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;listProjectWindows&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;
        &lt;span class=&quot;nc&quot;&gt;WindowManager&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getInstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getProjectFrameHelpers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;toList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// Runtime selection&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;lister&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;WindowLister&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nc&quot;&gt;ApplicationInfo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getInstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;baselineVersion&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;261&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;WindowLister261&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;WindowLister253&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/bazelbuild/intellij&quot;&gt;bazelbuild/intellij&lt;/a&gt; Bazel plugin is the canonical
production example. They maintain &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sdkcompat/v252/&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sdkcompat/v253/&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sdkcompat/v261/&lt;/code&gt; —
supporting two stable versions simultaneously plus one under-development. Their three patterns:
&lt;strong&gt;Compat&lt;/strong&gt; (static utility class), &lt;strong&gt;Adapter&lt;/strong&gt; (superclass constructor changed), and &lt;strong&gt;Wrapper&lt;/strong&gt;
(new interface in superclass constructor). Every compat change is marked &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;// #api252&lt;/code&gt; to indicate
the last version requiring it, which makes cleanup obvious when that version is dropped.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/aws/aws-toolkit-jetbrains&quot;&gt;aws-toolkit-jetbrains&lt;/a&gt; uses a simpler variant:
versioned source directories &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src-243-253/&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src-261+/&lt;/code&gt; within a single Gradle project.
When &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PyAddSdkPanel&lt;/code&gt; was removed in IJ 261, its usage moved to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src-243-253/&lt;/code&gt; and the replacement
went into &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src-261+/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The critical discipline constraint&lt;/strong&gt;: all classes across version-specific directories must
expose the &lt;em&gt;same&lt;/em&gt; public API surface. You can only change implementations, not add or remove
methods. This is hard to enforce without tooling and compounds in cost as the number of shimmed
APIs grows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What makes this better than reflection&lt;/strong&gt;: Shim implementations are compiled. API removal in a
shim generates a compile error, not a runtime exception. IDE tooling works normally inside shims.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What makes this incomplete&lt;/strong&gt;: Main plugin code is still compiled against one SDK only. An API
removed in a newer IJ generates no compile error unless it is already behind a shim.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/HaxeFoundation/intellij-haxe&quot;&gt;HaxeFoundation/intellij-haxe&lt;/a&gt; plugin tried
this pattern and eventually dropped it: &lt;em&gt;“maintaining a code base that can compile to multiple
versions is a lot of work.”&lt;/em&gt; They now track only the latest IJ version. The discipline required
is real, and for a small team it compounds quickly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict&lt;/strong&gt;: Right tool for specific known API divergences. Not a whole-project strategy.
Apply reactively when Option 7 surfaces a specific API break.&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;option-5-template-build-scripts&quot;&gt;Option 5: Template Build Scripts&lt;/h3&gt;

&lt;p&gt;Each IJ version gets a fully self-contained &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build.gradle.kts&lt;/code&gt; with all version constants inlined.
No convention plugins. No shared &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;buildSrc&lt;/code&gt; task classes. Each file can be read top-to-bottom
and the entire build is understandable.&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;// ── Version declarations ──────────────────────────────────────────────&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;ijMajor&lt;/span&gt;      &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;253&lt;/span&gt;                 &lt;span class=&quot;c1&quot;&gt;// IJ major build number&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;ijSdk&lt;/span&gt;        &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;253-EAP-SNAPSHOT&quot;&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;// SDK artifact version&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;kotlinLang&lt;/span&gt;   &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;2.2&quot;&lt;/span&gt;               &lt;span class=&quot;c1&quot;&gt;// Kotlin language/API level&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;needsNightly&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;false&lt;/span&gt;               &lt;span class=&quot;c1&quot;&gt;// true = requires JetBrains VPN&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A configuration-time assertion catches obvious mismatches before SDK downloads:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nf&quot;&gt;run&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;toMajorMinor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Pair&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;Int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;.&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getOrNull&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;?.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;toIntOrNull&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;?:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getOrNull&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;?.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;toIntOrNull&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;?:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;kgpVersion&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;extensions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;getByType&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;KotlinJvmProjectExtension&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;coreLibrariesVersion&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;pMaj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;pMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kgpVersion&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;toMajorMinor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;lMaj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;lMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kotlinLang&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;toMajorMinor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;check&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pMaj&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lMaj&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pMaj&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lMaj&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pMin&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;[ij-$ijMajor] KGP $kgpVersion is older than declared language level $kotlinLang. &quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;+&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;Raise kotlinJvmPluginVersion in gradle.properties.&quot;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Adding a new version: copy the file, change the three constants. No convention-plugin archaeology.
New contributors can understand the entire build by reading one file.&lt;/p&gt;

&lt;p&gt;This is really an organizational principle for Option 1 — or for the single-invocation CI
pattern — rather than a standalone multi-version strategy. But it is worth naming explicitly because getting the
build organization right makes all other options more maintainable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict&lt;/strong&gt;: Essential organizational discipline. Use inside whichever multi-version strategy
you choose. Works especially well combined with Option 1.&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;option-6-simplify-intellijs-api-surface-long-term&quot;&gt;Option 6: Simplify IntelliJ’s API Surface (Long-term)&lt;/h3&gt;

&lt;p&gt;For MCP Steroid, the IJ plugin ultimately does two things:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Execute code: run a Kotlin snippet against the IDE’s JVM at the AI Agent’s request&lt;/li&gt;
  &lt;li&gt;Show a confirmation dialogue when the Agent wants to do something irreversible&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If these two APIs were built into IntelliJ Platform itself, the plugin would become trivial.
All real logic moves to the CLI side. The CLI deals with port discovery, plugin updates, and
versioning. The plugin just exposes two endpoints.&lt;/p&gt;

&lt;p&gt;This is the direction MCP Steroid is already moving. The CLI gets richer; the plugin gets simpler.
Every feature that migrates from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ij-plugin/&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlin-cli/&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mcp-core/&lt;/code&gt; removes one class
of API compatibility problem.&lt;/p&gt;

&lt;p&gt;The platform’s RPC layer (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;com.intellij.platform.rpc&lt;/code&gt;) became public in 2025.3 and 2026.1, but
it serves the IDE-internal frontend/backend split for remote development — not external CLI
delegation. There is no announced timeline for a public &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;POST /execute&lt;/code&gt; API, and nothing of the kind
is expected before IJ 2026.2, which is roughly a year away.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Practical implication&lt;/strong&gt;: Architect for this direction now. Every refactoring that moves logic
out of the IJ plugin is a step in the right direction, regardless of whether Option 6 itself ships.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict&lt;/strong&gt;: Correct long-term direction. Not a solution for the next 12 months.&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;option-7-docker-based-build-compatibility-tests&quot;&gt;Option 7: Docker-based Build Compatibility Tests&lt;/h3&gt;

&lt;p&gt;Keep the main build simple — one version per Gradle invocation, selected by a property. Add a
separate &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test-integration&lt;/code&gt; module with JUnit 5 tests that run
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./gradlew buildPlugin -Pmcp.platform.version=$version&lt;/code&gt; inside a Docker container per target IJ version.&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nd&quot;&gt;@Test&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;`build&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;plugin&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;IntelliJ&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2025_3&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;buildPluginWithVersion&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;2025.3&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@Test&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;`build&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;plugin&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;IntelliJ&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2026_1&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;buildPluginWithVersion&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;2026.1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@Test&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;`build&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;plugin&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;IntelliJ&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2026_2&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;EAP`&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;buildPluginWithVersion&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;262-EAP-SNAPSHOT&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;EAP versions use the &lt;strong&gt;3-digit build-number&lt;/strong&gt; format (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;262-EAP-SNAPSHOT&lt;/code&gt;, not &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2026.2-EAP-SNAPSHOT&lt;/code&gt;),
resolved from the public &lt;a href=&quot;https://github.com/JetBrains/intellij-platform-gradle-plugin/blob/12b993e2a56a66c6fdde72deb0bebb02a1635622/src/main/kotlin/org/jetbrains/intellij/platform/gradle/Constants.kt#L252&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;snapshots()&lt;/code&gt;&lt;/a&gt;
Maven repo (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;https://www.jetbrains.com/intellij-repository/snapshots&lt;/code&gt;) via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;useInstaller = false&lt;/code&gt;.
Nightly builds (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;262-SNAPSHOT&lt;/code&gt;, or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LATEST-TRUNK-SNAPSHOT&lt;/code&gt; for the rolling trunk tip) require the
&lt;a href=&quot;https://github.com/JetBrains/intellij-platform-gradle-plugin/blob/12b993e2a56a66c6fdde72deb0bebb02a1635622/src/main/kotlin/org/jetbrains/intellij/platform/gradle/Constants.kt#L250&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nightly()&lt;/code&gt;&lt;/a&gt;
repo (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;https://www.jetbrains.com/intellij-repository/nightly&lt;/code&gt;) and IJPGP ≥ 2.14.0.
Released versions (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;2025.3&quot;&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;2026.1&quot;&lt;/code&gt;) use the default installer mode.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LATEST-TRUNK-SNAPSHOT&lt;/code&gt; is particularly valuable: it automatically follows the current trunk
regardless of major version — when 262 ships and 263 trunk starts, it tracks 263. This gives
1–3 months of early warning before API breaks appear in a numbered EAP.&lt;/p&gt;

&lt;p&gt;Each test:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Updates a bare git clone of the current repo into a cached workspace via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git fetch&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Syncs &lt;strong&gt;uncommitted&lt;/strong&gt; local changes via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rsync --delete&lt;/code&gt; — so local work in progress is
tested, not just what is committed&lt;/li&gt;
  &lt;li&gt;Runs &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./gradlew buildPlugin -Pmcp.platform.version=X&lt;/code&gt; inside the container&lt;/li&gt;
  &lt;li&gt;Fails the JUnit test if the build fails&lt;/li&gt;
&lt;/ol&gt;
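
&lt;p&gt;Steps 2 to 4 could be sketched roughly as follows. This is a minimal illustration, not the project’s actual helper: the Docker image, the container mount points, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rsync&lt;/code&gt; excludes, and the 60-minute timeout are all assumptions, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git fetch&lt;/code&gt; step is left out for brevity.&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import java.io.File
import java.util.concurrent.TimeUnit

// Hypothetical sketch: image, mount points, excludes, and timeout are assumptions.
fun buildPluginWithVersion(version: String, repoRoot: File = File(&quot;.&quot;).canonicalFile) {
    val cacheRoot = File(repoRoot, &quot;build/build-compat&quot;)
    val workspace = File(cacheRoot, &quot;workspace/$version&quot;).apply { mkdirs() }
    val gradleHome = File(cacheRoot, &quot;gradle-cache&quot;).apply { mkdirs() }

    // Step 2: mirror the working tree, uncommitted changes included.
    // Excluded directories are neither copied nor deleted, so Gradle state survives.
    exec(repoRoot, &quot;rsync&quot;, &quot;-a&quot;, &quot;--delete&quot;,
        &quot;--exclude&quot;, &quot;build/&quot;, &quot;--exclude&quot;, &quot;.gradle/&quot;,
        &quot;${repoRoot.path}/&quot;, &quot;${workspace.path}/&quot;)

    // Step 3: run the build for one platform version inside a container.
    exec(workspace, &quot;docker&quot;, &quot;run&quot;, &quot;--rm&quot;,
        &quot;-v&quot;, &quot;${workspace.path}:/workspace&quot;,
        &quot;-v&quot;, &quot;${gradleHome.path}:/gradle-home&quot;,
        &quot;-e&quot;, &quot;GRADLE_USER_HOME=/gradle-home&quot;,
        &quot;-w&quot;, &quot;/workspace&quot;,
        &quot;eclipse-temurin:21-jdk&quot;, // assumed image
        &quot;./gradlew&quot;, &quot;buildPlugin&quot;, &quot;-Pmcp.platform.version=$version&quot;)
}

// Step 4: a non-zero exit code or a timeout throws, which fails the JUnit test.
private fun exec(workDir: File, vararg command: String) {
    val process = ProcessBuilder(*command).directory(workDir).inheritIO().start()
    val finished = process.waitFor(60, TimeUnit.MINUTES)
    if (!finished) process.destroyForcibly()
    check(finished &amp;amp;&amp;amp; process.exitValue() == 0) { &quot;Failed: ${command.joinToString(&quot; &quot;)}&quot; }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Mounting the shared Gradle home into every container is what lets the SDK download happen once rather than once per version.&lt;/p&gt;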

&lt;h3 id=&quot;the-caching-layer-is-what-makes-this-practical&quot;&gt;The caching layer is what makes this practical&lt;/h3&gt;

&lt;p&gt;A cold Docker build downloads a full IJ SDK — roughly 3.3 GB extracted. Slow the first time.
Three-layer caching makes subsequent runs fast:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Cache layer&lt;/th&gt;
      &lt;th&gt;Host path&lt;/th&gt;
      &lt;th&gt;What is stored&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Bare git clone&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build/build-compat/repo-cache/&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Updated via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git fetch&lt;/code&gt;; fast local clone&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Per-version workspace&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build/build-compat/workspace/&amp;lt;version&amp;gt;/&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Gradle incremental state&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Shared Gradle home&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build/build-compat/gradle-cache/&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;IJ SDK JARs, dependency downloads&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;With warm caches, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rsync&lt;/code&gt; copies only changed files, Gradle runs incrementally, and the build
completes in minutes instead of the initial 10–20 minutes.&lt;/p&gt;

&lt;h3 id=&quot;why-this-fits-mcp-steroid-best&quot;&gt;Why this fits mcp-steroid best&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;The single &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ij-plugin/build.gradle.kts&lt;/code&gt; already uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-Pmcp.platform.version=2025.3&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;There is already a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test-integration&lt;/code&gt; module&lt;/li&gt;
  &lt;li&gt;Adding a new IJ version is one &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@Test&lt;/code&gt; method&lt;/li&gt;
  &lt;li&gt;The main build stays fast — &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./gradlew build&lt;/code&gt; is one version, no Docker overhead&lt;/li&gt;
  &lt;li&gt;API breaks surface as build failures in CI, at test time rather than local build time&lt;/li&gt;
  &lt;li&gt;Parallelism is at the JUnit level — multiple containers can run simultaneously&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The tradeoff&lt;/strong&gt;: Compile errors only appear when running integration tests, not in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./gradlew build&lt;/code&gt;.
For CI this is fine. For local development you rely on the IDE’s inspections and Plugin Verifier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key gap vs. Option 1&lt;/strong&gt;: The Kotlin compiler version is still global. If IJ 261 requires Kotlin 2.3 and
the build is configured for Kotlin 2.2, the Docker test catches it, but the main local build does not.
This is mitigated by the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;checkBundledKotlinCompatibility&lt;/code&gt; task and by explicitly pinning &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlinVersion&lt;/code&gt;
per target platform.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict&lt;/strong&gt;: Best fit for mcp-steroid today. Already implemented as a pending change.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;the-single-invocation-approach-what-jetbrains-uses-internally&quot;&gt;The Single-Invocation Approach: What JetBrains Uses Internally&lt;/h2&gt;

&lt;p&gt;Some JetBrains plugins use a simpler model: one Gradle invocation builds exactly one IJ version,
selected by a system property. CI runs N parallel jobs — one per version. This is probably the
most common internal approach at JetBrains for multi-version plugin builds.&lt;/p&gt;

&lt;h3 id=&quot;one-version-per-gradle-invocation&quot;&gt;One version per Gradle invocation&lt;/h3&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./gradlew build &lt;span class=&quot;nt&quot;&gt;-DplatformVersion&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;253
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A version config data class carries all version-specific coordinates:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kd&quot;&gt;data class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;PluginVersionConfig&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;          &lt;span class=&quot;c1&quot;&gt;// &quot;251&quot;, &quot;252&quot;, &quot;253&quot;, &quot;261&quot;&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;ideaVersion&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// e.g. &quot;253.27604&quot; — exact build number&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;kotlinVersion&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// ... other SDK/product version fields ...&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;No symlinks are needed: only one version is built per Gradle invocation, so IDEA sees only one
set of source roots at a time.&lt;/p&gt;
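
&lt;p&gt;One way the selected config might be resolved from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-DplatformVersion&lt;/code&gt; flag is sketched here. The registry list and its entries are hypothetical; the build numbers and Kotlin versions are placeholders, not real coordinates, and the data class is cut down to three fields.&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Reduced version of the config class; the remaining fields are omitted.
data class PluginVersionConfig(val id: String, val ideaVersion: String, val kotlinVersion: String)

// Placeholder entries for illustration only.
val supportedVersions = listOf(
    PluginVersionConfig(id = &quot;253&quot;, ideaVersion = &quot;253.27604&quot;, kotlinVersion = &quot;2.2.0&quot;),
    PluginVersionConfig(id = &quot;261&quot;, ideaVersion = &quot;261.0&quot;, kotlinVersion = &quot;2.3.0&quot;),
)

// Resolves the config selected on the command line and fails fast on unknown ids.
fun getCurrentVersion(): PluginVersionConfig {
    val id = System.getProperty(&quot;platformVersion&quot;)
        ?: error(&quot;Pass -DplatformVersion, e.g. -DplatformVersion=253&quot;)
    return supportedVersions.firstOrNull { it.id == id }
        ?: error(&quot;Unsupported IDE version: $id (known: ${supportedVersions.map { it.id }})&quot;)
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;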

&lt;h3 id=&quot;source-directory-selection&quot;&gt;Source directory selection&lt;/h3&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;SourceDirectorySet&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;versionedSrcDirs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;basePath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getCurrentVersion&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;253&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;nf&quot;&gt;srcDir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src-252-since/$basePath&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;nf&quot;&gt;srcDir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src-253-since/$basePath&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;nf&quot;&gt;srcDir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src-253/$basePath&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;nf&quot;&gt;srcDir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src-253-until/$basePath&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;s&quot;&gt;&quot;261&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;nf&quot;&gt;srcDir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src-252-since/$basePath&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;nf&quot;&gt;srcDir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src-253-since/$basePath&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;nf&quot;&gt;srcDir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src-261-since/$basePath&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;nf&quot;&gt;srcDir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;src-261/$basePath&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Unsupported IDE version: ${getCurrentVersion().id}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;how-it-compares-to-the-multi-subproject-poc&quot;&gt;How it compares to the multi-subproject PoC&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Aspect&lt;/th&gt;
      &lt;th&gt;Single-invocation approach&lt;/th&gt;
      &lt;th&gt;Multi-subproject PoC&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Versions per Gradle invocation&lt;/td&gt;
      &lt;td&gt;1 (flag-selected)&lt;/td&gt;
      &lt;td&gt;All (one subproject each)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Symlinks needed&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;No&lt;/strong&gt; — single invocation sees one source set&lt;/td&gt;
      &lt;td&gt;Yes — IDEA needs unique paths per module&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Source dir matching&lt;/td&gt;
      &lt;td&gt;Explicit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;when&lt;/code&gt; switch&lt;/td&gt;
      &lt;td&gt;Regex &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;src-(\d{3})(?:-(since|until))?&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Version data&lt;/td&gt;
      &lt;td&gt;20+ fields per version&lt;/td&gt;
      &lt;td&gt;3 fields: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ijMajor&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ijSdk&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlinLang&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;New version effort&lt;/td&gt;
      &lt;td&gt;Add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;when&lt;/code&gt; branch + version config object&lt;/td&gt;
      &lt;td&gt;Copy build file, change 3 constants&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Single-command full matrix&lt;/td&gt;
      &lt;td&gt;Requires N parallel CI jobs&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./gradlew build&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
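
&lt;p&gt;The regex-based matching in the PoC column can be sketched as a small filter. The function name is hypothetical, and the semantics are inferred from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;when&lt;/code&gt; switch shown earlier: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;since&lt;/code&gt; is read as “this version and newer”, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;until&lt;/code&gt; as “this version and older”, and no suffix as an exact match.&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Same pattern as in the table: a 3-digit major plus an optional since/until suffix.
val versionedDir = Regex(&quot;&quot;&quot;src-(\d{3})(?:-(since|until))?&quot;&quot;&quot;)

// Hypothetical helper: keeps the directory names that apply to the current major.
fun selectSourceDirs(dirNames: List&amp;lt;String&amp;gt;, current: Int): List&amp;lt;String&amp;gt; =
    dirNames.filter { name -&amp;gt;
        val match = versionedDir.matchEntire(name) ?: return@filter false
        val declared = match.groupValues[1].toInt()
        when (match.groupValues[2]) {          // empty string when there is no suffix
            &quot;since&quot; -&amp;gt; current &amp;gt;= declared
            &quot;until&quot; -&amp;gt; current &amp;lt;= declared
            else -&amp;gt; current == declared
        }
    }

fun main() {
    val dirs = listOf(&quot;src-252-since&quot;, &quot;src-253-since&quot;, &quot;src-253&quot;,
        &quot;src-253-until&quot;, &quot;src-261-since&quot;, &quot;src-261&quot;)
    println(selectSourceDirs(dirs, 253))
    println(selectSourceDirs(dirs, 261))
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For 253 and 261 this selects the same directories as the two explicit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;when&lt;/code&gt; branches, which is the point of the regex: a new version needs no new branch.&lt;/p&gt;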

&lt;p&gt;The intellij-rust plugin used a similar per-version properties file pattern:
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gradle-252.properties&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gradle-253.properties&lt;/code&gt;, etc., with CI passing
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-PplatformVersion=$&lt;/code&gt; to select among them.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;build-time-verification-classfile-reference-checking&quot;&gt;Build-time Verification: ClassFile Reference Checking&lt;/h2&gt;

&lt;p&gt;Beyond compilation against the right SDK, there is a verification gap: do all the classes, methods,
and fields your plugin references actually exist on the IJ classpath at runtime?&lt;/p&gt;

&lt;p&gt;The PoC implements a three-tier answer to this, built around a standalone &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:check-class-refs&lt;/code&gt;
Gradle application module that uses ByteBuddy’s shaded ASM to inspect compiled bytecode.
Each tier catches a different category of API change.&lt;/p&gt;

&lt;h3 id=&quot;tier-1--verifyclassrefs-erased-descriptor-existence-check&quot;&gt;Tier 1 — &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verifyClassRefs&lt;/code&gt;: erased-descriptor existence check&lt;/h3&gt;

&lt;p&gt;Walks every &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.class&lt;/code&gt; file in the compiled plugin. For each method call site, extracts the
&lt;strong&gt;owner class + method name + full JVM descriptor&lt;/strong&gt; (which encodes the erased return type and
parameter types). Checks that this exact &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(owner, name, descriptor)&lt;/code&gt; exists somewhere in the
owner class’s hierarchy on the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compileClasspath&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This catches:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Class removed&lt;/strong&gt;: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;com.intellij.MissingClass&lt;/code&gt; not on classpath → &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NoClassDefFoundError&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Return type changed to a different class&lt;/strong&gt;: method &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;getFrame()&lt;/code&gt; changed from returning
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;com.intellij.openapi.util.Pair&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlin.Pair&lt;/code&gt; — both classes exist, but the JVM
descriptor changed from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;()Lcom/intellij/openapi/util/Pair;&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;()Lkotlin/Pair;&lt;/code&gt;, so
the call site’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INVOKEVIRTUAL&lt;/code&gt; finds no matching method → &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NoSuchMethodError&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Parameter type changed&lt;/strong&gt;: same mechanism, different position in the descriptor&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When a method is missing but a same-named method with a different descriptor is found, the
report shows the alternative:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[verifyClassRefs-261] 1 method(s) not found (descriptor mismatch or removed):
  - com.intellij.openapi.wm.WindowManager#getAllProjectFrames ()[Lcom/intellij/openapi/wm/IdeFrame;
      ↳ &apos;getAllProjectFrames&apos; exists with different descriptor: ()Ljava/util/List;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
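
&lt;p&gt;The core of such a check can be sketched with the ASM visitor API that ByteBuddy ships under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;net.bytebuddy.jar.asm&lt;/code&gt;. This is an illustration of the technique, not the PoC code: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MethodRef&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;collectMethodRefs&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;findMissing&lt;/code&gt; are hypothetical names, the caller is assumed to supply the index of declared methods, and the class-hierarchy walk is omitted.&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import net.bytebuddy.jar.asm.ClassReader
import net.bytebuddy.jar.asm.ClassVisitor
import net.bytebuddy.jar.asm.MethodVisitor
import net.bytebuddy.jar.asm.Opcodes
import java.io.File

// Hypothetical type: one call site target as it appears in bytecode.
data class MethodRef(val owner: String, val name: String, val descriptor: String)

// Collects every (owner, name, descriptor) triple referenced by call sites in one class file.
fun collectMethodRefs(classFile: File): Set&amp;lt;MethodRef&amp;gt; {
    val refs = linkedSetOf&amp;lt;MethodRef&amp;gt;()
    val visitor = object : ClassVisitor(Opcodes.ASM9) {
        override fun visitMethod(
            access: Int, name: String?, descriptor: String?,
            signature: String?, exceptions: Array&amp;lt;out String&amp;gt;?,
        ): MethodVisitor = object : MethodVisitor(Opcodes.ASM9) {
            override fun visitMethodInsn(
                opcode: Int, owner: String, name: String,
                descriptor: String, isInterface: Boolean,
            ) {
                refs += MethodRef(owner, name, descriptor)
            }
        }
    }
    ClassReader(classFile.readBytes()).accept(visitor, ClassReader.SKIP_DEBUG or ClassReader.SKIP_FRAMES)
    return refs
}

// Reports references with no exact match. `declared` maps an internal class name to its
// declared methods as name + descriptor strings; superclasses are not consulted here.
fun findMissing(refs: Set&amp;lt;MethodRef&amp;gt;, declared: Map&amp;lt;String, Set&amp;lt;String&amp;gt;&amp;gt;): List&amp;lt;MethodRef&amp;gt; =
    refs.filter { ref -&amp;gt; declared[ref.owner]?.contains(ref.name + ref.descriptor) != true }
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Matching on the full descriptor rather than the name alone is what turns a changed return type into a reported miss.&lt;/p&gt;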

&lt;h3 id=&quot;tier-2--verifyapisignatures-generic-signature-comparison&quot;&gt;Tier 2 — &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verifyApiSignatures&lt;/code&gt;: generic signature comparison&lt;/h3&gt;

&lt;p&gt;The erased-descriptor check has a blind spot. Java/Kotlin generics are &lt;strong&gt;erased&lt;/strong&gt; at the bytecode
level: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;List&amp;lt;OldType&amp;gt;&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;List&amp;lt;NewType&amp;gt;&lt;/code&gt; both compile to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Ljava/util/List;&lt;/code&gt;. The JVM links
the call successfully in both cases, but the caller may receive elements of an unexpected type
and hit a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ClassCastException&lt;/code&gt; later, at the use site.&lt;/p&gt;
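
&lt;p&gt;A minimal, self-contained sketch of that failure mode. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OldType&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;getItems()&lt;/code&gt; are
hypothetical stand-ins, not IntelliJ APIs, and the unchecked cast plays the role of a call site
that was compiled against the old generic signature:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Hypothetical stand-ins; none of these names are IntelliJ APIs.
class OldType(val name: String)

// Simulates the library after the change: the erased return type is still
// List, but the elements are no longer OldType.
fun getItems(): List&amp;lt;Any&amp;gt; = listOf(&quot;a String now&quot;)

fun main() {
    // Simulates a caller compiled against the old List&amp;lt;OldType&amp;gt; signature.
    // The cast is unchecked, so it succeeds at runtime.
    @Suppress(&quot;UNCHECKED_CAST&quot;)
    val items = getItems() as List&amp;lt;OldType&amp;gt;

    try {
        // The compiler-inserted CHECKCAST to OldType runs here, not at the call.
        val first: OldType = items[0]
        println(first.name)
    } catch (e: ClassCastException) {
        println(&quot;ClassCastException at the use site&quot;)
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The call to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;getItems()&lt;/code&gt; itself never fails; the exception surfaces only where an element is
used as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OldType&lt;/code&gt;, which is why link-time checks cannot see it.&lt;/p&gt;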

&lt;p&gt;&lt;strong&gt;Real-world case — &lt;a href=&quot;https://github.com/JetBrains/mcp-steroid/issues/18&quot;&gt;mcp-steroid#18&lt;/a&gt;:&lt;/strong&gt;
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StatusBarEx.getBackgroundProcessModels()&lt;/code&gt; changed its return type from
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;List&amp;lt;com.intellij.openapi.util.Pair&amp;lt;TaskInfo, ProgressModel&amp;gt;&amp;gt;&lt;/code&gt; to
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;List&amp;lt;kotlin.Pair&amp;lt;TaskInfo, ProgressModel&amp;gt;&amp;gt;&lt;/code&gt; between IJ 261 and IJ 262.
Both types erase to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;()Ljava/util/List;&lt;/code&gt; — identical JVM descriptor. The plugin compiled,
linked, and passed every standard check. At runtime on IJ 262, iterating the list and
accessing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.first&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.second&lt;/code&gt; on elements assumed to be &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;c.i.o.u.Pair&lt;/code&gt; threw a
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ClassCastException&lt;/code&gt;. &lt;strong&gt;Plugin Verifier does not catch this&lt;/strong&gt; (confirmed from source at commit &lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/tree/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1d14a2b&lt;/code&gt;&lt;/a&gt;: it
resolves methods by erased descriptor only; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Signature&lt;/code&gt; attributes are parsed for
display in error messages but never compared for compatibility).&lt;/p&gt;

&lt;p&gt;The JVM class file format stores the un-erased generic type in a separate &lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Signature&lt;/code&gt;
attribute&lt;/strong&gt; alongside the descriptor. This attribute is what the Kotlin compiler uses to
reconstruct generic types for type checking, but the JVM itself ignores it for method
dispatch.&lt;/p&gt;
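
&lt;p&gt;The same split is visible through plain JDK reflection, which exposes both the erased type
from the descriptor and the generic type reconstructed from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Signature&lt;/code&gt; attribute. A small
sketch, using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;java.util.Collections#emptyList&lt;/code&gt; purely as an illustrative example (this is not
how the verification tasks read class files):&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import java.util.Collections

fun main() {
    val method = Collections::class.java.getMethod(&quot;emptyList&quot;)

    // Erased view, taken from the method descriptor ()Ljava/util/List;
    println(method.returnType)

    // Generic view, reconstructed from the Signature attribute
    println(method.genericReturnType)
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;On a standard JDK the first line prints the erased &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;interface java.util.List&lt;/code&gt;, the second the
generic &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;java.util.List&amp;lt;T&amp;gt;&lt;/code&gt;. Only the first participates in method resolution.&lt;/p&gt;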

&lt;p&gt;The solution is a two-phase workflow:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Capture&lt;/strong&gt; (run once against the base IJ version, commit the result):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./gradlew :ij-plugin:ij-253-k22:captureApiSignatures
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For every method and field call site in the plugin, this looks up the member in the IJ 253
SDK and records its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Signature&lt;/code&gt; attribute into &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ij-plugin/api-signatures.txt&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;METHOD  com.intellij.util.containers.ContainerUtil#map  (Ljava/util/Collection;...)Ljava/util/List;  (Ljava/util/Collection&amp;lt;TT;&amp;gt;;...)Ljava/util/List&amp;lt;TR;&amp;gt;;
FIELD   com.intellij.openapi.util.Pair#first             Ljava/lang/Object;                           NONE
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NONE&lt;/code&gt; means the member has no &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Signature&lt;/code&gt; attribute: its declared type is raw, primitive, or otherwise non-generic, so there is nothing to compare.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verify&lt;/strong&gt; (runs on every &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./gradlew check&lt;/code&gt; for all versions):&lt;/p&gt;

&lt;p&gt;For each entry in the snapshot, the task looks up the same &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(owner, name, erased-descriptor)&lt;/code&gt; in the
current SDK and compares &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Signature&lt;/code&gt; attributes. A change that passes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verifyClassRefs&lt;/code&gt;
(same erased descriptor) but alters a generic type argument is caught here:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[verifyApiSignatures-261] GENERIC SIGNATURE CHANGED:
  METHOD com.intellij.Foo#getItems ()Ljava/util/List;
    was : ()Ljava/util/List&amp;lt;Lcom/example/OldType;&amp;gt;;
    now : ()Ljava/util/List&amp;lt;Ljava/lang/String;&amp;gt;;
    → METHOD com.intellij.Foo#getItems ()Ljava/util/List;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The last line is the ready-to-paste entry for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;signature-exceptions.txt&lt;/code&gt; if you have
verified the change is safe at your call sites. Known-safe changes accumulate there over time;
everything else is a build failure.&lt;/p&gt;
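
&lt;p&gt;The comparison itself is small. A rough sketch of the idea, assuming the snapshot and the
current SDK have already been parsed into maps; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MemberKey&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;changedSignatures&lt;/code&gt; are
hypothetical names for illustration, not the actual implementation:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Hypothetical model of one snapshot line; the real tool may use other types.
data class MemberKey(val owner: String, val name: String, val descriptor: String)

// Returns a description for every member whose generic signature differs.
// A null signature models the NONE marker. Members absent from `current`
// are skipped here, on the assumption that they are reported separately.
fun changedSignatures(
    snapshot: Map&amp;lt;MemberKey, String?&amp;gt;,
    current: Map&amp;lt;MemberKey, String?&amp;gt;,
    exceptions: Set&amp;lt;MemberKey&amp;gt; = emptySet(),
): List&amp;lt;String&amp;gt; = snapshot
    .filter { (key, was) -&amp;gt; key !in exceptions &amp;amp;&amp;amp; key in current &amp;amp;&amp;amp; current[key] != was }
    .map { (key, was) -&amp;gt; &quot;${key.owner}#${key.name} ${key.descriptor}: was $was, now ${current[key]}&quot; }

fun main() {
    val key = MemberKey(&quot;com.intellij.Foo&quot;, &quot;getItems&quot;, &quot;()Ljava/util/List;&quot;)
    val snapshot = mapOf(key to &quot;()Ljava/util/List&amp;lt;Lcom/example/OldType;&amp;gt;;&quot;)
    val current = mapOf(key to &quot;()Ljava/util/List&amp;lt;Ljava/lang/String;&amp;gt;;&quot;)
    changedSignatures(snapshot, current).forEach(::println)
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Keying on the erased descriptor keeps the two tiers independent: a member that no longer
resolves is a Tier 1 problem, and only members that still resolve get their generic signatures
compared.&lt;/p&gt;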

&lt;p&gt;The Gradle wiring in each version subproject:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;// ij-NNN-kXY/build.gradle.kts&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;snapshotFile&lt;/span&gt;   &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;projectDir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;resolve&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;api-signatures.txt&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;exceptionsFile&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parent&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;projectDir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;resolve&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;signature-exceptions.txt&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;compileCP&lt;/span&gt;      &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;configurations&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;named&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;compileClasspath&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;verifyApiSignatures&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;by&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tasks&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;registering&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;JavaExec&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;reportOut&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;layout&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buildDirectory&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;reports/verify-api-signatures-$ijMajor.txt&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;outputs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reportOut&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;argumentProviders&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;CommandLineArgumentProvider&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;nf&quot;&gt;buildList&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;--mode&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;verify&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compileCP&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;forEach&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;--classpath&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;absolutePath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
            &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;--snapshot&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;   &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;snapshotFile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;absolutePath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exceptionsFile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;--exceptions&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exceptionsFile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;absolutePath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
            &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;--report&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reportOut&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;asFile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;absolutePath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// Skip verification until captureApiSignatures has produced a snapshot&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;onlyIf&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;snapshotFile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;tasks&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;named&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;check&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;dependsOn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;verifyClassRefs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;verifyApiSignatures&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;What this still cannot catch&lt;/strong&gt;: a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Signature&lt;/code&gt; attribute that was &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;null&lt;/code&gt; in both SDK versions
(method returning a raw &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;List&lt;/code&gt; with no generics at all). There is nothing to compare. This is
not a regression — the call was already unchecked by the compiler. The only tool that would
catch such type changes is recompilation against the new SDK, which is exactly what the
multi-subproject build (Option 1) provides.&lt;/p&gt;

&lt;h3 id=&quot;tier-3--plugin-verifier-full-jls-binary-compatibility&quot;&gt;Tier 3 — Plugin Verifier: full JLS binary compatibility&lt;/h3&gt;

&lt;p&gt;Plugin Verifier (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verifyPlugin&lt;/code&gt; task from IJPGP) is the most thorough check. It verifies
binary compatibility in the sense of the Java Language Specification: access modifiers, method
overrides, interface contracts, and deprecated or removed members.&lt;/p&gt;

&lt;p&gt;It covers cases the bytecode-ASM tiers do not:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Access narrowed&lt;/strong&gt;: a method changed from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;public&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;protected&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;private&lt;/code&gt;. The
erased descriptor is identical and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Signature&lt;/code&gt; attribute unchanged, but the JVM throws
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IllegalAccessError&lt;/code&gt; at the call site&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Abstract method added to an interface your plugin implements&lt;/strong&gt;: the plugin class is now
missing a required implementation — &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AbstractMethodError&lt;/code&gt; at runtime&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@ApiStatus.ScheduledForRemoval&lt;/code&gt; warnings&lt;/strong&gt;: flags APIs your plugin uses that will be
removed in a future IJ version, before they actually disappear&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What Plugin Verifier cannot catch — confirmed from source&lt;/strong&gt; (cloned at commit &lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/tree/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1d14a2b&lt;/code&gt;&lt;/a&gt;):&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/blob/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5/intellij-plugin-verifier/verifier-core/src/main/java/com/jetbrains/pluginverifier/verifiers/resolution/MethodResolver.kt#L160&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MethodResolver.kt&lt;/code&gt;&lt;/a&gt; resolves every method call using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;name + erased descriptor&lt;/code&gt; only — the
matching predicate at every call site is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;it.name == methodName &amp;amp;&amp;amp; it.descriptor == methodDescriptor&lt;/code&gt;
(lines &lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/blob/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5/intellij-plugin-verifier/verifier-core/src/main/java/com/jetbrains/pluginverifier/verifiers/resolution/MethodResolver.kt#L160&quot;&gt;160&lt;/a&gt;, &lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/blob/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5/intellij-plugin-verifier/verifier-core/src/main/java/com/jetbrains/pluginverifier/verifiers/resolution/MethodResolver.kt#L172&quot;&gt;172&lt;/a&gt;, &lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/blob/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5/intellij-plugin-verifier/verifier-core/src/main/java/com/jetbrains/pluginverifier/verifiers/resolution/MethodResolver.kt#L184&quot;&gt;184&lt;/a&gt;, &lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/blob/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5/intellij-plugin-verifier/verifier-core/src/main/java/com/jetbrains/pluginverifier/verifiers/resolution/MethodResolver.kt#L216&quot;&gt;216&lt;/a&gt;, &lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/blob/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5/intellij-plugin-verifier/verifier-core/src/main/java/com/jetbrains/pluginverifier/verifiers/resolution/MethodResolver.kt#L381&quot;&gt;381&lt;/a&gt;, &lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/blob/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5/intellij-plugin-verifier/verifier-core/src/main/java/com/jetbrains/pluginverifier/verifiers/resolution/MethodResolver.kt#L391&quot;&gt;391&lt;/a&gt;). The JVM &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Signature&lt;/code&gt; attribute (which
carries the un-erased generic types) is exposed on the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Method&lt;/code&gt; interface but &lt;strong&gt;only ever read
in &lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/blob/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5/intellij-plugin-verifier/verifier-core/src/main/java/com/jetbrains/pluginverifier/results/presentation/LocationsPresentation.kt#L143&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LocationsPresentation.kt&lt;/code&gt;&lt;/a&gt;&lt;/strong&gt; — to produce human-readable error message text, never to detect
an incompatibility. &lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/blob/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5/intellij-plugin-verifier/verifier-core/src/main/java/com/jetbrains/pluginverifier/verifiers/instruction/TypeInstructionVerifier.kt&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TypeInstructionVerifier.kt&lt;/code&gt;&lt;/a&gt; handles &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CHECKCAST&lt;/code&gt; by resolving the target
class for existence only; there is no safety check on what the cast might actually receive.&lt;/p&gt;

&lt;p&gt;The companion &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ide-diff-builder&lt;/code&gt; tool (which computes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@ApiStatus.AvailableSince&lt;/code&gt; changelogs
between two IDE builds) has the same gap: &lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/blob/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5/ide-diff-builder/src/main/java/org/jetbrains/ide/diff/builder/api/ApiDiffBuilder.kt#L48&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ApiDiffBuilder.kt&lt;/code&gt; line 48&lt;/a&gt; keys methods on
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;it.name + it.descriptor&lt;/code&gt; — erased descriptor only. &lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/blob/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5/ide-diff-builder/src/main/java/org/jetbrains/ide/diff/builder/api/ApiSignature.kt#L28&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ApiSignature.MethodSignature&lt;/code&gt;&lt;/a&gt; stores the
generic &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;signature: String?&lt;/code&gt; field, but no &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ApiEvent&lt;/code&gt; subclass for a signature change exists
(&lt;a href=&quot;https://github.com/JetBrains/intellij-plugin-verifier/blob/1d14a2b114a8d04bc61b2b4a642ad7d8e48d07a5/ide-diff-builder/src/main/java/org/jetbrains/ide/diff/builder/api/ApiEvent.kt&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ApiEvent.kt&lt;/code&gt;&lt;/a&gt;),
and no code ever compares the field between old and new SDK.&lt;/p&gt;

&lt;p&gt;Known pitfall with EAP builds: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pluginVerification { ides { recommended() } }&lt;/code&gt; silently resolves to zero IDEs
when &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sinceBuild&lt;/code&gt; targets an unreleased platform. IJPGP v2.12.0
(&lt;a href=&quot;https://github.com/JetBrains/intellij-platform-gradle-plugin/issues/2090&quot;&gt;issue #2090&lt;/a&gt;)
added a warning, but the check is still skipped. Always verify that the resolved IDE list is
non-empty when running against EAP builds.&lt;/p&gt;

&lt;h3 id=&quot;comparison&quot;&gt;Comparison&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verifyClassRefs&lt;/code&gt;&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verifyApiSignatures&lt;/code&gt;&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Plugin Verifier&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Class removed&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓ (via snapshot MISSING)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Method/field removed&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Return/param type changed (erased, e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Pair&lt;/code&gt;→&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlin.Pair&lt;/code&gt;)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Generic param changed (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;List&amp;lt;A&amp;gt;&lt;/code&gt;→&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;List&amp;lt;B&amp;gt;&lt;/code&gt;, same erased descriptor)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✗&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓ ← unique value&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✗ (uses erased descriptors only; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Signature&lt;/code&gt; attributes read for display, never compared)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Access modifier narrowed&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✗&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✗&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Abstract method added&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✗&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✗&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@ScheduledForRemoval&lt;/code&gt; warnings&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✗&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✗&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;✓&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Classpath source&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Already-resolved &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compileClasspath&lt;/code&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Already-resolved &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compileClasspath&lt;/code&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;local(intellijPlatform.platformPath.toFile())&lt;/code&gt; (no extra download) or downloads separate IDE&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Wired to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./gradlew check&lt;/code&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;No&lt;/strong&gt; (requires explicit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tasks.check { dependsOn(&quot;verifyPlugin&quot;) }&lt;/code&gt;)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Cold-start speed&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Seconds&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Seconds&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;~5–15 s with local SDK; 10–20 min if downloading&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Requires snapshot file&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;No&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Yes (run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;captureApiSignatures&lt;/code&gt; once against base SDK)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;No&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The intended usage: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verifyClassRefs&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verifyApiSignatures&lt;/code&gt; run on every build as a fast
first gate. Plugin Verifier runs in CI pre-release (or on demand) as the thorough audit.
A failure in the fast gate is a definite bug. A clean fast gate with a Plugin Verifier failure
means an access or override issue — the rarer category.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://github.com/JetBrains/mcp-steroid/issues/18#issuecomment-4231291065&quot;&gt;mcp-steroid#18&lt;/a&gt;&lt;/strong&gt; is the concrete case that motivated Tier 2: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;getBackgroundProcessModels()&lt;/code&gt; changing from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;List&amp;lt;c.i.o.u.Pair&amp;gt;&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;List&amp;lt;kotlin.Pair&amp;gt;&lt;/code&gt; passes both &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verifyClassRefs&lt;/code&gt; and Plugin Verifier (same erased descriptor &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;()Ljava/util/List;&lt;/code&gt;; cast target class still exists). &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verifyApiSignatures&lt;/code&gt; catches it by comparing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Signature&lt;/code&gt; attributes, a comparison that Plugin Verifier, per its source code, never performs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;build-issues-i-hit-along-the-way&quot;&gt;Build Issues I Hit Along the Way&lt;/h2&gt;

&lt;p&gt;These affect any IJ plugin on IJPGP 2.x with Gradle 9.4. Worth knowing before you hit them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Missing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mavenCentral()&lt;/code&gt; in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;repositories {}&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;intellijPlatform { defaultRepositories() }&lt;/code&gt; adds JetBrains repositories only. KGP 2.3.0’s
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlin-build-tools-compat&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlin-build-tools-impl&lt;/code&gt; live on Maven Central. Without it, the
build fails with a confusing message:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Could not resolve kotlin-build-tools-compat:2.3.0.
No repositories are defined for configuration &apos;:ij-253-k22:compileClasspath&apos;.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Fix: add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mavenCentral()&lt;/code&gt; at the outer &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;repositories {}&lt;/code&gt; level, not inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;intellijPlatform {}&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nf&quot;&gt;repositories&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;maven&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;url&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;uri&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;https://cache-redirector.jetbrains.com/maven-central&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;mavenCentral&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;intellijPlatform&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;nf&quot;&gt;defaultRepositories&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;nf&quot;&gt;snapshots&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;patchPluginXml&lt;/code&gt; → &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;linkSources&lt;/code&gt; dependency missing (Gradle 9.4 strict mode)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The IJ Platform Gradle Plugin’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;patchPluginXml&lt;/code&gt; resolves &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;plugin.xml&lt;/code&gt; from source roots that
are populated by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;linkSources&lt;/code&gt;. Gradle 9.4’s strict task-dependency validation rejects this undeclared dependency:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Task &apos;patchPluginXml&apos; uses output of task &apos;linkSources&apos; without declaring an explicit dependency.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Fix:&lt;/p&gt;
&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;tasks&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;matching&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;patchPluginXml&quot;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;configureEach&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;dependsOn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;linkSources&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlin.stdlib.default.dependency=false&lt;/code&gt; breaks filename-based stdlib detection&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This property prevents KGP from adding &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlin-stdlib-X.Y.Z.jar&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compileClasspath&lt;/code&gt; as a
standalone artifact. It is required for IJ plugins (otherwise your plugin ships its own stdlib next to
the one bundled with the IDE), but it breaks any task that finds the Kotlin version by searching &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compileClasspath&lt;/code&gt;
for a file matching &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlin-stdlib*.jar&lt;/code&gt;. The stdlib lives inside the bundled Kotlin plugin JARs
in the IJ SDK, not as a recognizable standalone JAR.&lt;/p&gt;

&lt;p&gt;Note: for standalone application modules that are NOT IJ plugins (like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;check-class-refs&lt;/code&gt;), you
need &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;implementation(kotlin(&quot;stdlib&quot;))&lt;/code&gt; explicitly, because this property applies globally and
those modules do need their own stdlib.&lt;/p&gt;
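
&lt;p&gt;A minimal sketch of such a module’s build script. The file location is illustrative, and it assumes the Kotlin JVM plugin version is supplied elsewhere in the build (for example by the root project):&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// check-class-refs/build.gradle.kts (assumed location; a plain JVM module, not an IJ plugin)
plugins {
    kotlin(&quot;jvm&quot;) // version assumed to be resolved by the root build
}

dependencies {
    // kotlin.stdlib.default.dependency=false is build-wide,
    // so this module has to request the stdlib explicitly.
    implementation(kotlin(&quot;stdlib&quot;))
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;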

&lt;hr /&gt;

&lt;h2 id=&quot;scoring-summary&quot;&gt;Scoring Summary&lt;/h2&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Approach&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Compile Safety&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Build Simplicity&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;CI Speed&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Dev Speed&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Maintenance&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;mcp-steroid fit&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Total&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;1 — Multi-subproject + symlinks&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;2&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;2&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;18&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2 — Branch per version&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;2&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;16&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3 — Reflection&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;2&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;1&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;19&lt;/strong&gt; ❌&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;4 — Shims / typed adapters&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;20&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;5 — Template build scripts&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;2&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;20&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;6 — HTTP API in IJ (long-term)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;2&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;27&lt;/strong&gt; ⏳&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;7 — Docker compat tests&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;3&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;5&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;3&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;5&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;5&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;5&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;26&lt;/strong&gt; ✅&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;single-invocation (JetBrains internal)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;&lt;strong&gt;23&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Each dimension is scored 1–5, and Total is the plain sum of the six scores. The dimensions: Compile-time safety (do API breakages surface at build time across all
supported versions?), Build simplicity (cognitive load to add a new IJ version), CI speed on warm
cache, Dev loop speed for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./gradlew build&lt;/code&gt;, Maintenance per new IJ major, and fit for
MCP Steroid’s current single-project structure.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;what-jetbrains-could-do-better&quot;&gt;What JetBrains Could Do Better&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;A prominent, versioned Kotlin compatibility matrix&lt;/strong&gt;: The information exists in IJPGP’s
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PlatformKotlinVersions&lt;/code&gt; map and on the
&lt;a href=&quot;https://plugins.jetbrains.com/docs/intellij/using-kotlin.html&quot;&gt;using-kotlin.html&lt;/a&gt; page, but
the map was missing 261 until IJPGP v2.12.0 — causing cryptic failures for everyone targeting
261 EAP. A prominently linked table that stays ahead of each EAP release would save many hours.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A multi-version build template&lt;/strong&gt;: The
&lt;a href=&quot;https://github.com/JetBrains/intellij-platform-plugin-template&quot;&gt;IntelliJ Platform Plugin Template&lt;/a&gt;
is excellent for single-version plugins. A companion template showing the single-invocation approach or the Docker compat-test approach
would let teams start from a working multi-version structure rather than improvising from
scattered blog posts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Plugin Verifier wired to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;check&lt;/code&gt; by default&lt;/strong&gt;: Plugin Verifier is the most thorough binary compat
check available, but it requires explicit invocation. The silent failure of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;recommended()&lt;/code&gt; against
unreleased platforms — where verification appears to pass but zero IDEs were checked — means
many plugins ship without ever running the tool. Making it a default-on gate (with the
resolved-IDEs warning visible) would catch more regressions earlier.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;where-to-start&quot;&gt;Where to Start&lt;/h2&gt;

&lt;p&gt;Different teams come to this problem from different starting points.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Building a new plugin from scratch&lt;/strong&gt;: Start with Option 7 — Docker build compatibility tests.
Zero migration cost, CI coverage across your target versions from day one. When tests surface
a specific API divergence, add an Option 4 shim for that API only. Use Option 5 template build
scripts to keep each version’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build.gradle.kts&lt;/code&gt; readable top-to-bottom.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Existing single-version plugin expanding to additional IDE versions&lt;/strong&gt;: Option 7 is the
lowest-friction path. The main build stays unchanged — you are adding CI visibility, not
restructuring the project. Option 4 shims come next, reactively, as Docker tests fail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Already on a multi-subproject setup&lt;/strong&gt;: The PoC documents
specific gaps worth watching — the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;checkVersionCoverage&lt;/code&gt; task for version range completeness,
the ByteBuddy class-ref verifier as a fast &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./gradlew check&lt;/code&gt; gate, and the Gradle 9.4
strict-mode dependency issues that affect any multi-project build upgrading to Gradle 9.x.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;what-i-am-doing-for-mcp-steroid&quot;&gt;What I Am Doing for MCP Steroid&lt;/h2&gt;

&lt;p&gt;Short-term:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Ship Option 7&lt;/strong&gt; — the Docker compat tests. Already implemented as a pending change. Zero
migration cost; the main build stays fast; IJ 2025.3, 2026.1, and 2026.2 EAP are covered.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Pin K2.3 for IJ 261 builds&lt;/strong&gt; — explicitly set &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlinVersion = &quot;2.3.20&quot;&lt;/code&gt; when
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-Pmcp.platform.version=2026.1&lt;/code&gt;. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;checkBundledKotlinCompatibility&lt;/code&gt; task catches mismatches.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Option 4 reactively&lt;/strong&gt; — if the Docker tests surface a specific API break, add a shim for that
API only. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;listProjectWindows&lt;/code&gt; is the first candidate. No pre-built shims for hypothetical
future breaks.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Keep moving logic to the CLI&lt;/strong&gt; — every feature migrated from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ij-plugin/&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlin-cli/&lt;/code&gt;
or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mcp-core/&lt;/code&gt; removes a class of version-compatibility problem. The plugin becomes thinner;
the version-sensitive surface shrinks. Option 6 is a year away, but the architectural direction
is correct and every step toward it pays off now.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;
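
&lt;p&gt;The Kotlin pin from item 2 could look roughly like this. It is a sketch, not the actual MCP Steroid build logic: only the property name, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2026.1&lt;/code&gt; value, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2.3.20&lt;/code&gt; pin come from the list above, and how &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlinVersion&lt;/code&gt; is then handed to the Kotlin Gradle plugin is left out.&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// build.gradle.kts (hypothetical placement)
// Fails fast if -Pmcp.platform.version is not passed.
val platformVersion: String = providers.gradleProperty(&quot;mcp.platform.version&quot;).get()

val kotlinVersion: String = when (platformVersion) {
    &quot;2026.1&quot; -&gt; &quot;2.3.20&quot;
    // Pins for other platform versions are deliberately not guessed here.
    else -&gt; error(&quot;No Kotlin pin defined for platform version $platformVersion&quot;)
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;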

&lt;hr /&gt;

&lt;p&gt;&lt;em&gt;The multi-version PoC is at &lt;a href=&quot;https://github.com/jonnyzzz/ij-multi-version&quot;&gt;github.com/jonnyzzz/ij-multi-version&lt;/a&gt;.
MCP Steroid is at &lt;a href=&quot;https://github.com/JetBrains/mcp-steroid&quot;&gt;github.com/JetBrains/mcp-steroid&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Building an IJ plugin that spans multiple major IDE versions? I hope this saves you some time.
Reach out on &lt;a href=&quot;https://www.linkedin.com/in/jonnyzzz/&quot;&gt;LinkedIn&lt;/a&gt; or &lt;a href=&quot;https://github.com/jonnyzzz&quot;&gt;GitHub&lt;/a&gt;
if something here needs correcting, if you need my help, or if you have found a better approach.&lt;/em&gt;&lt;/p&gt;</content>

  
  
  
  
  

    <author>
      <name>Eugene Petrenko</name>
    </author>

  
    <category term="intellij" />
  
    <category term="kotlin" />
  
    <category term="gradle" />
  
    <category term="plugin-development" />
  
    <category term="mcp-steroid" />
  
    <category term="kotlin-coroutines" />
  
    <category term="jvm" />
  
    <category term="bytebuddy" />
  
    <summary type="html">Building an IntelliJ plugin that works across IJ 2025.3 and 2026.1 is not a plugin.xml configuration problem. It is a Kotlin version problem, a classpath coupling problem, and a build architecture problem simultaneously. I explored seven approaches, built a PoC, and documented everything that went wrong.</summary>
  
  </entry>
  
  <entry>
    <title type="html">Booting NVIDIA Jetson AGX Thor: Headless, from macOS, via Serial Console</title>
    <link href="https://jonnyzzz.com/blog/2026/04/09/jetson-agx-thor-setup/" rel="alternate" type="text/html" title="Booting NVIDIA Jetson AGX Thor: Headless, from macOS, via Serial Console" />
    <published>2026-04-09T00:00:00+00:00</published>
    <updated>2026-04-09T00:00:00+00:00</updated>
    <id>/blog/2026/04/09/jetson-agx-thor-setup</id>
    <content type="html" xml:base="https://jonnyzzz.com/blog/2026/04/09/jetson-agx-thor-setup/">&lt;p&gt;A small, heavy box arrived at the desk. Inside: an NVIDIA Jetson AGX Thor Developer Kit.
No monitor. No keyboard. No display port handy. The task: flash JetPack 7.0, configure it
headlessly from macOS, and get Docker and Tailscale running. Simple enough on paper.&lt;/p&gt;

&lt;p&gt;It turns out there is one non-obvious failure mode that is not well documented. The OOBE
wizard — the first-boot configuration screen — does not appear on the serial port you expect.
NVIDIA’s own forum has a thread about it. After hitting the same wall, we worked through the
workaround. This post is the reference we wish had existed.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/posts/2026-04-09-jetson-agx-thor-setup.png&quot; alt=&quot;NVIDIA Jetson AGX Thor Developer Kit&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;the-hardware&quot;&gt;The Hardware&lt;/h2&gt;

&lt;p&gt;The Jetson AGX Thor Developer Kit carries the T5000 SoM with a Blackwell GPU — the same
architecture as the DGX Spark (GB10). The core specs that matter for AI workloads are
strikingly close between the two boxes:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Component&lt;/th&gt;
      &lt;th&gt;Jetson AGX Thor (T5000)&lt;/th&gt;
      &lt;th&gt;DGX Spark (GB10)&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;GPU architecture&lt;/td&gt;
      &lt;td&gt;Blackwell&lt;/td&gt;
      &lt;td&gt;Grace Blackwell&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Unified memory&lt;/td&gt;
      &lt;td&gt;128 GB LPDDR5x&lt;/td&gt;
      &lt;td&gt;128 GB LPDDR5x&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Memory bandwidth&lt;/td&gt;
      &lt;td&gt;276 GB/s&lt;/td&gt;
      &lt;td&gt;273 GB/s&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;AI performance&lt;/td&gt;
      &lt;td&gt;1,035 TOPS (FP8)&lt;/td&gt;
      &lt;td&gt;1,000 TOPS&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;CPU&lt;/td&gt;
      &lt;td&gt;14-core Arm Neoverse&lt;/td&gt;
      &lt;td&gt;20-core Arm (Cortex-X925/A725)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Power&lt;/td&gt;
      &lt;td&gt;40–130 W&lt;/td&gt;
      &lt;td&gt;~60 W&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Price&lt;/td&gt;
      &lt;td&gt;$3,499&lt;/td&gt;
      &lt;td&gt;$4,699&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Primary target&lt;/td&gt;
      &lt;td&gt;Edge / robotics / physical AI&lt;/td&gt;
      &lt;td&gt;Desktop LLM inference&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The 128 GB unified memory is the number that matters most for LLM inference — it determines
the maximum model size you can load. Both boxes hold the same amount, so they can run
similarly-sized models. Compute performance is also in the same range.&lt;/p&gt;

&lt;p&gt;Where they differ: the &lt;a href=&quot;https://docs.nvidia.com/dgx/dgx-spark/hardware.html&quot;&gt;DGX Spark&lt;/a&gt; has a stronger CPU subsystem and its software stack
(DGX OS rather than JetPack) is more tuned for large model serving out of the box. The Thor is built
for edge deployment — robotics, physical AI, sensor pipelines — with different peripherals
(GPIO, camera inputs, QSFP28 networking) and a wider power envelope (up to 130 W peak,
as low as 40 W idle). Think of the Spark as a desktop inference server; the Thor as a
compute node that ships inside a robot or a rack at the edge.&lt;/p&gt;

&lt;p&gt;For our purposes — running inference workloads on a local GPU node — the Thor is a
perfectly capable alternative to the Spark, and at $1,200 less.&lt;/p&gt;

&lt;p&gt;After the software setup below, the Thor we configured looks like this:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Component&lt;/th&gt;
      &lt;th&gt;Detail&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Device&lt;/td&gt;
      &lt;td&gt;NVIDIA Jetson AGX Thor Developer Kit (T5000 SoM)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;OS&lt;/td&gt;
      &lt;td&gt;Ubuntu 24.04.3 LTS&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Kernel&lt;/td&gt;
      &lt;td&gt;6.8.12-tegra&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Jetson Linux (L4T)&lt;/td&gt;
      &lt;td&gt;R38, REVISION 4.0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;NVIDIA Driver&lt;/td&gt;
      &lt;td&gt;580.00&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;CUDA&lt;/td&gt;
      &lt;td&gt;13.0&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The I/O side of the board has a row of ports. From left to right (roughly):
two USB-A 3.2 ports, USB-C 5a (recovery mode, PD Sink 140W), USB-C 5b (data, PD Sink 140W),
DisplayPort, HDMI, an RJ45 at 5 Gbps, and a QSFP28 quad-port at 25 Gbps each. Behind the
magnetic lid cover on the top of the device: the &lt;strong&gt;Debug-USB&lt;/strong&gt; port, the one with the serial
console.&lt;/p&gt;

&lt;p&gt;That lid cover is where the setup begins.&lt;/p&gt;

&lt;h2 id=&quot;prerequisites&quot;&gt;Prerequisites&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;USB flash drive, &lt;strong&gt;16 GB or larger&lt;/strong&gt; (we used a 61.5 GB SanDisk)&lt;/li&gt;
  &lt;li&gt;Mac with &lt;strong&gt;≥25 GB&lt;/strong&gt; free storage&lt;/li&gt;
  &lt;li&gt;USB-C cable (one is enough)&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;picocom&lt;/code&gt; installed: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;brew install picocom&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;The JetPack ISO (3.9 GB download, see below)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;step-1-download-the-iso&quot;&gt;Step 1: Download the ISO&lt;/h2&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;curl &lt;span class=&quot;nt&quot;&gt;-L&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; jetson-thor-r38.4.iso &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
  &lt;span class=&quot;s2&quot;&gt;&quot;https://developer.nvidia.com/downloads/embedded/L4T/r38_Release_v4.0/release/jetsoninstaller-r38.4.0-2025-12-30-17-00-37-arm64.iso&quot;&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# 3.9 GB, 4,207,276,032 bytes&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The ISO is on NVIDIA’s &lt;a href=&quot;https://developer.nvidia.com/embedded/jetpack/downloads&quot;&gt;JetPack downloads page&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;step-2-write-the-iso-to-usb-macos&quot;&gt;Step 2: Write the ISO to USB (macOS)&lt;/h2&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;diskutil list external  &lt;span class=&quot;c&quot;&gt;# find your USB drive, e.g. /dev/disk4&lt;/span&gt;
diskutil unmountDisk /dev/disk4
&lt;span class=&quot;nb&quot;&gt;sudo dd &lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;jetson-thor-r38.4.iso &lt;span class=&quot;nv&quot;&gt;of&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;/dev/rdisk4 &lt;span class=&quot;nv&quot;&gt;bs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;4m
diskutil eject /dev/disk4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We wrote 4,207,276,032 bytes in 45.7 seconds (~92 MB/s). After &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dd&lt;/code&gt; completes, macOS will
show a dialog: &lt;em&gt;“The disk you inserted was not readable.”&lt;/em&gt; Click &lt;strong&gt;Eject&lt;/strong&gt; or &lt;strong&gt;Ignore&lt;/strong&gt; —
the Jetson ISO9660 filesystem is not macOS-compatible by design. The USB stick is fine.&lt;/p&gt;

&lt;h2 id=&quot;step-3-connect-the-serial-console&quot;&gt;Step 3: Connect the Serial Console&lt;/h2&gt;

&lt;p&gt;Open the magnetic lid cover on the top of the Thor (see the picture above). Connect a USB-C cable from your Mac to
the &lt;strong&gt;Debug-USB port&lt;/strong&gt; inside.&lt;/p&gt;

&lt;p&gt;macOS exposes &lt;strong&gt;four&lt;/strong&gt; serial devices for this port. Different boot stages use different ones:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;macOS device suffix&lt;/th&gt;
      &lt;th&gt;Boot stage&lt;/th&gt;
      &lt;th&gt;Notes&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/dev/cu.usbmodem…B4&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;UEFI / GRUB menu&lt;/td&gt;
      &lt;td&gt;This is the interactive one&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/dev/cu.usbmodem…B2&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Kernel dmesg&lt;/td&gt;
      &lt;td&gt;Useful to watch progress&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/dev/cu.usbmodem…B?&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Other subsystems&lt;/td&gt;
      &lt;td&gt;Usually quiet during install&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Open two terminal tabs:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# Tab 1 — GRUB and UEFI interaction&lt;/span&gt;
picocom &lt;span class=&quot;nt&quot;&gt;-b&lt;/span&gt; 115200 /dev/cu.usbmodemTOPOA735A12B4

&lt;span class=&quot;c&quot;&gt;# Tab 2 — kernel messages&lt;/span&gt;
picocom &lt;span class=&quot;nt&quot;&gt;-b&lt;/span&gt; 115200 /dev/cu.usbmodemTOPOA735A12B2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Your device names will differ from ours — the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TOPOA735A12&lt;/code&gt; part comes from the board serial
number. Use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ls /dev/cu.usbmodem*&lt;/code&gt; to see what appeared after connecting the cable.&lt;/p&gt;

&lt;p&gt;To exit picocom when you are done: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Ctrl-A&lt;/code&gt;, then &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Ctrl-X&lt;/code&gt;. Set your terminal size
to &lt;strong&gt;242×61&lt;/strong&gt; for proper display — especially relevant during the OOBE wizard later.&lt;/p&gt;

&lt;p&gt;Two notes from hard experience:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/dev/cu.debug-console&lt;/code&gt; does &lt;strong&gt;not&lt;/strong&gt; work for any stage. Ignore it.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;screen&lt;/code&gt; has reliability issues with these devices. Use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;picocom&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;step-4-boot-from-usb-and-flash-to-nvme&quot;&gt;Step 4: Boot from USB and Flash to NVMe&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;Insert the USB stick into one of the &lt;strong&gt;USB-A&lt;/strong&gt; ports on the I/O side of the Thor&lt;/li&gt;
  &lt;li&gt;Power on (press the power button)&lt;/li&gt;
  &lt;li&gt;In &lt;strong&gt;Tab 1&lt;/strong&gt; (B4): press Enter at the pre-boot prompt, wait for the GRUB menu&lt;/li&gt;
  &lt;li&gt;In the GRUB menu, select:
&lt;strong&gt;“Flash Jetson AGX Thor Developer Kit on NVMe r38.4.0”&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
  &lt;p&gt;The GRUB menu also offers a “USB” flash target — do not use that. It installs onto
the USB stick itself, not the internal NVMe SSD.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;After you confirm the selection, switch to &lt;strong&gt;Tab 2&lt;/strong&gt; (B2). The kernel will print boot messages
for a minute, then go silent. That silence lasts about &lt;strong&gt;10 minutes&lt;/strong&gt;. This is the actual
installation. The installer output goes to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ttyUTC0&lt;/code&gt; (a hardware UART, not accessible from
the USB serial ports), so there is nothing to watch. This is normal. Do not power-cycle.&lt;/p&gt;

&lt;p&gt;Installation is complete when Tab 2 shows:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[   12.260117] nvidia-modeset: WARNING: HW supports 8 heads. Limiting to 4 heads
[   12.260117] Please complete NVIDIA OOBE on the serial port provided by Jetson&apos;s USB device mode connection. e.g.
  /dev/ttyACMx where x can 0, 1, 2 etc.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ttyACMx&lt;/code&gt; hint in the message — that is a Linux device path. On macOS there is no
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ttyACM&lt;/code&gt;. Ignore that hint. The correct macOS device appears after the cable swap below.&lt;/p&gt;

&lt;p&gt;Remove the USB stick. Do not power off yet.&lt;/p&gt;

&lt;h2 id=&quot;step-5-the-oobe-trap&quot;&gt;Step 5: The OOBE Trap&lt;/h2&gt;

&lt;p&gt;This is the part that caught us. The message above says to complete OOBE on &lt;em&gt;“the serial port
provided by Jetson’s USB device mode connection.”&lt;/em&gt; On the Debug-USB port (the one behind the lid),
the OOBE wizard does &lt;strong&gt;not&lt;/strong&gt; appear. Nothing shows. You wait. Still nothing.&lt;/p&gt;

&lt;p&gt;This is a known JetPack 7.0 limitation. NVIDIA has acknowledged it on the developer forum
(&lt;a href=&quot;https://forums.developer.nvidia.com/t/first-boot-headless-mode-asks-for-login-password/351064&quot;&gt;thread here&lt;/a&gt;) with a fix targeted for 7.1. The workaround:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Disconnect&lt;/strong&gt; the USB-C cable from the Debug-USB port (behind the lid)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Reconnect&lt;/strong&gt; the same cable to the &lt;strong&gt;USB-C port 5b&lt;/strong&gt; on the I/O side (the middle USB-C port,
next to the two USB-A ports)&lt;/li&gt;
  &lt;li&gt;A new tty device appears on macOS. It keeps the same board-serial prefix as the Debug-USB
devices but with a different suffix — on our board it was &lt;strong&gt;B6&lt;/strong&gt;:
    &lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;ls&lt;/span&gt; /dev/tty.usbmodem&lt;span class=&quot;k&quot;&gt;*&lt;/span&gt;    &lt;span class=&quot;c&quot;&gt;# find the new device&lt;/span&gt;
picocom &lt;span class=&quot;nt&quot;&gt;-b&lt;/span&gt; 115200 /dev/tty.usbmodemTOPOA735A12B6
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;The OOBE wizard appears in &lt;strong&gt;text mode&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
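
&lt;p&gt;If several &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;usbmodem&lt;/code&gt; devices are attached, it can be hard to tell which one appeared after the cable swap. A minimal sketch, assuming you capture the device list before and after moving the cable; the helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;new_tty_devices&lt;/code&gt; is hypothetical and not part of any tool mentioned here:&lt;/p&gt;

```shell
# Hypothetical helper: print entries that are in the "after" listing
# but not in the "before" listing. Both arguments are newline-separated lists.
new_tty_devices() {
  before_file=$(mktemp)
  after_file=$(mktemp)
  # comm requires sorted input
  printf '%s\n' "$1" | sort > "$before_file"
  printf '%s\n' "$2" | sort > "$after_file"
  # -13 keeps only the lines unique to the second file
  comm -13 "$before_file" "$after_file"
  rm -f "$before_file" "$after_file"
}

# Usage on macOS (commented out so the snippet can be sourced without hardware):
# before=$(ls /dev/tty.usbmodem* 2>/dev/null)
# ... move the cable from the Debug-USB port to port 5b ...
# after=$(ls /dev/tty.usbmodem* 2>/dev/null)
# new_tty_devices "$before" "$after"
```

&lt;p&gt;The printed path is the one to pass to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;picocom&lt;/code&gt;.&lt;/p&gt;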

&lt;p&gt;Step through the wizard:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Accept the EULA&lt;/li&gt;
  &lt;li&gt;Create a user account&lt;/li&gt;
  &lt;li&gt;Set the hostname&lt;/li&gt;
  &lt;li&gt;Configure network — select &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;enP2p1s0&lt;/code&gt; for the RJ45 Ethernet&lt;/li&gt;
  &lt;li&gt;Set timezone&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After completing OOBE the device reboots into a fresh Ubuntu 24.04.3 system.&lt;/p&gt;

&lt;h2 id=&quot;step-6-ssh-setup&quot;&gt;Step 6: SSH Setup&lt;/h2&gt;

&lt;p&gt;Once the device is on the network, set up key-based SSH from macOS so you never need the
serial console again.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# Generate a dedicated key&lt;/span&gt;
ssh-keygen &lt;span class=&quot;nt&quot;&gt;-t&lt;/span&gt; ed25519 &lt;span class=&quot;nt&quot;&gt;-f&lt;/span&gt; ~/.ssh/thor-04 &lt;span class=&quot;nt&quot;&gt;-N&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-C&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;jonnyzzz@thor-04&quot;&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# Upload the key (first time needs the password)&lt;/span&gt;
expect &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EXPECT&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;
set pub [exec cat ~/.ssh/thor-04.pub]
spawn ssh -o StrictHostKeyChecking=no -o PubkeyAuthentication=no jetbrains@thor-04 &lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;
  &quot;mkdir -p ~/.ssh &amp;amp;&amp;amp; echo &apos;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$pub&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos; &amp;gt;&amp;gt; ~/.ssh/authorized_keys &amp;amp;&amp;amp; chmod 700 ~/.ssh &amp;amp;&amp;amp; chmod 600 ~/.ssh/authorized_keys&quot;
expect &quot;*assword*&quot;
send &quot;jetbrains&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\r&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;
expect eof
&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;EXPECT
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Add to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/.ssh/config&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Host thor-04 thor-04.local
    HostName thor-04
    User jetbrains
    IdentityFile ~/.ssh/thor-04
    IdentitiesOnly yes
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Test it: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ssh thor-04&lt;/code&gt; should land you at a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Linux thor-04 6.8.12-tegra aarch64&lt;/code&gt; prompt
without a password.&lt;/p&gt;
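
&lt;p&gt;An interactive login can silently fall back to a password prompt, which hides a broken key setup. A minimal sketch of a non-interactive check, assuming the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;thor-04&lt;/code&gt; host entry from the config above; the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;check_key_login&lt;/code&gt; is hypothetical:&lt;/p&gt;

```shell
# Hypothetical helper: succeed only if key-based login works.
# BatchMode=yes makes ssh fail instead of asking for a password;
# ConnectTimeout=5 keeps the check short when the host is unreachable.
check_key_login() {
  host="$1"
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
    echo "key login OK: $host"
  else
    echo "key login FAILED: $host"
    return 1
  fi
}

# Usage: check_key_login thor-04
```

&lt;p&gt;A failure here usually means the public key did not land in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/.ssh/authorized_keys&lt;/code&gt; on the device, or the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IdentityFile&lt;/code&gt; path in the config is wrong.&lt;/p&gt;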

&lt;h2 id=&quot;step-7-post-install-configuration&quot;&gt;Step 7: Post-Install Configuration&lt;/h2&gt;

&lt;p&gt;With SSH working, the rest is straightforward over the network:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Passwordless sudo:&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;ssh thor-04
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;usermod &lt;span class=&quot;nt&quot;&gt;-aG&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;jetbrains
&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;jetbrains ALL=(ALL) NOPASSWD: ALL&quot;&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;sudo tee&lt;/span&gt; /etc/sudoers.d/jetbrains
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Basic tools:&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get &lt;span class=&quot;nb&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-y&lt;/span&gt; curl mc vim
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Tailscale&lt;/strong&gt; (if you want the device on your VPN):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;curl &lt;span class=&quot;nt&quot;&gt;-fsSL&lt;/span&gt; https://tailscale.com/install.sh | sh
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Docker:&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get update &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;apt-get &lt;span class=&quot;nb&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-y&lt;/span&gt; docker.io
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;usermod &lt;span class=&quot;nt&quot;&gt;-aG&lt;/span&gt; docker jetbrains
&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;systemctl &lt;span class=&quot;nb&quot;&gt;enable &lt;/span&gt;docker &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;systemctl restart docker
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After a logout/login cycle, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker ps&lt;/code&gt; works without sudo. We got Docker 29.1.3.&lt;/p&gt;

&lt;h2 id=&quot;uefi-and-iso-compatibility&quot;&gt;UEFI and ISO Compatibility&lt;/h2&gt;

&lt;p&gt;One more thing worth knowing before you try to reinstall later. There is a UEFI version
dependency between the ISO revision and the firmware on the board:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;UEFI Version&lt;/th&gt;
      &lt;th&gt;ISO r38.2&lt;/th&gt;
      &lt;th&gt;ISO r38.4&lt;/th&gt;
      &lt;th&gt;Notes&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;r38.0.0 (factory)&lt;/td&gt;
      &lt;td&gt;Yes&lt;/td&gt;
      &lt;td&gt;Yes&lt;/td&gt;
      &lt;td&gt;No settings change needed&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;r38.2.x&lt;/td&gt;
      &lt;td&gt;Yes&lt;/td&gt;
      &lt;td&gt;Yes&lt;/td&gt;
      &lt;td&gt;Enable Display Handoff mode before USB&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;r38.4.x&lt;/td&gt;
      &lt;td&gt;No&lt;/td&gt;
      &lt;td&gt;Yes&lt;/td&gt;
      &lt;td&gt;Cannot downgrade to an older ISO&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;If you flashed once with r38.4.x and want to reinstall: you must use the r38.4 ISO.
Downgrading the UEFI is not supported.&lt;/p&gt;
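
&lt;p&gt;The table can also be written as a small lookup, which is handy if you script reinstalls across several boards. This is only a sketch of the rows above; the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;iso_is_compatible&lt;/code&gt; is hypothetical, and it does not capture the Display Hand-Off setting needed on UEFI r38.2.x:&lt;/p&gt;

```shell
# Hypothetical helper: exit status 0 if the given ISO revision can be installed
# on a board with the given UEFI revision, following the table in this post.
iso_is_compatible() {
  uefi="$1"   # e.g. r38.0.0, r38.2.1, r38.4.0
  iso="$2"    # r38.2 or r38.4
  case "$uefi:$iso" in
    r38.0.*:r38.2|r38.0.*:r38.4) return 0 ;;  # factory UEFI accepts both ISOs
    r38.2.*:r38.2|r38.2.*:r38.4) return 0 ;;  # both ISOs, after the UEFI setting change
    r38.4.*:r38.4)               return 0 ;;  # only the r38.4 ISO
    *)                           return 1 ;;  # everything else, including downgrades
  esac
}

# Usage:
# if iso_is_compatible r38.4.0 r38.2; then echo "ok"; else echo "not supported"; fi
```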

&lt;p&gt;For reinstalls with UEFI r38.2.x, before inserting the USB stick, go into:
&lt;strong&gt;UEFI → Device Manager → NVIDIA Configuration → Boot Configuration
→ SOC Display Hand-Off Mode → “Auto”, Method → “efifb”.&lt;/strong&gt;&lt;/p&gt;

&lt;h2 id=&quot;what-came-out-the-other-side&quot;&gt;What Came Out the Other Side&lt;/h2&gt;

&lt;p&gt;After following this path, the Thor is running:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Ubuntu 24.04.3 LTS, kernel 6.8.12-tegra&lt;/li&gt;
  &lt;li&gt;CUDA 13.0, NVIDIA driver 580.00&lt;/li&gt;
  &lt;li&gt;Docker 29.1.3&lt;/li&gt;
  &lt;li&gt;Tailscale, SSH key auth, passwordless sudo&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The device is now a GPU node on our local network. Next: run inference workloads on it.
The Jetson AGX Thor with its NVIDIA Thor GPU and unified memory is a different beast
from the DGX Spark we covered in &lt;a href=&quot;/blog/2025/11/26/junie-cage-spark/&quot;&gt;earlier posts&lt;/a&gt; — smaller footprint,
ARM architecture, different memory model. The setup experience surfaced a few quirks
specific to JetPack 7.0 that we have not seen documented well elsewhere.&lt;/p&gt;

&lt;p&gt;If you run into the OOBE freeze, check the USB-C port. That one confused us for a while.&lt;/p&gt;

&lt;p&gt;Questions or corrections: reach me on &lt;a href=&quot;https://www.linkedin.com/in/jonnyzzz/&quot;&gt;LinkedIn&lt;/a&gt; or &lt;a href=&quot;https://x.com/jonnyzzz&quot;&gt;X&lt;/a&gt;.&lt;/p&gt;</content>

  
  
  
  
  

    <author>
      <name>Eugene Petrenko</name>
    </author>

  
    <category term="nvidia" />
  
    <category term="jetson" />
  
    <category term="LocalAI" />
  
    <category term="hardware" />
  
    <category term="cuda" />
  
    <summary type="html">Setting up a Jetson AGX Thor Developer Kit headlessly from macOS — no monitor, no Ethernet on day one, just a USB-C cable and picocom. The OOBE wizard has a bug in JetPack 7.0 that silently drops you on the wrong serial port. Here is the working path.</summary>
  
  </entry>
  
  <entry>
    <title type="html">IntelliJ as a Skill Factory</title>
    <link href="https://jonnyzzz.com/blog/2026/04/08/mcp-steroid-skill-factory/" rel="alternate" type="text/html" title="IntelliJ as a Skill Factory" />
    <published>2026-04-08T00:00:00+00:00</published>
    <updated>2026-04-08T00:00:00+00:00</updated>
    <id>/blog/2026/04/08/mcp-steroid-skill-factory</id>
    <content type="html" xml:base="https://jonnyzzz.com/blog/2026/04/08/mcp-steroid-skill-factory/">&lt;p&gt;Every time your agent needs to use an IntelliJ API it hasn’t seen before, it burns tokens figuring
out PSI trees and threading rules. It tries something, gets a compilation error, tries again, gets
a wrong result, tries a third time. Eventually it works. Then the next agent – or the same agent
in a new session – starts the whole cycle over.&lt;/p&gt;

&lt;p&gt;I &lt;a href=&quot;/blog/2026/04/07/mcp-steroid-open-source/&quot;&gt;open-sourced MCP Steroid&lt;/a&gt; yesterday.
That post covered the what and why. This one is about the pattern that makes the plugin actually
useful day to day: &lt;strong&gt;skills&lt;/strong&gt;.&lt;/p&gt;

&lt;h2 id=&quot;two-islands&quot;&gt;Two Islands&lt;/h2&gt;

&lt;p&gt;Here’s how I think about the problem. There are
&lt;a href=&quot;/blog/2026/03/24/agentic-experience-and-tools/&quot;&gt;two islands&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Island one&lt;/strong&gt; is the IDE world. The IntelliJ Platform has deep code understanding – control flow
graphs, cross-language refactorings, structural search, a debugger, Gradle integration, Spring
integration, and a long tail of other features. It’s an extraordinary amount of engineering.
But it was all built for humans clicking menus. IntelliJ IDEs
have shipped with a built-in MCP server since 2025.2, but it exposes a curated, fixed set of operations –
file edits, run configurations, basic code actions. A start, but not the full picture.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Island two&lt;/strong&gt; is the agentic world. AI agents that can reason, plan, iterate, and write code –
but they’re stuck either running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;grep&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sed&lt;/code&gt;, or limited to predefined tool menus, never
touching the real depth of what the IDE already knows about the codebase.&lt;/p&gt;

&lt;p&gt;MCP Steroid bridges them fully. An agent sends Kotlin through &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;steroid_execute_code&lt;/code&gt;, and that code
runs inside the IDE’s JVM with full access to everything IntelliJ knows. Not a curated subset.
Not a fixed menu. The actual APIs that power the IDE’s own features. No abstractions skipped.&lt;/p&gt;

&lt;h2 id=&quot;the-skill-factory&quot;&gt;The Skill Factory&lt;/h2&gt;

&lt;p&gt;Here’s the workflow that changed how I use the plugin.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase one: research.&lt;/strong&gt; The agent experiments with IntelliJ APIs via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;steroid_execute_code&lt;/code&gt;. It
tries things, fails, iterates. This phase is messy – maybe 4-8 retries, a lot of tokens spent on
compilation errors and wrong API assumptions – including
&lt;a href=&quot;/blog/2026/02/12/jvm-classloading-intellij/&quot;&gt;JVM classloading quirks&lt;/a&gt; that trip up
even experienced developers. That’s fine. The agent is learning.&lt;/p&gt;

&lt;p&gt;In this phase the agent uses the 60+ resources that already ship in the MCP Steroid package.
For really complex tasks, I also recommend adding the &lt;a href=&quot;https://github.com/JetBrains/intellij-community&quot;&gt;IntelliJ Community&lt;/a&gt; sources
and the sources of your third-party plugins, and asking the agent to research those as well.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase two: encapsulation.&lt;/strong&gt; Once the agent figures out how to solve the problem, the working
approach gets wrapped into a reusable skill – a Kotlin snippet stored as a markdown document.
Now any agent can call it without re-discovering the API surface.&lt;/p&gt;

&lt;p&gt;That’s the skill factory. Each solved problem becomes a skill. Skills accumulate. The agent that
struggled for 8 retries to find deprecated methods with zero usages? Next time it takes one call.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/JetBrains/mcp-steroid&quot;&gt;MCP Steroid repo&lt;/a&gt; ships with built-in articles that
serve a dual purpose – they’re documentation for you, and reference material that agents read to
create new skills. When your agent needs to figure out VFS or the debugger API, it reads these
articles just like you would. Check the &lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/docs/&quot;&gt;docs&lt;/a&gt; for the
full set.&lt;/p&gt;

&lt;h2 id=&quot;why-skills-not-plugin-code&quot;&gt;Why Skills, Not Plugin Code&lt;/h2&gt;

&lt;p&gt;Here’s what makes skills compelling: writing a skill is one markdown file with a Kotlin code snippet.
Writing the same capability as traditional plugin code means a Gradle project, extension points,
build infrastructure, and a full development cycle. Skills skip all of that – the agent handles
compilation, threading, and iteration for you. You just need to point it in the right direction.&lt;/p&gt;

&lt;p&gt;This also works well with
&lt;a href=&quot;/blog/2026/01/30/orchestrating-ai-fleets/&quot;&gt;sub-agent architectures&lt;/a&gt;. A parent agent
can delegate a specific task to a sub-agent that reads only the relevant skill documentation from
MCP resources. The sub-agent iterates until it works, and the parent’s context stays clean. This is
how I run it in practice – the orchestrator picks the skill, the worker executes it.&lt;/p&gt;

&lt;h2 id=&quot;example-a-complete-skill&quot;&gt;Example: A Complete Skill&lt;/h2&gt;

&lt;p&gt;Here’s a working skill that finds all TODO comments in a project:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;com.intellij.psi.search.PsiSearchHelper&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;com.intellij.psi.search.GlobalSearchScope&lt;/span&gt;

&lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;todoItems&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;readAction&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;searchHelper&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;PsiSearchHelper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getInstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;project&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mutableListOf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;searchHelper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;processCommentsContainingIdentifier&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;TODO&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;GlobalSearchScope&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;projectScope&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;project&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;comment&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;-&amp;gt;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;${comment.containingFile.virtualFile.path}: ${comment.text.trim()}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;true&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;todoItems&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;forEach&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;println&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;An agent sends this through &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;steroid_execute_code&lt;/code&gt; and gets back every TODO comment with its file
path. No plugin development, no Gradle project, no build infrastructure. One snippet, one call.&lt;/p&gt;

&lt;p&gt;This is the kind of thing that takes an agent 4-8 retries to discover the first time –
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PsiSearchHelper&lt;/code&gt; isn’t obvious, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GlobalSearchScope&lt;/code&gt; parameter is easy to miss, and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;readAction&lt;/code&gt; is required for PSI access. Once it’s a skill, every future agent gets it in one shot.&lt;/p&gt;

&lt;h2 id=&quot;enterprise-your-own-skills&quot;&gt;Enterprise: Your Own Skills&lt;/h2&gt;

&lt;p&gt;Even if you’re behind an enterprise firewall with proprietary IntelliJ plugins and internal APIs,
you can create skills that wrap your own extensions. Your agent doesn’t need to know the internals
of your company’s custom inspection plugin – it just calls the skill.&lt;/p&gt;

&lt;p&gt;This is deliberately the same pattern. Open-source skills for the public IntelliJ APIs, private
skills for your internal tooling. A team can build up a skill library for their specific codebase
and workflows, and every agent on the team benefits.&lt;/p&gt;

&lt;h2 id=&quot;whats-next&quot;&gt;What’s Next&lt;/h2&gt;

&lt;p&gt;The immediate focus is skill coverage. More documented API patterns, more worked examples, more
pre-built skills. The goal is to shrink the research phase for common tasks until it’s nearly zero.&lt;/p&gt;

&lt;p&gt;Longer term, I want to explore event-driven skills. IntelliJ already has APIs for subscribing to
events – a commit happens, a test fails, an inspection fires. An agent that reacts to those events
automatically, running the right skill in response, would be genuinely useful. The event APIs
exist. We haven’t wired them into MCP Steroid yet.&lt;/p&gt;

&lt;p&gt;And headless support – running the plugin in Docker containers, in CI – is an active investment.
We already &lt;a href=&quot;/blog/2026/02/21/testing-mcp-server-with-ai-agents/&quot;&gt;test MCP servers with real agents in Docker&lt;/a&gt;.
Agents don’t always need a GUI, and we want MCP Steroid to work without one. That requires
packaging work that JetBrains infrastructure makes possible.&lt;/p&gt;

&lt;h2 id=&quot;get-involved&quot;&gt;Get Involved&lt;/h2&gt;

&lt;p&gt;The best way to contribute to MCP Steroid is to create a skill. Pick an IDE capability you use
manually – navigate to declaration, find all usages of a deprecated API, list failing tests with
their stack traces, whatever is useful in your workflow. Point your agent at it. Let it figure
out the API. When it works, save the Kotlin as a skill file and open a pull request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Star and fork the repository&lt;/strong&gt; to get started:
&lt;a href=&quot;https://github.com/jonnyzzz/mcp-steroid&quot;&gt;github.com/jonnyzzz/mcp-steroid&lt;/a&gt; (original source) and
&lt;a href=&quot;https://github.com/JetBrains/mcp-steroid&quot;&gt;github.com/JetBrains/mcp-steroid&lt;/a&gt; (JetBrains fork).
The &lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/docs/&quot;&gt;docs&lt;/a&gt; have worked examples, and the
&lt;a href=&quot;https://discord.com/invite/e9qgQ7NeTC&quot;&gt;Discord&lt;/a&gt; is where the community is building up.&lt;/p&gt;

&lt;p&gt;You don’t need to be an IntelliJ platform expert. The agent does the API exploration. You need to
know what IDE capability you want to expose – the rest is iteration.&lt;/p&gt;

&lt;h2 id=&quot;try-it&quot;&gt;Try It&lt;/h2&gt;

&lt;p&gt;According to the &lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/docs/strategy/&quot;&gt;MCP Steroid strategy&lt;/a&gt;, we start
from an explicit plugin for IntelliJ-based IDEs and Android Studio. The long-term target is a
headless, self-contained runtime – the IDE for AI agents without the UI. But you can use it today.&lt;/p&gt;

&lt;p&gt;You’ll need any IntelliJ-based IDE or Android Studio (version 2025.3+). Install the plugin from &lt;a href=&quot;https://github.com/JetBrains/mcp-steroid/releases&quot;&gt;GitHub Releases&lt;/a&gt;
– JetBrains Marketplace listing is coming soon. Once started, the plugin generates
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.idea/mcp-steroid.md&lt;/code&gt; in your project with instructions. Follow those to register the MCP Server
with your agent. The server uses the Streamable HTTP protocol.&lt;/p&gt;

&lt;h2 id=&quot;poc-program&quot;&gt;PoC Program&lt;/h2&gt;

&lt;p&gt;We’re opening a PoC program for companies that want MCP Steroid customized for their use cases –
internal skills, proprietary API integrations, team-specific workflows. If that’s interesting,
reach out to me on &lt;a href=&quot;https://www.linkedin.com/in/jonnyzzz/&quot;&gt;LinkedIn&lt;/a&gt;.&lt;/p&gt;</content>

  
  
  
  
  

    <author>
      <name>Eugene Petrenko</name>
    </author>

  
    <category term="mcp" />
  
    <category term="mcp-steroid" />
  
    <category term="intellij" />
  
    <category term="android-studio" />
  
    <category term="skills" />
  
    <category term="agentic-coding" />
  
    <category term="kotlin" />
  
    <summary type="html">Every time your agent hits an unfamiliar IntelliJ API, it burns tokens rediscovering PSI trees and threading rules. MCP Steroid turns that pain into reusable skills -- Kotlin snippets that any agent can call without re-learning the IDE.</summary>
  
  </entry>
  
  <entry>
    <title type="html">MCP Steroid Is Now Open Source</title>
    <link href="https://jonnyzzz.com/blog/2026/04/07/mcp-steroid-open-source/" rel="alternate" type="text/html" title="MCP Steroid Is Now Open Source" />
    <published>2026-04-07T00:00:00+00:00</published>
    <updated>2026-04-07T00:00:00+00:00</updated>
    <id>/blog/2026/04/07/mcp-steroid-open-source</id>
    <content type="html" xml:base="https://jonnyzzz.com/blog/2026/04/07/mcp-steroid-open-source/">&lt;p&gt;MCP Steroid is now open source under the Apache 2.0 license. The source is at
&lt;a href=&quot;https://github.com/jonnyzzz/mcp-steroid&quot;&gt;github.com/jonnyzzz/mcp-steroid&lt;/a&gt;, with the project
transitioning under JetBrains at &lt;a href=&quot;https://github.com/JetBrains/mcp-steroid&quot;&gt;github.com/JetBrains/mcp-steroid&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;the-bridge&quot;&gt;The Bridge&lt;/h2&gt;

&lt;p&gt;AI Agents can think. They cannot always act. The gap between what agents can reason about and
what they can actually do inside a codebase – that’s the bottleneck I’ve been working on.&lt;/p&gt;

&lt;p&gt;I build products where the AI Agent is the user. No UI – APIs, CLIs, MCPs. People keep designing
products for humans with AI bolted on. I’m doing the opposite: we make JetBrains tools available
for AI agents, not for humans.&lt;/p&gt;

&lt;p&gt;IntelliJ-based IDEs have shipped with a built-in MCP server since 2025.2 – curated tools for common
operations like running configurations, file edits, and basic code actions. Useful, but fixed.&lt;/p&gt;

&lt;p&gt;MCP Steroid is different. It gives AI Agents the ability to write and execute arbitrary Kotlin code against
the full IntelliJ Platform API. Not a predefined menu of operations – the actual APIs that power
the IDE’s own features: inspections, refactorings, the debugger, PSI trees, screenshots.
It can access any plugins and third-party extensions too! AI Agents
get the whole IDE, not a curated subset. On &lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/docs/strategy/&quot;&gt;DPAI Arena&lt;/a&gt;
benchmarks, agents with MCP Steroid are 20-54% faster on tasks requiring semantic understanding –
54% on rename-refactoring across 9 files, 20% on multi-layer generation across 15 files. Simple text
replacements show no improvement, which is expected.&lt;/p&gt;

&lt;p&gt;I wrote more about this framing in
&lt;a href=&quot;/blog/2026/03/24/agentic-experience-and-tools/&quot;&gt;Agentic Experience and Tools&lt;/a&gt;
and in the &lt;a href=&quot;/blog/2026/01/04/mcp-steroids-intellij/&quot;&gt;original MCP Steroid post&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;how-it-got-here&quot;&gt;How It Got Here&lt;/h2&gt;

&lt;p&gt;I started MCP Steroid in December 2025 as a free-time experiment – one engineer building a Bridge
between AI Agents and IntelliJ’s APIs. It was built with AI agents. The project uses MCP Steroid
itself plus a &lt;a href=&quot;/blog/2026/02/06/run-agent-multi-agent-orchestration/&quot;&gt;run-agent.sh&lt;/a&gt;
swarm for development. The main effort went into evals and integration tests – making sure agents
actually succeed at real tasks, not just compile. The MCP Steroid plugin learned from its own development too.&lt;/p&gt;

&lt;p&gt;The codebase has 84 prompt markdown files – 37% of production source by line count – that teach
agents how to navigate IntelliJ APIs: the debugger, refactorings, inspections, test runners, VCS.
Just like JetBrains IDEs make software developers more professional, these resources make AI agents
more professional. There are integration tests for Claude Code, Codex, and Gemini, all
&lt;a href=&quot;/blog/2026/02/21/testing-mcp-server-with-ai-agents/&quot;&gt;running in Docker&lt;/a&gt;
with full IDE containers.&lt;/p&gt;

&lt;p&gt;We validated that AI Agents can write and call Kotlin code to solve specific tasks with the help of IntelliJ.
This is quality work and a real experiment, with 1,690 commits over four months, not a rough prototype –
see the &lt;a href=&quot;/blog/2026/02/23/mcp-steroid-project-assessment/&quot;&gt;project assessment at 75 days&lt;/a&gt;
for the earlier snapshot.&lt;/p&gt;

&lt;p&gt;In March 2026, we agreed to move the project under JetBrains and start using it internally. The original source is at
&lt;a href=&quot;https://github.com/jonnyzzz/mcp-steroid&quot;&gt;github.com/jonnyzzz/mcp-steroid&lt;/a&gt;, with the JetBrains fork
at &lt;a href=&quot;https://github.com/JetBrains/mcp-steroid&quot;&gt;github.com/JetBrains/mcp-steroid&lt;/a&gt;, released under
Apache 2.0. I’m still the main contributor.&lt;/p&gt;

&lt;h2 id=&quot;skills--a-tease&quot;&gt;Skills – a Tease&lt;/h2&gt;

&lt;p&gt;The real value isn’t individual API calls. It’s that solved problems accumulate into reusable skills.
An agent struggles through IntelliJ’s API once, figures out the working approach, and that approach
gets wrapped into a skill any agent can call without re-discovering the surface.&lt;/p&gt;

&lt;p&gt;Prototyping a skill has become an order of magnitude cheaper –
have an agent research the &lt;a href=&quot;https://github.com/JetBrains/intellij-community&quot;&gt;IntelliJ Community&lt;/a&gt; sources
and figure out how to solve your task. Then turn the result into a skill that uses the MCP Steroid plugin,
and eval it.&lt;/p&gt;

&lt;p&gt;I’ve got a dedicated post about the
&lt;a href=&quot;/blog/2026/04/08/mcp-steroid-skill-factory/&quot;&gt;skill factory&lt;/a&gt; coming next –
with code examples, the two-phase workflow, and the enterprise angle. For now: think of it as the
difference between giving an agent a tool and giving it a way to build its own tools.&lt;/p&gt;

&lt;h2 id=&quot;try-it&quot;&gt;Try It&lt;/h2&gt;

&lt;p&gt;According to our &lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/docs/strategy/&quot;&gt;strategy&lt;/a&gt;, we start from an
explicit plugin for IntelliJ-based IDEs (including any third-party IDEs, e.g. &lt;strong&gt;Android Studio&lt;/strong&gt;).
The long-term target is a headless,
self-contained runtime – available as SaaS and as an end-user product – the headless
&lt;strong&gt;IDE for AI Agents&lt;/strong&gt;. The project requires investment in packaging for headless environments, and that’s where
JetBrains infrastructure matters.&lt;/p&gt;

&lt;p&gt;To get started today: install the MCP Steroid plugin in any IntelliJ-based IDE or Android Studio
(version 2025.3+). The plugin generates &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.idea/mcp-steroid.md&lt;/code&gt; in your project with instructions.
Follow those instructions to register the MCP Server with your agent. The server uses the
Streamable HTTP protocol.&lt;/p&gt;

&lt;p&gt;Quick sanity check – ask your agent to run Kotlin code like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;println(project.name)&lt;/code&gt; or ask it,
“What do I see in my IDE?”&lt;/p&gt;

&lt;p&gt;See it in action: &lt;a href=&quot;https://www.youtube.com/playlist?list=PLitZWClhc4Qgz3w8qrtctMR_lpIc81n0f&quot;&gt;YouTube playlist&lt;/a&gt;,
and here’s &lt;a href=&quot;https://www.youtube.com/watch?v=HtDDNyAoLak&quot;&gt;Codex debugging an application in IntelliJ IDEA&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;get-involved&quot;&gt;Get Involved&lt;/h2&gt;

&lt;p&gt;The project is open source, and the community is just getting started. If you’ve explored IntelliJ
APIs, built agents, or solved IDE automation problems – that knowledge is useful here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Star and fork the repository.&lt;/strong&gt; The original source is at
&lt;a href=&quot;https://github.com/jonnyzzz/mcp-steroid&quot;&gt;github.com/jonnyzzz/mcp-steroid&lt;/a&gt; and the JetBrains fork
is at &lt;a href=&quot;https://github.com/JetBrains/mcp-steroid&quot;&gt;github.com/JetBrains/mcp-steroid&lt;/a&gt;. Fork it, experiment with it, write
a skill, open an issue, improve the docs. Contributions of any kind are welcome.&lt;/p&gt;

&lt;p&gt;You can also just try the plugin: download it from &lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/releases&quot;&gt;the website&lt;/a&gt;
or, soon, from JetBrains Marketplace. Share what you build, and
join the &lt;a href=&quot;https://discord.com/invite/e9qgQ7NeTC&quot;&gt;Discord&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;whats-next&quot;&gt;What’s Next&lt;/h2&gt;

&lt;p&gt;With JetBrains infrastructure, support, and investment, I believe MCP Steroid will ship
sooner and give AI Agents of any kind the tools human developers love in
IntelliJ-based products.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;/blog/2026/04/08/mcp-steroid-skill-factory/&quot;&gt;skill factory post&lt;/a&gt; is coming
tomorrow. And if you’re wondering what happens when the plugin ZIP exceeds 180 MB,
&lt;a href=&quot;/blog/2026/04/17/intellij-plugin-hot-reload-413/&quot;&gt;the 413 post&lt;/a&gt; covers that.
We’re also opening a proof-of-concept program for
companies that want to use MCP Steroid for their specific workflows. If that’s interesting,
reach out on &lt;a href=&quot;https://www.linkedin.com/in/jonnyzzz/&quot;&gt;LinkedIn&lt;/a&gt; or in the
&lt;a href=&quot;https://discord.com/invite/e9qgQ7NeTC&quot;&gt;Discord&lt;/a&gt;.&lt;/p&gt;</content>

  
  
  
  
  

    <author>
      <name>Eugene Petrenko</name>
    </author>

  
    <category term="mcp" />
  
    <category term="mcp-steroid" />
  
    <category term="intellij" />
  
    <category term="android-studio" />
  
    <category term="open-source" />
  
    <category term="agentic-coding" />
  
    <summary type="html">MCP Steroid -- the plugin that gives AI Agents the full IntelliJ IDE runtime -- is now open source under Apache 2.0. Original source at github.com/jonnyzzz/mcp-steroid, transitioning under JetBrains at github.com/JetBrains/mcp-steroid.</summary>
  
  </entry>
  
  <entry>
    <title type="html">Agentic Experience and Tools: My Program</title>
    <link href="https://jonnyzzz.com/blog/2026/03/24/agentic-experience-and-tools/" rel="alternate" type="text/html" title="Agentic Experience and Tools: My Program" />
    <published>2026-03-24T00:00:00+00:00</published>
    <updated>2026-03-24T00:00:00+00:00</updated>
    <id>/blog/2026/03/24/agentic-experience-and-tools</id>
    <content type="html" xml:base="https://jonnyzzz.com/blog/2026/03/24/agentic-experience-and-tools/">&lt;blockquote&gt;
  &lt;p&gt;AI Agents can think. They cannot always act. I build the &lt;strong&gt;bridges&lt;/strong&gt; between
agent intelligence and the real-world tools that agents cannot reach.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;img src=&quot;/images/agentic-experience-and-tools.png&quot; alt=&quot;Agentic Experience and Tools&quot; /&gt;&lt;/p&gt;

&lt;p&gt;I have spent twenty-plus years at &lt;a href=&quot;https://jetbrains.com&quot;&gt;JetBrains&lt;/a&gt; making software developers more
productive. I started &lt;a href=&quot;https://www.jetbrains.com/ide-services/&quot;&gt;IDE Services&lt;/a&gt; from zero, grew 
it to 500+ enterprise customers, and handed it to the team. I have given fifty-plus &lt;a href=&quot;/talks&quot;&gt;talks&lt;/a&gt; on developer
tooling around the world. &lt;strong&gt;I am a founder&lt;/strong&gt; – I create products, take them
from inception to stable results, and move on to the next thing. That is
what drives me.&lt;/p&gt;

&lt;p&gt;In late 2025 I saw a shift. AI Agents crossed a threshold. They could read
code, reason about architecture, propose changes, write tests, iterate
on feedback, deploy services. The intelligence was there.&lt;/p&gt;

&lt;p&gt;But agents live in a world of tokens and text. Our development
infrastructure – IDEs, CI pipelines, code review systems, debuggers –
lives in the human world. Agents cannot run your IDE’s thousand
inspections. They cannot trigger your CI pipeline. They cannot start a
debugger. They cannot navigate your code review system or debug your GUI app.&lt;/p&gt;

&lt;p&gt;The gap between what agents can think and what they can do is the
defining bottleneck of this era. &lt;strong&gt;I am building the bridges that close it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I call this domain &lt;strong&gt;Agentic Experience &amp;amp; Tools&lt;/strong&gt;. I work in that domain
within the scope of existing, established products, where Agentic Experience and Tools are the most
challenging to deliver.&lt;/p&gt;

&lt;h2 id=&quot;the-mission-agentic-experience--tools&quot;&gt;The Mission: Agentic Experience &amp;amp; Tools&lt;/h2&gt;

&lt;p&gt;I make AI Agents more effective at solving problems on real-world projects
by building the missing connections between agents and existing tools,
processes, and infrastructure – including optimizing how agents work
together, which models they use, and how they learn from their own runs.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The AI Agent is not an implementation detail. &lt;strong&gt;It is the user,&lt;/strong&gt; and the product persona.
The human is the stakeholder.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The biggest &lt;strong&gt;misconception&lt;/strong&gt; I see: people keep designing products for
humans with AI tools bolted on. I am doing the opposite. I build products
intended directly for agentic usage. No UX, no UI in the traditional
sense – APIs, &lt;a href=&quot;/blog/2026/02/20/cli-tools-for-ai-agents/&quot;&gt;CLIs&lt;/a&gt;, &lt;a href=&quot;https://modelcontextprotocol.io/&quot;&gt;MCPs&lt;/a&gt;, and
structured outputs that agents consume natively. Humans buy and install
these tools to improve the performance metrics of their agents. Later,
other AI Agents may do the buying too.&lt;/p&gt;

&lt;p&gt;Only agents can validate what is actually better for agents. So I use an
agentic learning process: agents define the evaluation, run it, and
self-improve based on collected data. When I built the
&lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/&quot;&gt;MCP Steroid&lt;/a&gt; debugger, one agent ran
the debug task through IntelliJ while another reviewed the logs and
rewrote the prompts. Multiple times. After iterations, the agent went from failing –
falling back to screenshots just to see what was happening – to solving
the debug task on the first attempt. That is the kind of improvement I
am after: measurable, agent-validated, compounding.&lt;/p&gt;

&lt;p&gt;I do not compete with AI Agents. I make every agent better. Whatever your
team already uses – Claude, Codex, Gemini, Junie, Cursor, Augment, Kilo –
my tools remove the walls between that agent and your development
infrastructure. I integrate with all of them. I replace none of them.
When there are no AI Agents in use yet, I enable them.&lt;/p&gt;

&lt;h2 id=&quot;the-agentic-loop&quot;&gt;The Agentic Loop&lt;/h2&gt;

&lt;p&gt;Agents need real-world feedback: build the code, run the tests, deploy
to staging. Once these loops are established, agents iterate on their
own, improving quality with each pass.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The principle: agents follow the same processes as humans.&lt;/strong&gt; No
shortcuts. No special agent-only paths. That is what makes the output
trustworthy. The human decides how much autonomy to grant, and signs off
on the result.&lt;/p&gt;

&lt;h2 id=&quot;what-is-still-broken&quot;&gt;What Is Still Broken&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Agents cannot react to events yet.&lt;/strong&gt; Today, a human kicks off agent
runs. Agents should wake up automatically when a CI build fails, a code
review gets comments, or a merge pipeline completes. I am looking for
design partners to implement event-driven workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setup takes too long.&lt;/strong&gt; Configuring the full stack is a 30-minute
process. The goal is one Docker command to start.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agents do not learn enough from failures.&lt;/strong&gt; Every agent run generates
telemetry. That data should feed back automatically – agents reporting
problems to other agents that fix them.&lt;/p&gt;

&lt;h2 id=&quot;what-i-have-built&quot;&gt;What I Have Built&lt;/h2&gt;

&lt;p&gt;I have built a portfolio of tools that address different parts of the gap:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/&quot;&gt;MCP Steroid&lt;/a&gt; gives agents access to
the full IntelliJ IDE runtime – inspections, refactorings, debugger,
screenshots. On DPAIA benchmarks, agents with MCP Steroid are
&lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/&quot;&gt;20-54% faster&lt;/a&gt; on tasks requiring
semantic understanding and multi-file refactoring. It works with Claude,
Codex, Gemini, Cursor, and any MCP client. See it in action: the
&lt;a href=&quot;https://youtube.com/playlist?list=PLitZWClhc4Qgz3w8qrtctMR_lpIc81n0f&quot;&gt;demo playlist&lt;/a&gt;
includes debugger integration, monorepo deep dives, and more.
&lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/releases/&quot;&gt;Try it&lt;/a&gt; on your IDE today.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://run-agent.jonnyzzz.com/&quot;&gt;run-agent.sh&lt;/a&gt; orchestrates agent
swarms with full isolation and traceability – up to 16 agents in
parallel, coordinating through an append-only
&lt;a href=&quot;https://run-agent.jonnyzzz.com/MESSAGE-BUS.md&quot;&gt;message bus&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/blog/2026/01/05/rlm-multi-agent-orchestration/&quot;&gt;RLM&lt;/a&gt; decomposes tasks that exceed one
context window into sub-agent work.&lt;/p&gt;

&lt;p&gt;Dedicated posts about each are coming.&lt;/p&gt;

&lt;h2 id=&quot;find-the-others&quot;&gt;Find the Others&lt;/h2&gt;

&lt;p&gt;I am looking for the people who are actually doing this – not
theorizing, not pitching, but deploying agents on real codebases and
hitting real walls.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If your agents are stuck&lt;/strong&gt; – they can generate code but cannot get it
reviewed, tested, merged, or deployed – I want to hear about it. That
is the exact gap I work on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you are building tools that agents use&lt;/strong&gt;, I want to compare notes.
What works, what breaks, what the models still cannot do.&lt;/p&gt;

&lt;p&gt;Join the conversation and follow me on &lt;a href=&quot;https://www.linkedin.com/in/jonnyzzz/&quot;&gt;LinkedIn&lt;/a&gt; – that
is where most of the discussion happens.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Andrej Karpathy on the &lt;a href=&quot;https://podcasts.apple.com/us/podcast/no-priors-artificial-intelligence-technology-startups/id1668002688&quot;&gt;No Priors podcast&lt;/a&gt;
(March 2026):&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Just I think everything, like so many things, even if they
don’t work, I think to a large
extent you feel like it’s a skill issue.
It’s not that the capability is not there; it’s that you
just haven’t found a way to string together what’s available. Like, I
didn’t give good enough instructions to the agents.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;hr /&gt;

&lt;p&gt;The agents are capable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The bridges are missing. I build them.&lt;/strong&gt; Reach out on &lt;a href=&quot;https://www.linkedin.com/in/jonnyzzz/&quot;&gt;LinkedIn&lt;/a&gt; or &lt;a href=&quot;https://x.com/jonnyzzz&quot;&gt;X&lt;/a&gt;.&lt;/p&gt;</content>

  
  
  
  
  

    <author>
      <name>Eugene Petrenko</name>
    </author>

  
    <category term="ai-agents" />
  
    <category term="mcp" />
  
    <category term="mcp-steroid" />
  
    <category term="ai-coding" />
  
    <category term="developer-experience" />
  
    <category term="agentic-experience" />
  
    <summary type="html">I build bridges between AI agents and the tools we humans use every day. I call this domain Agentic Experience and Tools. This is my program.</summary>
  
  </entry>
  
  <entry>
    <title type="html">run-agent.sh v2.0: Hardened Runner with 115 Tests, Random Selection, and Agent Environment Contract</title>
    <link href="https://jonnyzzz.com/blog/2026/03/17/run-agent-v2-release/" rel="alternate" type="text/html" title="run-agent.sh v2.0: Hardened Runner with 115 Tests, Random Selection, and Agent Environment Contract" />
    <published>2026-03-17T00:00:00+00:00</published>
    <updated>2026-03-17T00:00:00+00:00</updated>
    <id>/blog/2026/03/17/run-agent-v2-release</id>
    <content type="html" xml:base="https://jonnyzzz.com/blog/2026/03/17/run-agent-v2-release/">&lt;p&gt;I’ve been running AI Agents in parallel for weeks — Claude Code, Codex, Gemini —
and the launcher script that ties them together kept accumulating rough edges.
Stale PID files after crashes. No tests. No help output. And, as three AI reviewers
independently discovered, a command injection vulnerability hiding in plain sight.&lt;/p&gt;

&lt;p&gt;Today I’m releasing &lt;a href=&quot;https://github.com/jonnyzzz/run-agent/releases/tag/v2.0.0&quot;&gt;run-agent.sh v2.0&lt;/a&gt; — a major rewrite that fixes all of that.&lt;/p&gt;

&lt;p&gt;The script started as a quick 95-line wrapper. It now weighs in at 211 lines,
is backed by 115 acceptance tests, and adds random agent selection and a proper environment
contract for every spawned agent.&lt;/p&gt;

&lt;h2 id=&quot;what-is-run-agentsh&quot;&gt;What is run-agent.sh?&lt;/h2&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run-agent.sh&lt;/code&gt; enables your agents to start more agent processes for sub-tasks.
That plays nicely with Recursive Language Models (&lt;a href=&quot;https://jonnyzzz.com/RLM.md&quot;&gt;RLM.md&lt;/a&gt;)
or &lt;a href=&quot;https://run-agent.jonnyzzz.com/THE_PROMPT_v5.md&quot;&gt;THE_PROMPT_v5.md&lt;/a&gt; swarms,
where you use more agents. Put simply, running multiple agents lets each
agent do fewer things at once via delegation, which delivers better outcomes while avoiding
context rot or overflow.&lt;/p&gt;

&lt;p&gt;As an example, use a prompt like this:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&amp;lt;PUT YOUR TASK DESCRIPTION HERE&amp;gt;

In order to deliver on the task, you should use https://run-agent.jonnyzzz.com/run-agent.sh script
to start more tasks. You should follow the https://run-agent.jonnyzzz.com/THE_PROMPT_v5.md and
other files relative to it as the main process. Your purpose is to orchestrate and delegate
the work to other run-agent&apos;s which you start, you must not do the work yourself.
So create /loop when necessary to monitor the process. Never stop unless the work is completed.

All your prompts should use the https://run-agent.jonnyzzz.com/MESSAGE-BUS.md as the key
communication principle. 

Make sure you download the files locally and use the full paths to the files below.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href=&quot;https://run-agent.jonnyzzz.com&quot;&gt;run-agent.sh&lt;/a&gt; is a unified shell script that launches AI coding agents
in isolated sub-processes, each in the working directory you choose. Each invocation creates a timestamped folder with
the prompt copy, captured stdout/stderr, process metadata, and a copy of the
runner itself — full traceability for every agent execution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is a tool that helps AI Agents consolidate and cope with the work more effectively.&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./run-agent.sh claude ~/Work/project ./task-prompt.md
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It powers the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;marinade&lt;/code&gt; agentic orchestration framework, where a root
AI Agent spawns sub-agents in parallel.&lt;/p&gt;

&lt;h2 id=&quot;random-agent-selection&quot;&gt;Random agent selection&lt;/h2&gt;

&lt;p&gt;One thing I noticed while orchestrating multi-agent runs: I kept manually
rotating between agents to diversify the results. That felt like a job for the
script, not for me.&lt;/p&gt;

&lt;p&gt;The default agent is now &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;any&lt;/code&gt; — the script picks a random agent from the
available pool:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./run-agent.sh any ~/Work/project ./prompt.md
&lt;span class=&quot;c&quot;&gt;# AGENT_SELECTED=gemini&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# RUN_ID=run_20260317-134514-27961&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# RUN_DIR=./runs/run_20260317-134514-27961&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is useful when you want to let agents compete or diversify across runs.
When combined with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RUN_AGENT_AGENTS&lt;/code&gt;, the random selection respects the
restricted pool:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# Set at the environment level&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;export &lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;RUN_AGENT_AGENTS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;claude,codex

&lt;span class=&quot;c&quot;&gt;# The actual call&lt;/span&gt;
./run-agent.sh any ~/Work/project ./prompt.md
&lt;span class=&quot;c&quot;&gt;# Never picks gemini&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
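
&lt;p&gt;To make the idea concrete, here is a minimal sketch of pool-restricted random selection in bash.
The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pick_agent&lt;/code&gt; and the fallback pool are assumptions made for illustration;
treat it as one possible shape of the logic, not as a copy of the script.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Illustrative sketch only: pick_agent is a hypothetical name and the
# fallback pool below is an assumption, not copied from run-agent.sh.
pick_agent() {
  local pool="${RUN_AGENT_AGENTS:-codex,claude,gemini}"
  local IFS=,
  # Unquoted on purpose: with IFS set to a comma, the pool splits into
  # the positional parameters $1, $2, ...
  # Assumes the names were validated already, so no glob characters occur.
  set -- $pool
  # Drop a random number of leading entries (0 .. count-1), then print
  # whichever name is now first.
  shift $(( RANDOM % $# ))
  echo "$1"
}
```

&lt;p&gt;Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RUN_AGENT_AGENTS=claude,codex pick_agent&lt;/code&gt; prints one of the two names and can never print
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gemini&lt;/code&gt;, because the choice is made only among the entries of the restricted pool.&lt;/p&gt;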

&lt;h2 id=&quot;agent-availability-control&quot;&gt;Agent availability control&lt;/h2&gt;

&lt;p&gt;A new environment variable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RUN_AGENT_AGENTS&lt;/code&gt; controls which agents are
available. Agents not in the list are rejected and hidden from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--help&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# Only claude and codex are available&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;RUN_AGENT_AGENTS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;claude,codex ./run-agent.sh gemini ...
&lt;span class=&quot;c&quot;&gt;# stderr: Unknown agent: gemini&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# stderr: Known agents: claude,codex&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# exit 2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Invalid names in the list are caught at startup:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;RUN_AGENT_AGENTS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;claude,fakename ./run-agent.sh claude ...
&lt;span class=&quot;c&quot;&gt;# stderr: RUN_AGENT_AGENTS: unknown agent &apos;fakename&apos;. Built-in agents: codex,claude,gemini&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# exit 2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
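
&lt;p&gt;A membership check against a comma-separated pool can be sketched in a few lines of bash.
The helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_allowed_agent&lt;/code&gt; is hypothetical, and the sketch only covers the lookup;
it assumes the requested name itself contains no commas.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Illustrative sketch only: is_allowed_agent is a hypothetical helper,
# not the function used inside run-agent.sh.
is_allowed_agent() {
  local requested="$1" pool="$2"
  # Wrapping both sides in commas makes "claude" match the entry
  # "claude" but not a longer or shorter name that merely overlaps it.
  case ",$pool," in
    *",$requested,"*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_allowed_agent gemini "claude,codex"; then
  echo "allowed"
else
  echo "Unknown agent: gemini"
fi
```

&lt;p&gt;The comma padding is the design choice worth noting: a plain substring match would accept
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;claud&lt;/code&gt; for a pool that contains &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;claude&lt;/code&gt;.&lt;/p&gt;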

&lt;h2 id=&quot;environment-contract&quot;&gt;Environment contract&lt;/h2&gt;

&lt;p&gt;Before this release, agents had to guess paths or rely on convention.
Now every agent gets four exported environment variables — no guesswork:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Variable&lt;/th&gt;
      &lt;th&gt;Value&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RUNS_DIR&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Absolute path to the runs directory&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MESSAGE_BUS&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Absolute path to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MESSAGE-BUS.md&lt;/code&gt; (now inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RUNS_DIR&lt;/code&gt;)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RUN_ID&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Unique run identifier (e.g., &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_20260317-134514-27961&lt;/code&gt;)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PROMPT&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Absolute path to the copied prompt file&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Every agent knows exactly where it is, where to write messages,
and what run it belongs to.&lt;/p&gt;

&lt;p&gt;Additionally, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CLAUDECODE&lt;/code&gt; is explicitly &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unset&lt;/code&gt; before spawning —
preventing nested Claude Code runtime context from leaking into
sub-agents. I learned this the hard way: a Claude Code sub-agent
was picking up its parent’s runtime state and behaving differently
than when launched standalone.&lt;/p&gt;
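
&lt;p&gt;From the agent side, the contract can be consumed with a small guard like the sketch below.
The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;report_contract&lt;/code&gt; is hypothetical; only the four variable names come from the table above.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Illustrative sketch only: report_contract is a hypothetical helper a
# sub-agent wrapper could run first. It prints the four contract
# variables and stops at the first one that is missing or empty.
report_contract() {
  local name
  for name in RUNS_DIR MESSAGE_BUS RUN_ID PROMPT; do
    # ${!name} is bash indirect expansion: the value of the variable
    # whose name is stored in $name. The :- default keeps this safe
    # under "set -u" when the variable is not set at all.
    if [ -z "${!name:-}" ]; then
      echo "missing $name"
      return 1
    fi
    printf '%s=%s\n' "$name" "${!name}"
  done
}
```

&lt;p&gt;Failing fast on a missing variable is deliberate in this sketch: an agent that silently falls back to a guessed path
is exactly the behavior the contract is meant to remove.&lt;/p&gt;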

&lt;h2 id=&quot;115-acceptance-tests-zero-api-keys&quot;&gt;115 acceptance tests, zero API keys&lt;/h2&gt;

&lt;p&gt;The test suite uses mock agent stubs — small bash scripts that simulate
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;claude&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;codex&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gemini&lt;/code&gt; without making any API calls. The full
suite runs in under 10 seconds:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;bash tests/test-run-agent.sh
&lt;span class=&quot;o&quot;&gt;===&lt;/span&gt; run-agent.sh Acceptance Tests &lt;span class=&quot;o&quot;&gt;===&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;---&lt;/span&gt; Test 1: Script structure &lt;span class=&quot;nt&quot;&gt;---&lt;/span&gt;
  PASS: run-agent.sh is executable
  PASS: run-agent.sh has bash shebang
  PASS: run-agent.sh uses &lt;span class=&quot;nb&quot;&gt;set&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-euo&lt;/span&gt; pipefail
...
&lt;span class=&quot;o&quot;&gt;===&lt;/span&gt; Test Results &lt;span class=&quot;o&quot;&gt;===&lt;/span&gt;
  PASSED: 115
  FAILED: 0
RESULT: PASS &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;all 115 tests passed&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The tests cover what matters:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Prompt delivery&lt;/strong&gt;: a mock agent that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cat&lt;/code&gt;s stdin, verifying prompt
content reaches the agent&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;CLI arguments&lt;/strong&gt;: a mock that echoes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;$@&quot;&lt;/code&gt;, verifying
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--permission-mode&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-C&lt;/code&gt;, etc.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;CWD with spaces&lt;/strong&gt;: creates a directory with spaces, runs an agent,
checks &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pwd&lt;/code&gt; output&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Injection rejection&lt;/strong&gt;: passes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$(echo pwned)&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;foo;bar&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;../etc&lt;/code&gt;
as agent names — all exit 2&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Environment exports&lt;/strong&gt;: mock agents print &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$RUN_ID&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$PROMPT&lt;/code&gt;,
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$MESSAGE_BUS&lt;/code&gt; — verified against expected values&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;CLAUDECODE sanitization&lt;/strong&gt;: sets &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CLAUDECODE=should_be_removed&lt;/code&gt;,
verifies agent sees it unset&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;cwd.txt correctness&lt;/strong&gt;: checks not just key presence but actual values
(absolute paths, numeric PID, matching RUN_ID)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Help side effects&lt;/strong&gt;: verifies &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--help&lt;/code&gt; creates no run directories&lt;/li&gt;
&lt;/ul&gt;
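
&lt;p&gt;The injection-rejection case is easiest to picture as an allowlist check on the agent name.
Below is a minimal sketch of one way to do that, assuming names consist of lowercase letters only;
the helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_safe_agent_name&lt;/code&gt; is hypothetical and the real script may validate differently.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Illustrative sketch only: is_safe_agent_name is a hypothetical helper.
# It accepts a name only if it is non-empty and made purely of
# lowercase letters, so shell metacharacters and path separators are
# rejected before the name gets anywhere near a command line.
is_safe_agent_name() {
  case "$1" in
    '' | *[!a-z]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Single quotes keep the hostile inputs literal; nothing is executed.
for candidate in 'claude' '$(echo pwned)' 'foo;bar' '../etc'; do
  if is_safe_agent_name "$candidate"; then
    echo "ok: $candidate"
  else
    echo "rejected: $candidate"
  fi
done
```

&lt;p&gt;The loop accepts &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;claude&lt;/code&gt; and rejects the three hostile names from the list above.
The sketch only returns a status; turning a rejection into exit code 2 is left to the caller.&lt;/p&gt;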

&lt;p&gt;The three-agent review also identified tests that were “vacuously true” —
tests that would pass even if the behavior they claimed to test was broken.
For example, test 15 originally grepped the script source for
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;export MESSAGE_BUS&lt;/code&gt; instead of verifying the agent process actually
received the variable. That is the kind of bug that hides in plain sight
until someone (or some agent) asks the right question.&lt;/p&gt;

&lt;h2 id=&quot;the-exit-code-bug&quot;&gt;The exit code bug&lt;/h2&gt;

&lt;p&gt;This one was subtle. The original script had &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;set -euo pipefail&lt;/code&gt; and then:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;wait&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$AGENT_PID&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;EXIT_CODE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$?&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;rm&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-f&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$PID_FILE&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;EXIT_CODE=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$EXIT_CODE&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$CWD_FILE&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When an agent exited non-zero, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;set -e&lt;/code&gt; caused the script to bail at
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;wait&lt;/code&gt; before cleaning up &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pid.txt&lt;/code&gt; or recording &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXIT_CODE&lt;/code&gt;. Monitoring
scripts that checked &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pid.txt&lt;/code&gt; would think the agent was still running.
I spent more time than I’d like to admit debugging “zombie” agents that
were actually long dead. The fix:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;EXIT_CODE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;0
&lt;span class=&quot;nb&quot;&gt;wait&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$AGENT_PID&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;EXIT_CODE&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$?&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;rm&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-f&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$PID_FILE&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;EXIT_CODE=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$EXIT_CODE&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$CWD_FILE&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;|| EXIT_CODE=$?&lt;/code&gt; captures the non-zero exit code without triggering
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;set -e&lt;/code&gt;. Standard bash idiom, but easy to miss when you write the happy
path first.&lt;/p&gt;

&lt;h2 id=&quot;cicd&quot;&gt;CI/CD&lt;/h2&gt;

&lt;p&gt;Two GitHub Actions workflows keep the project honest:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Tests&lt;/strong&gt; — runs all 115 acceptance tests on every push&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Deploy&lt;/strong&gt; — syncs static files, builds the site, deploys to GitHub Pages&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every push to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; makes the latest &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run-agent.sh&lt;/code&gt; available at
&lt;a href=&quot;https://run-agent.jonnyzzz.com/run-agent.sh&quot;&gt;run-agent.jonnyzzz.com/run-agent.sh&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;whats-next&quot;&gt;What’s next&lt;/h2&gt;

&lt;p&gt;The script is battle-tested for the orchestration use case, but I want to
push it into real, established projects — not just greenfield AI experiments.
That will likely surface new requirements and rough edges.&lt;/p&gt;

&lt;p&gt;Concrete plans:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Integration tests with real agents&lt;/strong&gt; — mock stubs catch regressions,
but they cannot catch API-level breakage. I want a CI job that actually
calls each agent with a trivial prompt and verifies end-to-end behavior.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Streaming output&lt;/strong&gt; — right now stdout/stderr are captured to files.
For long-running agents, tailing the output and progress would help
tell a stuck agent from a thinking one.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;More agents&lt;/strong&gt; — the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;case&lt;/code&gt; statement makes it trivial to add new ones.
If you use a different AI coding tool, open a PR.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There’s still so much more we can do. If you run multi-agent orchestration
or have ideas for what the runner should handle, I would love to hear about
it. Reach out on &lt;a href=&quot;https://www.linkedin.com/in/jonnyzzz/&quot;&gt;LinkedIn&lt;/a&gt; or &lt;a href=&quot;https://x.com/jonnyzzz&quot;&gt;Twitter/X&lt;/a&gt;.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;The release is at &lt;a href=&quot;https://github.com/jonnyzzz/run-agent/releases/tag/v2.0.0&quot;&gt;github.com/jonnyzzz/run-agent/releases/tag/v2.0.0&lt;/a&gt;.
The script is at &lt;a href=&quot;https://run-agent.jonnyzzz.com&quot;&gt;run-agent.jonnyzzz.com&lt;/a&gt;.&lt;/p&gt;</content>

  
  
  
  
  

    <author>
      <name>Eugene Petrenko</name>
    </author>

  
    <category term="ai-agents" />
  
    <category term="multi-agent" />
  
    <category term="orchestration" />
  
    <category term="cli" />
  
    <category term="testing" />
  
    <category term="automation" />
  
    <category term="ai-coding" />
  
    <category term="dev-tools" />
  
    <category term="sub-agent" />
  
    <summary type="html">run-agent.sh went from a 95-line launcher to a hardened 211-line agent runner with 115 acceptance tests, random agent selection, and a proper environment contract. Here is what changed, why, and how we found a command injection vulnerability along the way.</summary>
  
  </entry>
  
  <entry>
    <title type="html">From Voice Memos to Searchable Text with Local Whisper on macOS</title>
    <link href="https://jonnyzzz.com/blog/2026/02/28/voice-memos-local-ai-transcription/" rel="alternate" type="text/html" title="From Voice Memos to Searchable Text with Local Whisper on macOS" />
    <published>2026-02-28T00:00:00+00:00</published>
    <updated>2026-02-28T00:00:00+00:00</updated>
    <id>/blog/2026/02/28/voice-memos-local-ai-transcription</id>
    <content type="html" xml:base="https://jonnyzzz.com/blog/2026/02/28/voice-memos-local-ai-transcription/">&lt;p&gt;I record voice memos constantly – ideas during walks, meeting notes, random thoughts at 2 AM.
The Voice Memos app, on macOS and especially on watchOS, is perfect for capture but terrible for retrieval. You can’t search
recordings. You can’t grep audio. Those ideas just sit there, locked in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.m4a&lt;/code&gt; files, slowly
becoming irrelevant.&lt;/p&gt;

&lt;p&gt;The obvious solution is transcription. Cloud services like Otter.ai or the Whisper API do this
well – but they cost money, require internet, and send your private voice recordings to someone
else’s servers. For notes about work projects, product ideas, and personal reflections, that’s
a non-starter.&lt;/p&gt;

&lt;p&gt;So I built a pipeline that runs &lt;strong&gt;entirely on a MacBook&lt;/strong&gt;. It pulls recordings from Voice Memos
via AppleScript, transcribes them with &lt;a href=&quot;https://github.com/openai/whisper&quot;&gt;Whisper&lt;/a&gt; running locally on Apple Silicon, and
outputs searchable markdown files. No cloud. No API keys. No subscriptions.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Want to skip the explanation and just run it?&lt;/strong&gt; The “Putting It All Together” section has
a complete, copy-paste-and-&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uv run&lt;/code&gt; script. The production version with state management
and CLI options lives in my internal repository.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;why-locaai-transcription-matters&quot;&gt;Why Local AI Transcription Matters&lt;/h2&gt;

&lt;p&gt;Before diving into the code, here’s why this matters beyond privacy:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;No recurring cost.&lt;/strong&gt; Cloud transcription services charge per minute of audio. A 30-minute
daily voice memo habit costs $15–50/month. Local inference costs electricity.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Works offline.&lt;/strong&gt; On planes, in tunnels, in countries with restricted internet – your
transcription pipeline doesn’t care.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;No data leaves your machine.&lt;/strong&gt; Voice memos often contain sensitive content: product ideas,
performance reviews, personal reflections. Local processing keeps it local.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Unlimited volume.&lt;/strong&gt; No API rate limits, no monthly quotas. Transcribe your entire archive
overnight.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The catch? You need an Apple Silicon Mac with enough RAM, and the first-time model download is
~1.6 GB. After that, everything runs locally.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;system-requirements&quot;&gt;System Requirements&lt;/h2&gt;

&lt;p&gt;Here’s what you need to run this pipeline. I’ve tested it on an M1 MacBook Air (16 GB) and an
M4 Max MacBook Pro (128 GB):&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Component&lt;/th&gt;
      &lt;th&gt;Requirement&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Mac&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Apple Silicon (M1, M2, M3, M4 – any variant)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;RAM&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;16 GB minimum; 32 GB+ recommended&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Disk&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;~1.6 GB for the model + ~1 MB per hour of recordings&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;macOS&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Sequoia 15.x (tested); Sonoma 14.x likely works&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Python&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;3.11+&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;uv&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Latest (&lt;a href=&quot;https://docs.astral.sh/uv/&quot;&gt;astral.sh/uv&lt;/a&gt;)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;ffmpeg&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Required by mlx-whisper (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;brew install ffmpeg&lt;/code&gt;)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Accessibility&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Terminal must have Accessibility permission&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The Whisper large-v3-turbo model runs comfortably on an M1 with 16 GB. On an M4 Max, it
transcribes roughly 10x faster than real-time – a 10-minute memo in about 60 seconds. On a
base M1, expect roughly 1x real-time (1 minute of audio takes ~1 minute of processing).&lt;/p&gt;
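
&lt;p&gt;To budget time for a backlog, the speedup figures above translate into a one-line estimate. This is only a rough sketch: the function name is made up for illustration, and real throughput varies with audio content and machine load.&lt;/p&gt;

```python
def estimate_seconds(audio_minutes, speedup):
    """Estimate processing time in seconds for a given real-time speedup factor."""
    # speedup=10 means ten minutes of audio are processed in one minute
    return audio_minutes * 60 / speedup
```

&lt;p&gt;With the numbers quoted above, a 10-minute memo at 10x comes out at 60 seconds, and the same memo at 1x takes 600 seconds.&lt;/p&gt;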

&lt;hr /&gt;

&lt;h2 id=&quot;the-architecture&quot;&gt;The Architecture&lt;/h2&gt;

&lt;p&gt;The pipeline has two independent phases:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;┌─────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│  Voice Memos    │     │  Export Audio    │     │  Transcribe      │
│  (macOS app)    │────▶│  (AppleScript    │────▶│  (mlx-whisper    │
│                 │     │   + clipboard)   │     │   or Ollama)     │
└─────────────────┘     └──────────────────┘     └──────────────────┘
                              │                         │
                              ▼                         ▼
                         .m4a files                .md transcripts
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Phase 1: Fetch&lt;/strong&gt; – AppleScript reads the Voice Memos sidebar, selects each recording, copies
it via Cmd+C, and saves the M4A data from the clipboard to disk.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 2: Transcribe&lt;/strong&gt; – A local Whisper model converts each audio file to text. Two backends
are supported: &lt;a href=&quot;https://pypi.org/project/mlx-whisper/&quot;&gt;mlx-whisper&lt;/a&gt; (Apple Silicon native) and &lt;a href=&quot;https://ollama.com&quot;&gt;Ollama&lt;/a&gt; (more
flexible, supports NVIDIA GPUs too).&lt;/p&gt;

&lt;p&gt;Both phases are fully incremental – they skip recordings that have already been processed.&lt;/p&gt;
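
&lt;p&gt;A minimal sketch of what such an incremental check could look like, assuming each recording maps to exactly one output file named after its title. The helper names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;safe_stem&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pending&lt;/code&gt; and the file-naming scheme are hypothetical, not taken from the actual pipeline:&lt;/p&gt;

```python
import re
from pathlib import Path


def safe_stem(name):
    """Turn a recording title into a filesystem-friendly stem (assumed scheme)."""
    # Collapse every run of characters outside [A-Za-z0-9_.-] into one underscore
    return re.sub(r"[^\w.-]+", "_", name).strip("_") or "untitled"


def pending(names, out_dir, suffix):
    """Return the recordings that have no output file in out_dir yet."""
    out_dir = Path(out_dir)
    return [n for n in names if not (out_dir / (safe_stem(n) + suffix)).exists()]
```

&lt;p&gt;Under these assumptions, the fetch phase would call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pending&lt;/code&gt; with the suffix &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.m4a&lt;/code&gt; and the transcribe phase with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.md&lt;/code&gt;, so an interrupted run can simply be started again.&lt;/p&gt;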

&lt;hr /&gt;

&lt;h2 id=&quot;the-problem-with-voice-memos-and-what-i-tried-first&quot;&gt;The Problem with Voice Memos (and What I Tried First)&lt;/h2&gt;

&lt;p&gt;Here’s the first challenge: &lt;strong&gt;macOS Voice Memos has no API&lt;/strong&gt;. There’s no AppleScript dictionary,
no command-line tool, no SQLite database you can query. Apple doesn’t expose the recordings
through any documented interface.&lt;/p&gt;

&lt;p&gt;My first attempt was to access the files directly. The recordings live under
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/Library/Group Containers/group.com.apple.VoiceMemos.shared/&lt;/code&gt;, in a CoreData/CloudKit-backed
structure. I wrote a script that found &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.m4a&lt;/code&gt; files in there and copied them out. It worked –
until iCloud sync kicked in.&lt;/p&gt;

&lt;p&gt;My second attempt used &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NSSharingService&lt;/code&gt; through PyObjC to trigger the system share sheet. The
share sheet requires user interaction for each recording. For 20+ memos, that’s 20+ clicks.&lt;/p&gt;

&lt;p&gt;The approach that actually works is the hackiest one: &lt;strong&gt;UI automation via the Accessibility
API&lt;/strong&gt;. It’s ugly, it breaks when Apple changes the UI, and it requires your terminal to have
Accessibility permission. But it reliably exports audio without corrupting sync state.&lt;/p&gt;

&lt;h3 id=&quot;before-you-start&quot;&gt;Before You Start&lt;/h3&gt;

&lt;p&gt;A quick checklist before running the scripts:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Grant Accessibility permission.&lt;/strong&gt; System Settings &amp;gt; Privacy &amp;amp; Security &amp;gt; Accessibility –
add your terminal app (Terminal.app, iTerm2, Warp, or whichever you use).&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Open Voice Memos&lt;/strong&gt; and select “All Recordings” in the sidebar.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Install prerequisites:&lt;/strong&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;brew install ffmpeg&lt;/code&gt; and
&lt;a href=&quot;https://docs.astral.sh/uv/&quot;&gt;install uv&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Verify Accessibility works&lt;/strong&gt; with a test command:&lt;/li&gt;
&lt;/ol&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;osascript &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;tell application &quot;System Events&quot; to get name of first process&apos;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If this returns a process name, you’re good. If it errors, check your Accessibility settings.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;step-1-listing-voice-memos&quot;&gt;Step 1: Listing Voice Memos&lt;/h2&gt;

&lt;p&gt;The first script reads the Voice Memos sidebar to discover all recordings. Voice Memos on macOS
Sequoia nests its sidebar buttons &lt;strong&gt;13 groups deep&lt;/strong&gt; inside the window hierarchy. An AI agent
found this number through trial and error with Accessibility Inspector.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;subprocess&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run_applescript&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;script&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Run an AppleScript and return stdout.&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;proc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;subprocess&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;osascript&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;script&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;capture_output&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timeout&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;proc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;returncode&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;RuntimeError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;osascript failed: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stderr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;proc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stdout&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;list_voice_memos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]:&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Voice Memos nests its sidebar 13 groups deep
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;window 1&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;13&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;group 1 of &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;script&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&apos;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
tell application &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;VoiceMemos&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; to activate
delay 0.5
tell application &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;System Events&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
    tell process &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;VoiceMemos&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
        set theGroup to &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
        -- scroll to top so button 1 is the newest
        repeat 20 times
            perform action &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;AXScrollUpByPage&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; of theGroup
        end repeat
        delay 0.3

        set btnCount to count of buttons of theGroup
        set results to {{}}
        repeat with i from 1 to btnCount
            set btnRef to button i of theGroup
            set btnName to &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
            set btnDate to &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
            try
                set btnName to value of text field 1 of group 1 of btnRef
            end try
            try
                set btnDesc to description of btnRef
                if btnDesc contains &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;, &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; then
                    set AppleScript&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s text item delimiters to &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;, &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
                    set descParts to text items of btnDesc
                    set AppleScript&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s text item delimiters to &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
                    if (count of descParts) &amp;gt; 1 then
                        set btnDate to item 2 of descParts
                    end if
                end if
            end try
            if btnName is not &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; then
                copy (i as text) &amp;amp; tab &amp;amp; btnName &amp;amp; tab ¬
                    &amp;amp; btnDate to end of results
            end if
        end repeat
        set AppleScript&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s text item delimiters to linefeed
        return results as text
    end tell
end tell
&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&apos;&apos;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;items&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run_applescript&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;script&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;splitlines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;parts&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\t&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;items&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;
                &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;position&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt;
                &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt;
                &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;date&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;items&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A few things to note:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Scrolling to top first&lt;/strong&gt; is essential. Without it, the button indices don’t correspond to
the visible items, and you get stale data from offscreen elements.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;The recording date&lt;/strong&gt; comes from the button’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;description&lt;/code&gt; attribute, not a separate text
field. It’s embedded after the name, separated by a comma.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;The 13-group depth&lt;/strong&gt; is specific to macOS Sequoia (15.x). Earlier macOS versions may have
different nesting depths. Use Accessibility Inspector (an Xcode developer tool: Xcode &amp;gt; Open Developer Tool) to check.&lt;/li&gt;
&lt;/ul&gt;
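
&lt;p&gt;The date extraction is easier to see in isolation. Here is a small Python sketch that mirrors what the AppleScript does with text item delimiters: split the description on a comma plus space and take the second part. The function name and the sample strings in the usage note are made up for illustration, and the real description format may differ between macOS versions.&lt;/p&gt;

```python
def date_from_description(description):
    """Return the second comma-separated part of a description, or an empty string."""
    parts = description.split(", ")
    # parts[1:] is empty when the description contains no ", " separator
    return parts[1] if parts[1:] else ""
```

&lt;p&gt;For a description like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Blog idea, 28 Feb 2026&lt;/code&gt; this returns the date part, and it returns an empty string when there is no comma. One caveat shared with the AppleScript version: a recording name that itself contains a comma followed by a space shifts the parts, so the result would be a piece of the name rather than the date.&lt;/p&gt;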

&lt;hr /&gt;

&lt;h2 id=&quot;step-2-exporting-audio-via-clipboard&quot;&gt;Step 2: Exporting Audio via Clipboard&lt;/h2&gt;

&lt;p&gt;The next challenge: getting the audio data out. There’s no &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;osascript&lt;/code&gt; command to save a Voice
Memo to a file. But Cmd+C in Voice Memos copies the audio data to the clipboard – including
the raw M4A bytes.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;select_and_copy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ui_delay&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;float&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Find a recording by name, select it, and Cmd+C.&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;window 1&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;13&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;group 1 of &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;escaped&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\\\&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&quot;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;script&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&apos;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
tell application &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;VoiceMemos&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; to activate
delay &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ui_delay&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
tell application &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;System Events&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
    tell process &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;VoiceMemos&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
        set theGroup to &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
        set btnCount to count of buttons of theGroup
        repeat with i from 1 to btnCount
            set btnRef to button i of theGroup
            set btnName to &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
            try
                set btnName to value of text field 1 of group 1 of btnRef
            end try
            if btnName is &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escaped&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; then
                perform action &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;AXPress&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; of btnRef
                delay &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ui_delay&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
                keystroke &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; using command down
                delay &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ui_delay&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
                return &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
            end if
        end repeat
    end tell
end tell
return &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;not-found&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&apos;&apos;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run_applescript&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;script&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;RuntimeError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Recording not found: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then save the clipboard audio data directly to an M4A file:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pathlib&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;save_clipboard_audio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;target_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Save M4A audio data from the clipboard to a file. Returns the file size in bytes.&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;target_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mkdir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parents&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exist_ok&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
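    &lt;span class=&quot;n&quot;&gt;escaped&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;target_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\\\&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&quot;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;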

    &lt;span class=&quot;n&quot;&gt;script&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&apos;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
set theData to the clipboard as «class M4A »
set thePath to POSIX file &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escaped&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
set theFile to open for access thePath with write permission
set eof of theFile to 0
write theData to theFile
close access theFile
return &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&apos;&apos;&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;run_applescript&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;script&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;target_path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;stat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;st_size&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The magic here is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;the clipboard as «class M4A »&lt;/code&gt;. AppleScript’s clipboard can hold typed data,
and Voice Memos puts the audio in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;M4A &lt;/code&gt; class. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;«class»&lt;/code&gt; syntax is AppleScript’s way
of referencing four-character type codes. This avoids the Finder entirely – no drag-and-drop
simulation, no save dialogs, just direct clipboard-to-file transfer.&lt;/p&gt;
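
&lt;p&gt;A quick sanity check on the saved file can catch a clipboard that held something other than audio. The sketch below is not part of the original script. It assumes the clipboard payload is an ordinary MPEG-4 container, which normally starts with an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ftyp&lt;/code&gt; box, and I have not confirmed that Voice Memos always produces one. The helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;looks_like_mpeg4&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
import tempfile
from pathlib import Path


def looks_like_mpeg4(path: Path) -> bool:
    """Return True if the file starts with an ISO base media `ftyp` box."""
    with path.open("rb") as f:
        header = f.read(8)
    # Bytes 0-3 hold the box size, bytes 4-7 hold the box type.
    return header[4:8] == b"ftyp"


# Demo with a synthetic header (assumed layout, not real Voice Memos output).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.m4a"
    sample.write_bytes(b"\x00\x00\x00\x20" + b"ftyp" + b"M4A " + b"\x00" * 20)
    print(looks_like_mpeg4(sample))
```

&lt;p&gt;Calling it right after &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;save_clipboard_audio&lt;/code&gt; would let the pipeline fail early instead of handing a broken file to the transcriber.&lt;/p&gt;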

&lt;hr /&gt;

&lt;h2 id=&quot;step-3-transcription-with-mlx-whisper&quot;&gt;Step 3: Transcription with mlx-whisper&lt;/h2&gt;

&lt;p&gt;Now for the AI part. &lt;a href=&quot;https://pypi.org/project/mlx-whisper/&quot;&gt;mlx-whisper&lt;/a&gt; is a port of OpenAI’s Whisper to Apple’s
&lt;a href=&quot;https://ml-explore.github.io/mlx/&quot;&gt;MLX framework&lt;/a&gt;, which runs inference on the Apple Silicon GPU through Metal (MLX does not target the Neural Engine).
It’s significantly faster than running Whisper through PyTorch on the same hardware.&lt;/p&gt;

&lt;p&gt;The script uses &lt;a href=&quot;https://peps.python.org/pep-0723/&quot;&gt;PEP 723 inline metadata&lt;/a&gt;, so &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uv run&lt;/code&gt; handles dependencies
automatically – no virtual environment setup needed:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;#!/usr/bin/env python3
# /// script
# requires-python = &quot;&amp;gt;=3.11&quot;
# dependencies = [
#     &quot;mlx-whisper&amp;gt;=0.4&quot;,
# ]
# ///
&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sys&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pathlib&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# HuggingFace repo ID (auto-downloaded on first run, ~1.6 GB)
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MODEL&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;mlx-community/whisper-large-v3-turbo&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;transcribe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;audio_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;language&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Transcribe audio using mlx-whisper on Apple Silicon.&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mlx_whisper&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mlx_whisper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;transcribe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;nf&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;audio_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;path_or_hf_repo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MODEL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;language&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;language&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;__main__&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;audio&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;argv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;transcribe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;audio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Save this as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;transcribe.py&lt;/code&gt; and run:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;uv run transcribe.py recording.m4a
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;On first run, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uv&lt;/code&gt; creates an isolated environment, installs &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mlx-whisper&lt;/code&gt; and its dependencies,
and downloads the model from HuggingFace. Subsequent runs reuse the cached environment and
model. The &lt;a href=&quot;https://peps.python.org/pep-0723/&quot;&gt;PEP 723&lt;/a&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;# /// script&lt;/code&gt; block tells &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uv&lt;/code&gt; exactly which dependencies are
needed – no &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;requirements.txt&lt;/code&gt;, no &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pyproject.toml&lt;/code&gt;, just the script itself.&lt;/p&gt;

&lt;h3 id=&quot;why-mlx-whisper&quot;&gt;Why mlx-whisper?&lt;/h3&gt;

&lt;p&gt;Three reasons:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Speed.&lt;/strong&gt; On an M4 Max, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;whisper-large-v3-turbo&lt;/code&gt; transcribes ~10x faster than real-time.
A 10-minute memo takes about 60 seconds. On an M1, it’s roughly 1x real-time, so the same memo takes about 10 minutes.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Memory efficiency.&lt;/strong&gt; MLX uses unified memory, so the model loads directly into GPU-accessible
RAM. No copying between CPU and GPU memory.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Zero configuration.&lt;/strong&gt; No server to run. No Docker container. No GPU drivers. Import the
library, call one function, get text.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;the-model-whisper-large-v3-turbo&quot;&gt;The Model: whisper-large-v3-turbo&lt;/h3&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;whisper-large-v3-turbo&lt;/code&gt; model is a pruned and fine-tuned version of Whisper large-v3 (the decoder is cut from 32 layers to 4) that’s ~4x faster
with minimal quality loss. It supports 99 languages out of the box. The MLX-optimized version
is hosted at &lt;a href=&quot;https://huggingface.co/mlx-community/whisper-large-v3-turbo&quot;&gt;mlx-community/whisper-large-v3-turbo&lt;/a&gt; on HuggingFace.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Property&lt;/th&gt;
      &lt;th&gt;Value&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Model size&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;~1.6 GB (FP16)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Parameters&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;809M&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Languages&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;99&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Architecture&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Encoder-decoder transformer (pruned 4-layer decoder)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Speed (M1)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;~1x real-time&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Speed (M4 Max)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;~10x real-time&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
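
&lt;p&gt;The numbers in the table hang together, which a few lines of arithmetic can confirm. This is only a back-of-the-envelope sketch: it assumes two bytes per FP16 parameter and ignores any non-weight files in the download. The function names are hypothetical.&lt;/p&gt;

```python
def fp16_size_gb(parameters: int) -> float:
    """Approximate weight size in GB: two bytes per FP16 parameter."""
    return parameters * 2 / 1e9


def transcription_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Estimated wall-clock time for a given speed relative to real-time."""
    return audio_seconds / realtime_factor


print(f"{fp16_size_gb(809_000_000):.1f} GB")      # 809M parameters in FP16
print(f"{transcription_seconds(600, 10):.0f} s")  # 10-minute memo at ~10x
print(f"{transcription_seconds(600, 1):.0f} s")   # 10-minute memo at ~1x
```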

&lt;hr /&gt;

&lt;h2 id=&quot;alternative-transcription-with-ollama&quot;&gt;Alternative: Transcription with Ollama&lt;/h2&gt;

&lt;p&gt;If you prefer a server-based approach – or want to use an NVIDIA GPU – &lt;a href=&quot;https://ollama.com&quot;&gt;Ollama&lt;/a&gt; also
supports Whisper models. This is useful if you already run Ollama for LLM inference and want a
single tool for everything.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# /// script
# requires-python = &quot;&amp;gt;=3.11&quot;
# dependencies = [&quot;requests&amp;gt;=2.32&quot;]
# ///
&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mimetypes&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pathlib&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requests&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;OLLAMA_URL&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;http://127.0.0.1:11434&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# Ollama uses short model names (no HuggingFace org prefix)
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OLLAMA_MODEL&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;whisper-large-v3-turbo&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;transcribe_with_ollama&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;audio_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OLLAMA_MODEL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;language&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;timeout&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;300&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Transcribe audio via local Ollama Whisper API.&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mime&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mimetypes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;guess_type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;audio_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; \
        &lt;span class=&quot;ow&quot;&gt;or&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;application/octet-stream&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;endpoint&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;/v1/audio/transcriptions&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;/api/transcribe&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;language&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;language&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;language&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;audio_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;rb&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;resp&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requests&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;post&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
                    &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OLLAMA_URL&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;endpoint&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                    &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                    &lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;audio_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mime&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)},&lt;/span&gt;
                    &lt;span class=&quot;n&quot;&gt;timeout&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;timeout&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;resp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;resp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;requests&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RequestException&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;RuntimeError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Transcription failed for &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;audio_file&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To set up Ollama for Whisper:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# Install Ollama (https://ollama.com)&lt;/span&gt;
ollama pull whisper-large-v3-turbo
ollama serve  &lt;span class=&quot;c&quot;&gt;# Starts on port 11434&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The Ollama approach tries two API endpoints in order: the OpenAI-compatible &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/v1/audio/transcriptions&lt;/code&gt;
and then Ollama’s native &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/api/transcribe&lt;/code&gt;. If the first fails or returns empty text, the script
falls back to the second. Note that Whisper support in Ollama is relatively new and endpoint
availability may vary between versions – I tested with Ollama 0.6.x.&lt;/p&gt;

&lt;p&gt;The calling code adds a language fallback chain on top: try auto-detect first, then fall back
to specific languages:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;LANGUAGES&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;ru&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;en&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# auto -&amp;gt; Russian -&amp;gt; English
&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lang&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;LANGUAGES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;transcribe_with_ollama&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;audio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;language&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lang&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;RuntimeError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is particularly useful for multilingual recordings. I record memos in both English and
Russian, and auto-detect works well in about 90% of cases. The fallback chain catches
the rest.&lt;/p&gt;
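
&lt;p&gt;The loop above only moves on when a call raises. A self-contained sketch of the same chain
that also treats an empty transcript as a miss could look like this. The
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;transcribe_with_fallback&lt;/code&gt; helper and the stub backend are
hypothetical names used for illustration; they are not part of the pipeline:&lt;/p&gt;

```python
from typing import Callable, Optional

LANGUAGES = [None, "ru", "en"]  # auto, then Russian, then English


def transcribe_with_fallback(
    audio: str,
    transcribe: Callable[[str, Optional[str]], str],
    languages=LANGUAGES,
):
    """Return (text, language) from the first attempt that yields text.

    `transcribe` is any callable taking (audio, language); it stands in
    for a real backend and is an assumption of this sketch.
    """
    last_error = None
    for lang in languages:
        try:
            text = transcribe(audio, lang).strip()
        except RuntimeError as err:
            last_error = err  # remember the failure, try the next language
            continue
        if text:  # an empty transcript counts as a miss
            return text, lang
    if last_error is not None:
        raise RuntimeError("all language attempts failed") from last_error
    return "", None


def fake_backend(audio, language):
    """Stub backend: auto-detect fails, Russian is empty, English works."""
    if language is None:
        raise RuntimeError("language detection failed")
    return "" if language == "ru" else "hello from the memo"


print(transcribe_with_fallback("memo.m4a", fake_backend))
```

&lt;p&gt;Returning the language next to the text makes it easy to record which attempt produced the
transcript.&lt;/p&gt;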

&lt;h3 id=&quot;mlx-whisper-vs-ollama-when-to-use-which&quot;&gt;mlx-whisper vs Ollama: When to Use Which&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Factor&lt;/th&gt;
      &lt;th&gt;mlx-whisper&lt;/th&gt;
      &lt;th&gt;Ollama&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Setup&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uv run&lt;/code&gt; – zero config&lt;/td&gt;
      &lt;td&gt;Install Ollama + pull model&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Platform&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Apple Silicon only&lt;/td&gt;
      &lt;td&gt;macOS, Linux, Windows&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;GPU&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Apple GPU (Metal, via MLX)&lt;/td&gt;
      &lt;td&gt;Apple GPU, NVIDIA CUDA&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Server required&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;No&lt;/td&gt;
      &lt;td&gt;Yes (ollama serve)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Speed (M4 Max)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;~10x real-time&lt;/td&gt;
      &lt;td&gt;~6-8x real-time&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Best for&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;MacBook-only pipelines&lt;/td&gt;
      &lt;td&gt;Multi-platform, existing Ollama&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;My recommendation: use mlx-whisper for a pure-Mac setup. Use Ollama if you already run it for
LLM inference or need NVIDIA GPU support.&lt;/p&gt;
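
&lt;p&gt;That rule of thumb can be written down as a small helper. This is a rough sketch, not part
of the pipeline: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pick_backend&lt;/code&gt; is a hypothetical name, and
checking for an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ollama&lt;/code&gt; binary on the PATH is only an
approximation of &quot;already running Ollama&quot;:&lt;/p&gt;

```python
import platform
import shutil
import sys


def pick_backend(system=None, machine=None, has_ollama=None):
    """Return "mlx-whisper" or "ollama" following the recommendation above.

    Arguments default to the current host; passing them explicitly makes
    the rule easy to test. The rule itself is a simplification.
    """
    system = sys.platform if system is None else system
    # Note: a Python running under Rosetta reports "x86_64" here.
    machine = platform.machine() if machine is None else machine
    if has_ollama is None:
        # Assumption: an `ollama` binary on PATH means Ollama is in use.
        has_ollama = shutil.which("ollama") is not None

    apple_silicon = system == "darwin" and machine == "arm64"
    if apple_silicon and not has_ollama:
        return "mlx-whisper"  # pure-Mac setup, no server needed
    return "ollama"  # existing Ollama install, or a non-Apple platform


print(pick_backend())
```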

&lt;hr /&gt;

&lt;h2 id=&quot;putting-it-all-together&quot;&gt;Putting It All Together&lt;/h2&gt;

&lt;p&gt;Here’s the complete pipeline as a single &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uv run&lt;/code&gt;-able script. It lists memos, exports audio,
and transcribes – all in one pass:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;#!/usr/bin/env python3
&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Voice memo transcription pipeline. Run: uv run pipeline.py&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# /// script
# requires-python = &quot;&amp;gt;=3.11&quot;
# dependencies = [&quot;mlx-whisper&amp;gt;=0.4&quot;]
# ///
&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;subprocess&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pathlib&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;OUTPUT_DIR&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;voice-memos&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;MODEL&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;mlx-community/whisper-large-v3-turbo&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run_applescript&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;script&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;proc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;subprocess&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;osascript&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;script&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;capture_output&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;timeout&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;proc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;returncode&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;RuntimeError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;proc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stderr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;proc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stdout&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;list_memos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;window 1&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;13&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;group 1 of &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;script&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&apos;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
tell application &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;VoiceMemos&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; to activate
delay 0.5
tell application &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;System Events&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
    tell process &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;VoiceMemos&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
        set theGroup to &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
        repeat 20 times
            perform action &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;AXScrollUpByPage&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; of theGroup
        end repeat
        delay 0.3
        set btnCount to count of buttons of theGroup
        set results to {{}}
        repeat with i from 1 to btnCount
            set btnRef to button i of theGroup
            set btnName to &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
            try
                set btnName to value of text field 1 of ¬
                    group 1 of btnRef
            end try
            if btnName is not &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; then
                copy (i as text) &amp;amp; tab &amp;amp; btnName to end of results
            end if
        end repeat
        set AppleScript&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;s text item delimiters to linefeed
        return results as text
    end tell
end tell
&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&apos;&apos;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;position&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;run_applescript&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;script&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;splitlines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;nf&quot;&gt;if &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\t&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;export_memo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;window 1&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;13&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;group 1 of &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;escaped_name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&quot;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;nf&quot;&gt;run_applescript&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&apos;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
tell application &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;VoiceMemos&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; to activate
delay 0.5
tell application &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;System Events&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
    tell process &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;VoiceMemos&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
        set theGroup to &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;group_ref&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
        set btnCount to count of buttons of theGroup
        repeat with i from 1 to btnCount
            set btnRef to button i of theGroup
            try
                if value of text field 1 of group 1 of btnRef ¬
                    is &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escaped_name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; then
                    perform action &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;AXPress&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; of btnRef
                    delay 1
                    keystroke &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; using command down
                    delay 0.5
                    exit repeat
                end if
            end try
        end repeat
    end tell
end tell
&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&apos;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;output&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OUTPUT_DIR&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;.m4a&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mkdir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parents&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exist_ok&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Use an absolute path: AppleScript&apos;s POSIX file is unreliable with relative ones
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;escaped_path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;resolve&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&quot;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\\&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;run_applescript&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&apos;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
set theData to the clipboard as &lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\u00AB&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;class M4A &lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\u00BB&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
set f to open for access (POSIX file &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;escaped_path&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;) ¬
    with write permission
write theData to f
close access f
&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&apos;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;transcribe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;audio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mlx_whisper&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mlx_whisper&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;transcribe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;nf&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;audio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path_or_hf_repo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MODEL&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;OUTPUT_DIR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mkdir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exist_ok&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;memos&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;list_memos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Found &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;memos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; voice memo(s)&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;memo&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;memos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;memo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;audio&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;OUTPUT_DIR&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;.m4a&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;transcript&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;audio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;with_suffix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;.md&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;# Skip if already transcribed
&lt;/span&gt;        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;transcript&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
            &lt;span class=&quot;nf&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;  Skip (exists): &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;

        &lt;span class=&quot;nf&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;  Export: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;nf&quot;&gt;export_memo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;nf&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;  Transcribe: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;transcribe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;audio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# Simple output; the production version adds YAML frontmatter
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;transcript&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;write_text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;# &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n\n&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;nf&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;  Done: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; chars&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;__main__&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Run it:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;uv run pipeline.py
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The first run downloads the Whisper model (~1.6 GB). After that, the script processes your entire
voice memo library incrementally, skipping recordings that already have transcripts.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;making-it-incremental&quot;&gt;Making It Incremental&lt;/h2&gt;

&lt;p&gt;The production version of this pipeline adds proper state management. A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;state.json&lt;/code&gt; file tracks
which recordings have been fetched and transcribed, when, and with which model:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;version&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;target_map&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;voice-memo-2026-02-13-recording-26&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;voice_memo_name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Recording 26&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;voice_memo_recording_date_iso&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;2026-02-13&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;target_audio&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;voice-memo-2026-02-13.../Recording 26.m4a&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;status&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;success&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;records&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;kind&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;transcribe&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;audio&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;voice-memo-2026.../Recording 26.m4a&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;model&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;whisper-large-v3-turbo&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;language&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;en&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;status&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;success&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
      &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;transcript_chars&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8889&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This enables:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Skip-if-done&lt;/strong&gt; – don’t re-transcribe recordings that haven’t changed.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Date cutoffs&lt;/strong&gt; – only process recordings from the last N days or after a specific date.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Audit trail&lt;/strong&gt; – see exactly when each recording was processed and with which model.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Model upgrades&lt;/strong&gt; – when a better Whisper version comes out, force re-transcription with
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--force&lt;/code&gt; and compare results.&lt;/li&gt;
&lt;/ul&gt;
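
&lt;p&gt;As a rough illustration of the skip-if-done check (not the production code), the sketch below reads a state file shaped like the example above and decides whether a recording still needs transcription. The helper names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;load_state&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;needs_transcription&lt;/code&gt; are hypothetical, and matching a record to a target by its audio path is an assumption about how the two sections of the file relate.&lt;/p&gt;

```python
import json
from pathlib import Path


def load_state(path):
    """Return the saved state, or a fresh one if the file does not exist yet."""
    if path.exists():
        return json.loads(path.read_text())
    # Field names follow the state.json example; the layout is otherwise assumed.
    return {"version": 4, "target_map": {}, "records": []}


def needs_transcription(state, key, model, force=False):
    """Skip-if-done: True unless a successful record exists for this key and model."""
    if force:
        return True
    entry = state["target_map"].get(key)
    if entry is None or entry.get("status") != "success":
        return True
    # Assumption: a record belongs to a target when the audio paths are equal.
    audio = entry.get("target_audio")
    return not any(
        record.get("kind") == "transcribe"
        and record.get("audio") == audio
        and record.get("model") == model
        and record.get("status") == "success"
        for record in state["records"]
    )
```

&lt;p&gt;Because the model name is part of the check, switching to a newer model would re-transcribe everything without any extra flag. Whether that is desirable is a design choice; the production script may handle it differently.&lt;/p&gt;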

&lt;hr /&gt;

&lt;h2 id=&quot;the-output-searchable-markdown&quot;&gt;The Output: Searchable Markdown&lt;/h2&gt;

&lt;p&gt;Each transcript is saved as a markdown file with YAML frontmatter, compatible with
&lt;a href=&quot;https://obsidian.md&quot;&gt;Obsidian&lt;/a&gt;, Logseq, or any markdown-based knowledge system:&lt;/p&gt;

&lt;div class=&quot;language-markdown highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;created&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;2026-02-13&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;source&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;[[Recording&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;26.m4a]]&quot;&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;model&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;mlx-community/whisper-large-v3-turbo&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;tags&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;voice-memo&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;transcript&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;

&lt;span class=&quot;gh&quot;&gt;# Recording 26&lt;/span&gt;

The transcript text appears here. Whisper adds punctuation
and capitalization automatically, which makes the output
surprisingly readable for raw speech-to-text.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now you can &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;grep&lt;/code&gt; your voice memos. Search for that product idea from three weeks ago. Find
the meeting where someone mentioned that deadline. Your voice recordings become part of your
searchable knowledge base.&lt;/p&gt;
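
&lt;p&gt;Producing that file takes only a few lines. The function below is a minimal sketch, assuming the frontmatter fields shown above; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;render_transcript&lt;/code&gt; is a hypothetical name, not a function from the production script.&lt;/p&gt;

```python
def render_transcript(name, created, audio_file, model, text):
    """Build the Markdown document: YAML frontmatter, a title, then the transcript."""
    lines = [
        "---",
        f"created: {created}",
        # Obsidian-style wiki link back to the audio file, quoted for YAML.
        f'source: "[[{audio_file}]]"',
        f"model: {model}",
        "tags:",
        "  - voice-memo",
        "  - transcript",
        "---",
        "",
        f"# {name}",
        "",
        text.strip(),
        "",
    ]
    return "\n".join(lines)
```

&lt;p&gt;The sketch does no YAML escaping, so a file name containing a double quote would need extra handling.&lt;/p&gt;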

&lt;hr /&gt;

&lt;h2 id=&quot;practical-tips&quot;&gt;Practical Tips&lt;/h2&gt;

&lt;p&gt;After running this pipeline on 20+ recordings, I learned a few things:&lt;/p&gt;

&lt;h3 id=&quot;ui-automation-is-fragile&quot;&gt;UI Automation Is Fragile&lt;/h3&gt;

&lt;p&gt;AppleScript UI automation breaks when macOS updates change the accessibility tree. The 13-group
depth for Voice Memos is specific to macOS Sequoia. Keep Accessibility Inspector handy for
debugging – it shows the exact hierarchy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tip:&lt;/strong&gt; Add configurable delays between UI actions. On slower machines, or when the
system is busy, the default 0.5-second delay may not be enough. The production script uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--ui-delay&lt;/code&gt; to
tune this.&lt;/p&gt;

&lt;h3 id=&quot;when-things-go-wrong&quot;&gt;When Things Go Wrong&lt;/h3&gt;

&lt;p&gt;A few failure modes I’ve encountered and how the production script handles them:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Clipboard contains text, not audio.&lt;/strong&gt; If the memo selection fails, Cmd+C copies the memo
name as text instead of the audio. Before saving, the script compares the clipboard with the expected
memo name. A match means the audio wasn’t copied, so the script retries.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Accessibility permission revoked.&lt;/strong&gt; macOS occasionally resets Accessibility permissions
after system updates. When that happens, the script fails with a cryptic AppleScript error, so check
the Accessibility permissions in System Settings first.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Duplicate memo names.&lt;/strong&gt; Voice Memos allows multiple recordings with the same name. The
production script includes the recording date in the folder name
(&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;voice-memo-2026-02-13-recording-26/&lt;/code&gt;) to avoid collisions.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Model download fails.&lt;/strong&gt; The first &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mlx-whisper&lt;/code&gt; run downloads ~1.6 GB from HuggingFace.
If the download is interrupted, delete the HuggingFace cache
(&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/.cache/huggingface/hub/models--mlx-community--whisper-large-v3-turbo/&lt;/code&gt;) and try again.&lt;/li&gt;
&lt;/ul&gt;
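
&lt;p&gt;Two of these checks are easy to express as small pure functions. This is a sketch under assumptions: the function names are hypothetical, and the slug rule is only inferred from the folder name shown in the list, not taken from the production script.&lt;/p&gt;

```python
import re


def folder_name(memo_name, recording_date_iso):
    """Build a folder name such as voice-memo-2026-02-13-recording-26."""
    # Collapse every run of non-word characters (and underscores) into one dash.
    slug = re.sub(r"[\W_]+", "-", memo_name.lower()).strip("-")
    return f"voice-memo-{recording_date_iso}-{slug}"


def clipboard_has_only_name(clipboard_text, memo_name):
    """True when the clipboard holds the memo name as text, meaning no audio was copied."""
    return clipboard_text is not None and clipboard_text.strip() == memo_name.strip()
```

&lt;p&gt;Two recordings with the same name on the same date would still collide under this naming scheme, so a real script needs one more tie-breaker, for example a counter suffix.&lt;/p&gt;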

&lt;h3 id=&quot;multi-language-support&quot;&gt;Multi-Language Support&lt;/h3&gt;

&lt;p&gt;Whisper handles language detection automatically, but it’s not perfect. For recordings that mix
languages (common if you’re multilingual), a fallback chain helps:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;LANGUAGES&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;ru&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;en&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# auto -&amp;gt; Russian -&amp;gt; English
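&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# stays None if every attempt fails
&lt;/span&gt;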
&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lang&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;LANGUAGES&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;transcribe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;audio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;language&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lang&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;except&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;RuntimeError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Try auto-detect first. If it raises an error or returns no text, force a specific language. In my
experience, this catches about 95% of cases.&lt;/p&gt;
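
&lt;p&gt;The loop above moves on only after an exception or an empty result. To also reject obviously bad output, one option is a cheap plausibility check before accepting a transcript. The version below is a sketch: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;looks_plausible&lt;/code&gt;, its thresholds, and the injected &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;transcribe&lt;/code&gt; callable are all assumptions for illustration, not part of mlx-whisper or the production script.&lt;/p&gt;

```python
from collections import Counter


def looks_plausible(text, min_chars=20, max_repeat_ratio=0.5):
    """Heuristic: enough text, and no single word dominating the output."""
    words = text.lower().split()
    if not words or min_chars > len(text.strip()):
        return False
    top_count = Counter(words).most_common(1)[0][1]
    return max_repeat_ratio >= top_count / len(words)


def transcribe_with_fallback(audio, transcribe, languages=(None, "ru", "en")):
    """Try each language in order; return (text, language) or (None, None)."""
    for lang in languages:
        try:
            text = transcribe(audio, language=lang)
        except RuntimeError:
            continue  # this attempt failed outright, try the next language
        if text and looks_plausible(text):
            return text, lang
    return None, None
```

&lt;p&gt;The thresholds are guesses. Very short memos would be rejected by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;min_chars&lt;/code&gt;, so tune them against your own recordings.&lt;/p&gt;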

&lt;h3 id=&quot;run-it-overnight&quot;&gt;Run It Overnight&lt;/h3&gt;

&lt;p&gt;The first run through a large library takes time. Whisper large-v3-turbo on an M1 processes
audio at roughly 1x real-time. If you have 10 hours of recordings, that’s 10 hours of
processing. Queue it up before bed:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# The production script supports --since for date filtering:&lt;/span&gt;
uv run scripts/fetch_voice_memos.py &lt;span class=&quot;nt&quot;&gt;--since&lt;/span&gt; 2025-01-01 2&amp;gt;&amp;amp;1 | &lt;span class=&quot;nb&quot;&gt;tee &lt;/span&gt;transcription.log
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Subsequent runs only process new recordings and finish in seconds.&lt;/p&gt;
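
&lt;p&gt;For reference, flags like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--since&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--force&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--ui-delay&lt;/code&gt; could be wired up with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;argparse&lt;/code&gt; roughly as follows. This is a minimal sketch, not the production script’s actual parser; the help texts and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_selected&lt;/code&gt; helper are hypothetical.&lt;/p&gt;

```python
import argparse
from datetime import date


def build_parser():
    """Command-line flags mentioned in this post; defaults are assumptions."""
    parser = argparse.ArgumentParser(description="Fetch and transcribe voice memos")
    parser.add_argument("--since", type=date.fromisoformat, default=None,
                        help="only process recordings made on or after YYYY-MM-DD")
    parser.add_argument("--force", action="store_true",
                        help="re-transcribe even if a transcript already exists")
    parser.add_argument("--ui-delay", type=float, default=0.5,
                        help="seconds to wait between UI automation steps")
    return parser


def is_selected(recording_date_iso, since):
    """Date cutoff: keep a recording when no cutoff is set or it is recent enough."""
    return since is None or date.fromisoformat(recording_date_iso) >= since
```

&lt;p&gt;Parsing the date with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;date.fromisoformat&lt;/code&gt; means a malformed value fails at startup with a usage error rather than halfway through a long run.&lt;/p&gt;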

&lt;h3 id=&quot;keep-the-original-audio&quot;&gt;Keep the Original Audio&lt;/h3&gt;

&lt;p&gt;Always keep the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.m4a&lt;/code&gt; files alongside the transcripts. Whisper is good, but not perfect –
especially for technical jargon, proper nouns, and domain-specific vocabulary. Having the
original audio lets you spot-check and correct errors.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;what-this-enables&quot;&gt;What This Enables&lt;/h2&gt;

&lt;p&gt;Once your voice memos are searchable text, interesting workflows emerge:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Daily review.&lt;/strong&gt; I skim yesterday’s transcripts over morning coffee. Last week, a 3-minute
recording from a walk gave me back a product architecture idea I’d completely forgotten about.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Knowledge base.&lt;/strong&gt; Import transcripts into Obsidian and link them to projects, people, ideas.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;LLM summarization.&lt;/strong&gt; Feed transcripts to a local LLM (via Ollama) for summaries, action
items, or topic extraction. A 30-minute brainstorm session becomes a 10-line summary in
seconds.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Semantic search.&lt;/strong&gt; Index transcripts in a vector database (pgvector, Chroma) for
similarity search across your entire recording history.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pipeline from voice memo to searchable, AI-processable text is the starting point. What
you build on top is where it gets interesting.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;the-bigger-picture-building-a-personal-knowledge-base&quot;&gt;The Bigger Picture: Building a Personal Knowledge Base&lt;/h2&gt;

&lt;p&gt;This voice memo pipeline is one piece of a larger system I’m building – a fully local personal
knowledge base. The transcripts land in an &lt;a href=&quot;https://obsidian.md&quot;&gt;Obsidian&lt;/a&gt; vault alongside meeting notes,
bookmarks, annotated PDFs, and research clippings. Multiple scripts and AI models work together
to manage, update, and connect this growing collection.&lt;/p&gt;

&lt;p&gt;The architecture behind it:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Voice memos&lt;/strong&gt; get transcribed by Whisper (this post) and stored as markdown.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;A RAG pipeline&lt;/strong&gt; indexes all markdown files into a &lt;a href=&quot;https://github.com/pgvector/pgvector&quot;&gt;pgvector&lt;/a&gt; database – chunked,
embedded, and searchable by semantic similarity. When I need to find “that idea about caching
from last month,” I query the RAG system instead of grepping through hundreds of files.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Multiple AI scripts&lt;/strong&gt; maintain the knowledge base: one summarizes long transcripts, another
extracts action items, a third generates topic tags and cross-links between related notes.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Everything runs locally&lt;/strong&gt; on the MacBook. For the broader knowledge base I use Ollama,
which handles both Whisper transcription and LLM inference for summarization under one
server. For the standalone voice memo pipeline in this article, mlx-whisper is the simpler
choice. The vector database runs in a Docker container with pgvector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The core insight: &lt;strong&gt;AI is most useful when it automates your personal routines, not just when
it generates code.&lt;/strong&gt; Transcription is a mundane task that takes zero creativity but used to consume real time.
Now it runs in the background while I sleep. Summarization, tagging, and cross-linking happen
automatically. The knowledge base grows and organizes itself.&lt;/p&gt;

&lt;p&gt;That’s the kind of Local AI use case that justifies the hardware investment – not benchmarks,
not leaderboards, but saving 30 minutes a day on something you actually do.&lt;/p&gt;

&lt;p&gt;I’m planning to write more about this system. Next up: either the &lt;strong&gt;RAG pipeline with pgvector&lt;/strong&gt;
or the &lt;strong&gt;Obsidian integration and auto-tagging&lt;/strong&gt;. If one of these interests you more – &lt;strong&gt;let
me know&lt;/strong&gt;. Ping me on &lt;a href=&quot;https://www.linkedin.com/in/jonnyzzz/&quot;&gt;LinkedIn&lt;/a&gt; or
&lt;a href=&quot;https://x.com/jonnyzzz&quot;&gt;Twitter/X&lt;/a&gt;. Your questions genuinely help me prioritize what to
write next.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/openai/whisper&quot;&gt;OpenAI Whisper&lt;/a&gt; – the original speech recognition model&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://pypi.org/project/mlx-whisper/&quot;&gt;mlx-whisper on PyPI&lt;/a&gt; – Apple Silicon optimized Whisper&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://ml-explore.github.io/mlx/&quot;&gt;MLX framework&lt;/a&gt; – Apple’s machine learning framework for Apple Silicon&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://huggingface.co/mlx-community/whisper-large-v3-turbo&quot;&gt;mlx-community/whisper-large-v3-turbo&lt;/a&gt; – the HuggingFace model&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://ollama.com&quot;&gt;Ollama&lt;/a&gt; – local LLM and Whisper inference server&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.astral.sh/uv/&quot;&gt;uv&lt;/a&gt; – fast Python package manager by Astral&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://peps.python.org/pep-0723/&quot;&gt;PEP 723&lt;/a&gt; – inline script metadata for self-contained Python scripts&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://obsidian.md&quot;&gt;Obsidian&lt;/a&gt; – markdown-based knowledge management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Have questions about running this on your hardware? Found a bug? Reach out on
&lt;a href=&quot;https://www.linkedin.com/in/jonnyzzz/&quot;&gt;LinkedIn&lt;/a&gt; or
&lt;a href=&quot;https://x.com/jonnyzzz&quot;&gt;Twitter/X&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;</content>

  
  
  
  
  

    <author>
      <name>Eugene Petrenko</name>
    </author>

  
    <category term="LocalAI" />
  
    <category term="python" />
  
    <category term="macos" />
  
    <category term="whisper" />
  
    <category term="ai-coding" />
  
    <category term="automation" />
  
    <category term="apple-silicon" />
  
    <category term="voice-memos" />
  
    <summary type="html">Your MacBook can transcribe voice memos entirely offline using Whisper and Apple Silicon. No cloud APIs, no subscriptions, no data leaving your machine. Here&apos;s how I built a fully local pipeline with Python, AppleScript, and mlx-whisper.</summary>
  
  </entry>
  
  <entry>
    <title type="html">MCP Steroid Project Assessment: 75 Days, 1300+ Commits, One Plugin</title>
    <link href="https://jonnyzzz.com/blog/2026/02/23/mcp-steroid-project-assessment/" rel="alternate" type="text/html" title="MCP Steroid Project Assessment: 75 Days, 1300+ Commits, One Plugin" />
    <published>2026-02-23T00:00:00+00:00</published>
    <updated>2026-02-23T00:00:00+00:00</updated>
    <id>/blog/2026/02/23/mcp-steroid-project-assessment</id>
    <content type="html" xml:base="https://jonnyzzz.com/blog/2026/02/23/mcp-steroid-project-assessment/">&lt;p&gt;I decided to take a step back and look at what has been built so far.&lt;/p&gt;

&lt;p&gt;MCP Steroid started on December 10, 2025. It is an IntelliJ Platform plugin that exposes
the full IDE JVM runtime to AI Agents via the Model Context Protocol. Instead of only reading and
writing files, agents can compile code, run inspections, trigger refactorings, use the
debugger, and take screenshots – all through one MCP tool that executes Kotlin inside
the IDE’s JVM. The result is a more flexible, powerful, and minimal API for an AI Agent.&lt;/p&gt;

&lt;p&gt;75 days later, the numbers tell an interesting story.&lt;/p&gt;

&lt;h2 id=&quot;the-numbers&quot;&gt;The numbers&lt;/h2&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metric&lt;/th&gt;
      &lt;th&gt;Value&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Total commits&lt;/td&gt;
      &lt;td&gt;1,306&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Active development days&lt;/td&gt;
      &lt;td&gt;52 of 75 (69% utilization)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Avg commits per active day&lt;/td&gt;
      &lt;td&gt;~25&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Production Kotlin LOC&lt;/td&gt;
      &lt;td&gt;27,505&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Test Kotlin LOC&lt;/td&gt;
      &lt;td&gt;110,482&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Test-to-production ratio&lt;/td&gt;
      &lt;td&gt;4:1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Gradle submodules&lt;/td&gt;
      &lt;td&gt;8&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Releases&lt;/td&gt;
      &lt;td&gt;3 (0.87.0, 0.88.0, 0.89.0)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The 4:1 test-to-production ratio is not an accident. The Docker-based integration tests
launch a full IntelliJ IDE in a container, connect real AI Agents (Claude Code, Codex,
Gemini CLI), and verify end-to-end MCP workflows. Arena tests run curated project
benchmarks comparing agent performance with and without the plugin.&lt;/p&gt;

&lt;h2 id=&quot;architecture-highlights&quot;&gt;Architecture highlights&lt;/h2&gt;

&lt;p&gt;The core execution pipeline follows a two-phase design: compile first with an external
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kotlinc&lt;/code&gt; process, then run the compiled code inside the IDE’s JVM with coroutine-based
timeout enforcement. This separation means agents get compilation errors immediately
without waiting for execution.&lt;/p&gt;

&lt;p&gt;A few technical decisions that turned out well:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Modal dialog race detection&lt;/strong&gt; – Kotlin &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;select{}&lt;/code&gt; races script execution against
IDE dialog appearance. If a dialog pops up mid-execution, the script cancels and
a screenshot goes back to the agent.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;External Kotlin compiler isolation&lt;/strong&gt; – agent scripts cannot starve the IDE’s own
Kotlin daemon. The plugin recovers automatically when the daemon dies.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Transport-agnostic MCP core&lt;/strong&gt; – the JSON-RPC dispatcher has zero HTTP dependencies.
The Ktor transport layer is pluggable.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Append-only execution storage&lt;/strong&gt; – every script, compilation output, and result is
stored as an immutable audit trail under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.idea/mcp-steroid/&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;quality-scorecard&quot;&gt;Quality scorecard&lt;/h2&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Score&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Code Organization&lt;/td&gt;
      &lt;td&gt;9/10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Error Handling&lt;/td&gt;
      &lt;td&gt;9/10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Test Coverage&lt;/td&gt;
      &lt;td&gt;8.5/10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Architectural Sophistication&lt;/td&gt;
      &lt;td&gt;9/10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Coroutine Patterns&lt;/td&gt;
      &lt;td&gt;9.5/10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;IntelliJ Platform Integration&lt;/td&gt;
      &lt;td&gt;9/10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Extensibility&lt;/td&gt;
      &lt;td&gt;9/10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Overall&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;8.8/10&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Zero &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;runBlocking()&lt;/code&gt; in production code. Proper &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;readAction{}&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;writeAction{}&lt;/code&gt; threading.
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ProcessCanceledException&lt;/code&gt; always rethrown. IntelliJ service model used correctly throughout.&lt;/p&gt;

&lt;h2 id=&quot;what-makes-this-unique&quot;&gt;What makes this unique&lt;/h2&gt;

&lt;p&gt;MCP Steroid is the only product that gives AI Agents visual IDE access – screenshots
with component trees, keyboard and mouse input dispatch, OCR integration. Agents operate
at the same level of semantic understanding that the IDE itself uses: type-aware symbol
search, real inspections, IDE-native refactorings, live test results.&lt;/p&gt;

&lt;p&gt;The plugin works with Claude Code, OpenAI Codex, and Google Gemini CLI today.
Human-in-the-loop review gates are configurable per project.&lt;/p&gt;

&lt;h2 id=&quot;where-it-is-heading&quot;&gt;Where it is heading&lt;/h2&gt;

&lt;p&gt;Multi-IDE support is already underway (GoLand, WebStorm). The NPX proxy aggregates
multiple IDE instances. The arena benchmarking framework measures agent quality across
curated project scenarios. Enterprise deployment works via a custom plugin repository.&lt;/p&gt;

&lt;h2 id=&quot;update-v0890-released&quot;&gt;Update: v0.89.0 released&lt;/h2&gt;

&lt;p&gt;Three days after this assessment, version 0.89.0 shipped – 361 commits since 0.88.0.&lt;/p&gt;

&lt;p&gt;Key highlights:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Codex agent output support&lt;/strong&gt; – full NDJSON format handling for OpenAI Codex,
including &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mcp_tool_call&lt;/code&gt; items, structured results, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reasoning&lt;/code&gt; blocks.
Raw and decoded agent logs are saved per prompt run.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Prompt system overhaul&lt;/strong&gt; – all prompt articles migrated to single-file &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.md&lt;/code&gt;
format with auto-generated TOC and per-article read tests.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;DPAIA arena testing&lt;/strong&gt; – Docker-based A/B comparison framework that measures
agent effectiveness with and without MCP Steroid on curated project scenarios.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Settings UI&lt;/strong&gt; – new project settings page under Tools &amp;gt; MCP Steroid with
copy buttons and structured connection info.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Claude Code 2.1.x compatibility&lt;/strong&gt; – handles both old streaming and new
structured event formats.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Stability fixes&lt;/strong&gt; – resolved an 8-minute startup deadlock in Docker containers,
fixed onboarding dialogs blocking test runs, and fixed several NPEs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full release notes: &lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/releases/0.89.0/&quot;&gt;v0.89.0 on the MCP Steroid site&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;ai-agents-can-debug-now&quot;&gt;AI Agents can debug now&lt;/h2&gt;

&lt;p&gt;The debugger integration is what sets MCP Steroid apart from every other AI coding
tool. No other product gives AI Agents access to breakpoints, step-over, variable
inspection, and expression evaluation inside a real IDE debugger.&lt;/p&gt;

&lt;p&gt;Here is Codex debugging an application in IntelliJ IDEA, powered by MCP Steroid:&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/HtDDNyAoLak&quot; title=&quot;Codex Debugs in IntelliJ IDEA&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;p&gt;And a shorter demo showing the debugger workflow – setting breakpoints, stepping
through code, evaluating expressions:&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/8MjogrpfXLU&quot; title=&quot;MCP Steroid &amp;amp; IntelliJ Debugger&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;h2 id=&quot;try-mcp-steroid&quot;&gt;Try MCP Steroid&lt;/h2&gt;

&lt;p&gt;If you are building with AI Agents and want to give them the full IDE – not just
file access, but compilation, inspections, refactorings, debugging, and visual
understanding – give MCP Steroid a try.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Get started:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/docs/getting-started/&quot;&gt;Install the plugin&lt;/a&gt; – works with Claude Code, Codex, Gemini CLI,
Cursor, and any MCP-compatible agent&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/releases/0.89.0/&quot;&gt;Download v0.89.0&lt;/a&gt; or add the
&lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/updatePlugins.xml&quot;&gt;custom plugin repository&lt;/a&gt;
for automatic updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Support the development:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/sponsors/jonnyzzz&quot;&gt;Sponsor on GitHub&lt;/a&gt; – this project needs
funding to continue development, testing, and infrastructure&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://join.slack.com/t/mcp-steroid/shared_invite/zt-3p3oq91kx-BXJng8GSXveqncFVYWUcpQ&quot;&gt;Join the Slack community&lt;/a&gt; to discuss ideas and report issues&lt;/li&gt;
  &lt;li&gt;Star the &lt;a href=&quot;https://github.com/jonnyzzz/mcp-steroid&quot;&gt;GitHub repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Read more:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://mcp-steroid.jonnyzzz.com/docs/project-assessment-2026-02-22/&quot;&gt;Full project assessment&lt;/a&gt; with architecture analysis, commit theme
breakdown, and competitive positioning&lt;/li&gt;
&lt;/ul&gt;</content>

  
  
  
  
  

    <author>
      <name>Eugene Petrenko</name>
    </author>

  
    <category term="mcp" />
  
    <category term="mcp-steroid" />
  
    <category term="intellij" />
  
    <category term="ai-agents" />
  
    <category term="ai-coding" />
  
    <category term="developer-experience" />
  
    <category term="plugin-development" />
  
    <summary type="html">I ran a full development assessment on MCP Steroid -- the IntelliJ plugin that gives AI Agents direct access to the IDE runtime. 75 days of development, 1,306 commits, 28K lines of production code, and a 4:1 test-to-production ratio. Here is what the numbers say about the project and where it is heading.</summary>
  
  </entry>
  
  <entry>
    <title type="html">Testing Your MCP Server with Real AI Agents in Docker</title>
    <link href="https://jonnyzzz.com/blog/2026/02/21/testing-mcp-server-with-ai-agents/" rel="alternate" type="text/html" title="Testing Your MCP Server with Real AI Agents in Docker" />
    <published>2026-02-21T00:00:00+00:00</published>
    <updated>2026-02-21T00:00:00+00:00</updated>
    <id>/blog/2026/02/21/testing-mcp-server-with-ai-agents</id>
    <content type="html" xml:base="https://jonnyzzz.com/blog/2026/02/21/testing-mcp-server-with-ai-agents/">&lt;p&gt;The only way to know your MCP server actually works is to test it with a real AI Agent.&lt;/p&gt;

&lt;p&gt;Unit tests can tell you whether a JSON schema is valid.
Mock tests can verify that your handler returns the right bytes, but the mocks themselves can drift from the real logic.
But none of them tell you whether an AI Agent will actually discover your tool,
call it correctly, and do something useful with the result.&lt;/p&gt;

&lt;p&gt;The core insight that drove our testing approach:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Tests should be assertions about the agentic loop itself&lt;/strong&gt; – not just about
individual functions. A test that passes only when a real agent, starting from
a cold container, successfully discovers and calls your tool is the only test
that proves the contract is real.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We built this for &lt;a href=&quot;https://mcp-steroid.jonnyzzz.com&quot;&gt;mcp-steroid&lt;/a&gt;, an MCP server that gives AI Agents deep
access to IntelliJ IDEA. The server has dozens of tools. Each agent CLI handles
MCP registration, tool discovery, and streaming output differently. The only way to catch
regressions was to test with real agents.&lt;/p&gt;

&lt;p&gt;So we did.&lt;/p&gt;

&lt;h2 id=&quot;the-architecture&quot;&gt;The Architecture&lt;/h2&gt;

&lt;p&gt;The test setup has three moving parts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;The MCP server&lt;/strong&gt; runs inside the IntelliJ test process on the host machine,
listening on a random port, say &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0.0.0.0:17820&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;The AI Agent&lt;/strong&gt; runs inside a Docker container&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Docker networking&lt;/strong&gt; connects them: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;host.docker.internal&lt;/code&gt; resolves to the host from inside
any container&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When the test starts, it binds the MCP server to all interfaces (not just &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;localhost&lt;/code&gt;),
so it is reachable from the Docker network. The container gets started with
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--add-host=host.docker.internal:host-gateway&lt;/code&gt;, which maps that hostname to the
host gateway address. The test then translates the MCP URL from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;localhost:17820&lt;/code&gt;
to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;http://host.docker.internal:17820&lt;/code&gt; before handing it to the agent.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;┌──────────────────────────────────────────────────────┐
│  Host machine                                        │
│                                                      │
│  JVM test process                                    │
│  ┌──────────────────────────────────────────────┐    │
│  │  IntelliJ test framework                     │    │
│  │  MCP server: 0.0.0.0:17820                   │    │
│  └──────────────────────────────────────────────┘    │
│                              ▲                       │
│                              │ host.docker.internal  │
└──────────────────────────────┼───────────────────────┘
                               │
┌──────────────────────────────┼───────────────────────┐
│  Docker container            │                       │
│                              │ HTTP/SSE              │
│  AI Agent CLI ───────────────┘                       │
│  (Claude / Codex / Gemini)                           │
└──────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The test class wires this together before each test:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;override&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;setUp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// Bind server to all interfaces, not just localhost&lt;/span&gt;
    &lt;span class=&quot;nc&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;setProperty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mcp.steroid.server.host&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;0.0.0.0&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;nc&quot;&gt;System&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;setProperty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;mcp.steroid.server.port&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;17820&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;super&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;setUp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;resolveDockerUrl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;mcpUrl&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;SteroidsMcpServer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getInstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getSseUrl&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// Replace localhost with the Docker-accessible hostname&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mcpUrl&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;localhost&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;host.docker.internal&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;127.0.0.1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;host.docker.internal&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Each agent session lives inside a Docker container. The container starts once, receives
the MCP registration command, and then runs prompts via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker exec&lt;/code&gt;. The container
is reused across calls within a single test.&lt;/p&gt;

&lt;p&gt;Each AI Agent container runs &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sleep infinity&lt;/code&gt; as its main command,
and we use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker exec&lt;/code&gt; to start every other process inside it, including IntelliJ IDEA or Git.&lt;/p&gt;

&lt;h3 id=&quot;api-keys&quot;&gt;API Keys&lt;/h3&gt;

&lt;p&gt;The agent containers need API keys to talk to their respective model providers.
We pass them as environment variables when running commands inside the container.
The test harness reads each key from the host environment and sets the matching
variable – &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ANTHROPIC_API_KEY&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OPENAI_API_KEY&lt;/code&gt;, or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GEMINI_API_KEY&lt;/code&gt; – for that command.
The key is injected only into the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker exec&lt;/code&gt; call, never stored in the image.
We also redact the key from all log output so it doesn’t leak into CI logs.&lt;/p&gt;
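
&lt;p&gt;A minimal Python sketch of that idea, not the actual harness code: the helper names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker_exec_args&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;redact&lt;/code&gt;, the container name, and the key value are hypothetical. It assumes keys are passed with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-e&lt;/code&gt; flag of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker exec&lt;/code&gt; and scrubbed from any text before it is logged.&lt;/p&gt;

```python
KEY_NAMES = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY")


def docker_exec_args(container, command, env):
    """Build docker exec arguments; keys travel with -e and never enter the image."""
    args = ["docker", "exec"]
    for name in KEY_NAMES:
        if env.get(name):
            args += ["-e", name + "=" + env[name]]
    return args + [container] + command


def redact(text, env):
    """Replace every known key value in text before it reaches a log."""
    for name in KEY_NAMES:
        secret = env.get(name)
        if secret:
            text = text.replace(secret, "[REDACTED " + name + "]")
    return text


# Hypothetical values, for illustration only.
env = {"OPENAI_API_KEY": "sk-test-123"}
args = docker_exec_args("agent-1", ["codex", "--version"], env)
logged = redact(" ".join(args), env)
```

&lt;p&gt;In this sketch the redaction runs over the command line as well as the agent output, since the key appears in both.&lt;/p&gt;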

&lt;h2 id=&quot;claude-code----verbose-or-you-are-flying-blind&quot;&gt;Claude Code – &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--verbose&lt;/code&gt; or You Are Flying Blind&lt;/h2&gt;

&lt;h3 id=&quot;the-dockerfiles&quot;&gt;The Dockerfiles&lt;/h3&gt;

&lt;p&gt;All three containers follow the same structure: a Debian slim base with Node.js 20, the agent CLI,
and a non-root user. The only difference is the npm package name.&lt;/p&gt;

&lt;div class=&quot;language-dockerfile highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; debian:bookworm-slim&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# ── Node.js 20 via NodeSource&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;RUN &lt;/span&gt;apt-get update &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    apt-get &lt;span class=&quot;nb&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-y&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--no-install-recommends&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;        curl ca-certificates gnupg &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    &lt;span class=&quot;nb&quot;&gt;mkdir&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-p&lt;/span&gt; /etc/apt/keyrings &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    curl &lt;span class=&quot;nt&quot;&gt;-fsSL&lt;/span&gt; https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key | &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;        gpg &lt;span class=&quot;nt&quot;&gt;--dearmor&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; /etc/apt/keyrings/nodesource.gpg &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    &lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;deb [signed-by=/etc/apt/keyrings/nodesource.gpg] &lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;          https://deb.nodesource.com/node_20.x nodistro main&quot;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;        &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; /etc/apt/sources.list.d/nodesource.list &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    apt-get update &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class=&quot;nb&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-y&lt;/span&gt; nodejs &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    &lt;span class=&quot;nb&quot;&gt;rm&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-rf&lt;/span&gt; /var/lib/apt/lists/&lt;span class=&quot;k&quot;&gt;*&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# ── Agent CLI -- keep exactly one of these lines per agent image&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;RUN &lt;/span&gt;npm &lt;span class=&quot;nb&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-g&lt;/span&gt; @anthropic-ai/claude-code   &lt;span class=&quot;c&quot;&gt;# Claude Code&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;RUN &lt;/span&gt;npm &lt;span class=&quot;nb&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-g&lt;/span&gt; @openai/codex               &lt;span class=&quot;c&quot;&gt;# Codex&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;RUN &lt;/span&gt;npm &lt;span class=&quot;nb&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-g&lt;/span&gt; @google/gemini-cli          &lt;span class=&quot;c&quot;&gt;# Gemini&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# ── Non-root user for isolation&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;RUN &lt;/span&gt;useradd &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-s&lt;/span&gt; /bin/bash agent
&lt;span class=&quot;k&quot;&gt;USER&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; agent&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WORKDIR&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; /home/agent&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# ── Keep the container alive; commands run via docker exec&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CMD&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; [&quot;sleep&quot;, &quot;infinity&quot;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The container starts with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sleep infinity&lt;/code&gt; and stays alive.
All commands run via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker exec&lt;/code&gt; – the container is never restarted between calls.&lt;/p&gt;

&lt;h3 id=&quot;mcp-registration&quot;&gt;MCP Registration&lt;/h3&gt;

&lt;p&gt;Before running any prompt, we register the MCP server:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;claude mcp add --transport http intellij http://host.docker.internal:17820
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;running-a-prompt&quot;&gt;Running a Prompt&lt;/h3&gt;

&lt;p&gt;Claude Code has a non-interactive mode (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-p&lt;/code&gt;) with streaming JSON output:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;claude \
  --permission-mode bypassPermissions \
  --tools default \
  --input-format text \
  --output-format stream-json \
  --verbose \
  -p &quot;List the MCP tools and call steroid_list_projects&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--output-format stream-json --verbose&lt;/code&gt; flags are essential.
Without them, Claude only emits the final text response after the agent finishes,
so the test cannot tell whether the Claude process is still alive.
With them, every event streams to stdout as NDJSON (one JSON object per line)
in real time: assistant messages, tool calls, tool results, token usage.
This is what makes debugging tests tractable – you see exactly what the agent did.&lt;/p&gt;

&lt;h3 id=&quot;parsing-the-json-output&quot;&gt;Parsing the JSON Output&lt;/h3&gt;

&lt;p&gt;Claude Code’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stream-json&lt;/code&gt; format is event-driven. Tool calls are nested inside
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;assistant&lt;/code&gt; message events (in the current format), and the final &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;result&lt;/code&gt; event
carries cost and timing:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;assistant&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;message&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;role&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;assistant&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;content&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:[{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;text&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;text&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;I&apos;ll list...&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]}}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;assistant&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;message&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;content&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:[{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;tool_use&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;steroid_list_projects&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;input&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:{}}]}}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;user&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;message&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;content&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:[{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;tool_result&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tool_use_id&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;...&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;content&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;...&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]}}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;result&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;cost_usd&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.0042&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;duration_ms&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3200&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;num_turns&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We parse this with a streaming NDJSON loop that routes on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;type&lt;/code&gt;.
Tool calls render as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;gt;&amp;gt; steroid_list_projects&lt;/code&gt;, tool results as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;&amp;lt; steroid_list_projects&lt;/code&gt;,
assistant text prints as-is, and cost appears at the end.
We never buffer the full output – lines are processed as they arrive.&lt;/p&gt;
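
&lt;p&gt;The routing loop can be sketched roughly as follows. This is an illustrative Python version, not the harness code: the function names are hypothetical, the field names come from the sample events above, and it returns plain tuples instead of rendering the markers.&lt;/p&gt;

```python
import io
import json


def route_claude_event(line):
    """Parse one stream-json line and return a list of (kind, payload) tuples."""
    line = line.strip()
    if not line:
        return []
    event = json.loads(line)
    event_type = event.get("type")
    routed = []
    if event_type in ("assistant", "user"):
        content = event.get("message", {}).get("content", [])
        if not isinstance(content, list):
            return routed
        # Tool calls and tool results are nested inside message content blocks.
        for block in content:
            block_type = block.get("type")
            if block_type == "text":
                routed.append(("text", block.get("text", "")))
            elif block_type == "tool_use":
                routed.append(("tool_call", block.get("name")))
            elif block_type == "tool_result":
                routed.append(("tool_result", block.get("tool_use_id")))
    elif event_type == "result":
        routed.append(("cost_usd", event.get("cost_usd")))
    return routed


def route_stream(stream):
    """Yield routed events one line at a time, without buffering the whole output."""
    for line in stream:
        for item in route_claude_event(line):
            yield item


# Two of the sample events from above, fed through the loop.
sample = io.StringIO(
    '{"type":"assistant","message":{"content":[{"type":"tool_use","name":"steroid_list_projects","input":{}}]}}\n'
    '{"type":"result","cost_usd":0.0042,"duration_ms":3200,"num_turns":2}\n'
)
events = list(route_stream(sample))
```

&lt;p&gt;Unknown event types fall through and produce nothing, so a new event kind in a future CLI version does not break the loop.&lt;/p&gt;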

&lt;h2 id=&quot;codex--stderr-will-fool-you&quot;&gt;Codex – stderr Will Fool You&lt;/h2&gt;

&lt;h3 id=&quot;mcp-registration-1&quot;&gt;MCP Registration&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;codex mcp add intellij --url http://host.docker.internal:17820
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;running-a-prompt-1&quot;&gt;Running a Prompt&lt;/h3&gt;

&lt;p&gt;Codex uses the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;exec&lt;/code&gt; subcommand for non-interactive batch mode:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;codex exec \
  --dangerously-bypass-approvals-and-sandbox \
  --skip-git-repo-check \
  --json \
  &quot;List the MCP tools and call steroid_list_projects&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--json&lt;/code&gt; flag makes Codex emit NDJSON to stdout.
Without it, you get the interactive terminal UI – not useful inside a test.
The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--dangerously-bypass-approvals-and-sandbox&lt;/code&gt; flag disables safety confirmation
prompts so the agent can run without blocking on human input.&lt;/p&gt;

&lt;h3 id=&quot;parsing-the-json-output-1&quot;&gt;Parsing the JSON Output&lt;/h3&gt;

&lt;p&gt;Codex’s format uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;item&lt;/code&gt; events:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;item.started&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;item&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;mcp_tool_call&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;steroid_list_projects&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;item.completed&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;item&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;agent_message&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;text&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;I found the following tools...&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;item.completed&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;item&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;mcp_tool_call&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;steroid_list_projects&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;output&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;...&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;turn.completed&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;usage&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;input_tokens&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;400&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;output_tokens&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;200&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We route on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;item.type&lt;/code&gt;. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mcp_tool_call&lt;/code&gt; items show us tool invocations.
The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;agent_message&lt;/code&gt; items (with a flat &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;text&lt;/code&gt; field) give us the assistant’s text.
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;turn.completed&lt;/code&gt; shows token usage.&lt;/p&gt;

&lt;p&gt;One thing that burned us early: &lt;strong&gt;Codex writes to stderr in some modes and stdout in others&lt;/strong&gt;.
In &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--json&lt;/code&gt; mode, structured events go to stdout and plain diagnostic messages go to stderr.
We spent a week thinking our output filter was broken before we realized we were reading
the wrong stream. We now capture both streams separately and only parse stdout as NDJSON.&lt;/p&gt;
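
&lt;p&gt;A small Python sketch of the split, under assumptions: the child process here is a stand-in script rather than Codex itself, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_and_split&lt;/code&gt; is a hypothetical helper. It collects the whole output for brevity, while a real harness would read stdout line by line.&lt;/p&gt;

```python
import json
import subprocess
import sys

# Stand-in for the agent CLI: one JSON event on stdout, one plain diagnostic on stderr.
CHILD = (
    "import sys\n"
    "print('{\"type\":\"turn.completed\"}')\n"
    "print('plain diagnostic', file=sys.stderr)\n"
)


def run_and_split(command):
    """Run a command, keep the two streams apart, and parse only stdout as NDJSON."""
    proc = subprocess.run(command, capture_output=True, text=True, check=False)
    events = [json.loads(row) for row in proc.stdout.splitlines() if row.strip()]
    diagnostics = [row for row in proc.stderr.splitlines() if row.strip()]
    return events, diagnostics


events, diagnostics = run_and_split([sys.executable, "-c", CHILD])
```

&lt;p&gt;Merging the two streams would feed the plain diagnostic line into the JSON parser, which is the failure mode described above.&lt;/p&gt;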

&lt;h2 id=&quot;gemini--exit-code-137-is-actually-fine&quot;&gt;Gemini – Exit Code 137 Is Actually Fine&lt;/h2&gt;

&lt;h3 id=&quot;mcp-registration-2&quot;&gt;MCP Registration&lt;/h3&gt;

&lt;p&gt;Gemini’s registration syntax is the most verbose. It requires &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--type http&lt;/code&gt;,
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--scope user&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--trust&lt;/code&gt; to avoid interactive confirmation:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;gemini mcp add intellij \
  --type http \
  http://host.docker.internal:17820 \
  --scope user \
  --trust
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;running-a-prompt-2&quot;&gt;Running a Prompt&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;gemini \
  --screen-reader true \
  --sandbox-mode none \
  --approval-mode yolo \
  --output-format stream-json \
  --prompt &quot;List the MCP tools and call steroid_list_projects&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--approval-mode yolo&lt;/code&gt; disables tool-use confirmation prompts.
The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--screen-reader true&lt;/code&gt; flag suppresses ANSI terminal formatting.&lt;/p&gt;

&lt;p&gt;Note: newer Gemini CLI versions replaced &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--sandbox-mode none&lt;/code&gt; with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--sandbox false&lt;/code&gt;.
We handle this with a simple retry – if the first attempt fails with
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unknown arguments: sandbox-mode&lt;/code&gt;, we retry with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--sandbox false&lt;/code&gt;.
This keeps the tests passing across CLI version bumps without needing to pin an exact version.&lt;/p&gt;
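
&lt;p&gt;A minimal sketch of that fallback, assuming the harness wraps each CLI run in a function that
returns the exit code and the captured output. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CliResult&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;runWithSandboxFallback&lt;/code&gt; are hypothetical, and looking for the error text
on both streams is an assumption rather than documented CLI behavior:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Hypothetical result type; the real harness types may differ.
data class CliResult(val exitCode: Int, val stdout: String, val stderr: String)

const val UNKNOWN_SANDBOX_FLAG = &quot;unknown arguments: sandbox-mode&quot;

// Tries the older flag first and retries with the newer one only when the
// CLI reports that it does not recognize `sandbox-mode`.
fun runWithSandboxFallback(
    baseArgs: List&amp;lt;String&amp;gt;,
    run: (List&amp;lt;String&amp;gt;) -&amp;gt; CliResult, // hypothetical process launcher
): CliResult {
    val first = run(baseArgs + listOf(&quot;--sandbox-mode&quot;, &quot;none&quot;))
    if (first.exitCode == 0) return first
    // Assumption: the error text may land on either stream.
    val flagRejected = UNKNOWN_SANDBOX_FLAG in first.stderr || UNKNOWN_SANDBOX_FLAG in first.stdout
    return if (flagRejected) run(baseArgs + listOf(&quot;--sandbox&quot;, &quot;false&quot;)) else first
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Any other failure is returned unchanged, so a genuinely broken run is not masked by the retry.&lt;/p&gt;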

&lt;h3 id=&quot;parsing-the-json-output-2&quot;&gt;Parsing the JSON Output&lt;/h3&gt;

&lt;p&gt;Gemini’s format uses typed events with field names that differ from Claude’s:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;message&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;role&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;assistant&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;content&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;I&apos;ll start by listing...&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;delta&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;tool_use&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tool_name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;steroid_list_projects&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tool_id&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;tool-1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;parameters&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:{}}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;tool_result&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tool_id&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;tool-1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tool_name&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;steroid_list_projects&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;status&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;success&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;output&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;...&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;result&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;status&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;success&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;stats&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:{&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;input_tokens&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;800&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;output_tokens&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;400&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;tool_calls&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;duration_ms&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1200&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A few quirks worth knowing:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;message.content&lt;/code&gt; is a plain string, not an array – unlike Claude’s format&lt;/li&gt;
  &lt;li&gt;Tool calls use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tool_name&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;parameters&lt;/code&gt;, not &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;name&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;input&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;result&lt;/code&gt; stats have separate &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;input_tokens&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;output_tokens&lt;/code&gt;, not a combined total&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One more surprise: &lt;strong&gt;Gemini CLI occasionally exits with code 137 (SIGKILL) even after
successfully completing a request&lt;/strong&gt;. We detect this by checking the raw NDJSON for
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;status&quot;:&quot;success&quot;&lt;/code&gt; and treating exit code 137 as success when that signal is present.
Without this guard, valid runs fail spuriously about 5% of the time.&lt;/p&gt;
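
&lt;p&gt;The guard can be sketched as a small predicate. This is an illustration under the assumption that
the raw stdout is available as a single string. The function name is hypothetical:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Exit code 137 is 128 + 9: the process was terminated by SIGKILL.
const val SIGKILL_EXIT_CODE = 137

// A run counts as successful when the CLI exited cleanly, or when it was
// killed after already emitting a success marker in its NDJSON stdout.
fun isSuccessfulRun(exitCode: Int, rawStdout: String): Boolean {
    if (exitCode == 0) return true
    if (exitCode != SIGKILL_EXIT_CODE) return false
    return rawStdout.lineSequence().any { it.contains(&quot;\&quot;status\&quot;:\&quot;success\&quot;&quot;) }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In the sample output above, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tool_result&lt;/code&gt; events carry the same status field,
so a stricter variant could also require &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;type&quot;:&quot;result&quot;&lt;/code&gt; on the matching line.&lt;/p&gt;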

&lt;p&gt;The Gemini API also drops the socket mid-stream on transient errors
(&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UND_ERR_SOCKET&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;terminated&lt;/code&gt;). We retry once when we see these patterns.
Transient infrastructure failures shouldn’t count as test failures.&lt;/p&gt;
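
&lt;p&gt;A rough sketch of the single retry, assuming each attempt returns its exit code and combined output.
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AgentRun&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;runWithSingleRetry&lt;/code&gt; are
hypothetical names, not the harness API:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// Hypothetical shape of one CLI run; field names are illustrative.
data class AgentRun(val exitCode: Int, val output: String)

// The two patterns named in the text above.
val transientPatterns = listOf(&quot;UND_ERR_SOCKET&quot;, &quot;terminated&quot;)

// Only failed runs are inspected, so a successful run that happens to
// print one of the patterns is never retried.
fun isTransientFailure(run: AgentRun): Boolean {
    if (run.exitCode == 0) return false
    return transientPatterns.any { run.output.contains(it) }
}

// Runs the attempt and repeats it exactly once if the first run looks transient.
fun runWithSingleRetry(attempt: () -&amp;gt; AgentRun): AgentRun {
    val first = attempt()
    return if (isTransientFailure(first)) attempt() else first
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Capping the retry at one attempt keeps a persistent outage visible as a failure instead of hiding it
behind an unbounded loop.&lt;/p&gt;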

&lt;h2 id=&quot;what-a-passing-test-looks-like&quot;&gt;What a Passing Test Looks Like&lt;/h2&gt;

&lt;p&gt;The test prompts the agent with explicit instructions and then asserts on the output:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;testDiscoversSteroidTools&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;timeoutRunBlocking&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;300&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;seconds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;val&lt;/span&gt; &lt;span class=&quot;py&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;session&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;runPrompt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;
        List all MCP tools starting with &quot;steroid_&quot; and print each as:
        TOOL: &amp;lt;name&amp;gt; - &amp;lt;description&amp;gt;

        Then call steroid_list_projects exactly once and print the result as:
        PROJECTS: &amp;lt;raw JSON&amp;gt;
    &quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;assertExitCode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;prompt&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;assertNoErrorsInOutput&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;message&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;prompt&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;nf&quot;&gt;assertTrue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;contains&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;PROJECTS:&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;assertTrue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;contains&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;project&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;assertTrue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;output&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;contains&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;project&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;basePath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;toString&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()))&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The structured output format (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TOOL:&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PROJECTS:&lt;/code&gt;) makes assertions mechanical.
The test fails if the agent hallucinates a tool call, misreads the MCP response,
or doesn’t actually invoke the tool. That is precisely the contract we want to enforce.&lt;/p&gt;

&lt;p&gt;We run all three agents against the same test cases. This has surfaced real differences:
tools with ambiguous descriptions that Claude calls correctly but Gemini mis-calls,
JSON response shapes that Codex handles but another agent rejects, and timeout boundaries
that only appear under real container-to-host network latency.&lt;/p&gt;

&lt;h2 id=&quot;the-development-loop&quot;&gt;The Development Loop&lt;/h2&gt;

&lt;p&gt;Before we had these tests, adding a new MCP tool meant:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Implement the tool handler&lt;/li&gt;
  &lt;li&gt;Start IntelliJ manually, open Claude Code, hope the tool showed up&lt;/li&gt;
  &lt;li&gt;Try a prompt and see if it worked&lt;/li&gt;
  &lt;li&gt;Debug output format issues in the UI&lt;/li&gt;
  &lt;li&gt;Repeat – usually four or five times&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With integration tests, the loop is:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Implement the tool handler&lt;/li&gt;
  &lt;li&gt;Run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./gradlew :ij-plugin:test --tests &quot;*CliClaudeIntegrationTest*&quot;&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Read the test output – the full NDJSON-filtered agent transcript is there&lt;/li&gt;
  &lt;li&gt;Fix what’s broken, re-run&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Faster. Reproducible. And it runs in CI without any human in the loop. It also runs inside
the agentic loop.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Having real end-to-end tests with real AI Agents is the fastest feedback loop
we have found for MCP server development. The 60 seconds it takes to spin up
a Docker container is cheaper than the 10 minutes of manual testing it replaces.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;container-lifecycle&quot;&gt;Container Lifecycle&lt;/h2&gt;

&lt;p&gt;One infrastructure detail worth mentioning: we use a lightweight reaper container
to handle cleanup. When the test JVM exits or gets killed with SIGKILL, we need
all agent containers cleaned up. The reaper registers containers via a simple
TCP socket protocol (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;container=&amp;lt;id&amp;gt;&lt;/code&gt; messages) and kills them all when the
connection drops.&lt;/p&gt;

&lt;p&gt;This means test runs don’t leave orphaned Docker containers even when the process
is forcibly terminated during development or in CI.&lt;/p&gt;
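
&lt;p&gt;The reaper side of that protocol can be sketched as below. This is a simplified illustration, not
the actual reaper: it assumes one newline-terminated &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;container=&amp;lt;id&amp;gt;&lt;/code&gt;
message per line, and the function names and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kill&lt;/code&gt; callback are hypothetical:&lt;/p&gt;

&lt;div class=&quot;language-kotlin highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import java.io.BufferedReader
import java.io.IOException
import java.net.ServerSocket

// Parses one `container=&amp;lt;id&amp;gt;` line; returns null for anything else.
fun parseRegistration(line: String): String? {
    val prefix = &quot;container=&quot;
    if (!line.startsWith(prefix)) return null
    return line.removePrefix(prefix).trim().ifEmpty { null }
}

// Reads registrations until the stream ends or breaks, which is what the
// reaper observes when the test JVM exits or is killed.
fun collectUntilDisconnect(reader: BufferedReader): List&amp;lt;String&amp;gt; {
    val ids = mutableListOf&amp;lt;String&amp;gt;()
    while (true) {
        val line = try { reader.readLine() } catch (e: IOException) { null }
        if (line == null) break
        parseRegistration(line)?.let { ids += it }
    }
    return ids
}

// Serves a single test JVM connection, then hands every registered id to `kill`.
fun serveOnce(server: ServerSocket, kill: (String) -&amp;gt; Unit) {
    server.accept().use { socket -&amp;gt;
        collectUntilDisconnect(socket.getInputStream().bufferedReader()).forEach(kill)
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Tying cleanup to the connection dropping, rather than to a shutdown hook in the test JVM, is what
lets it work after SIGKILL: the operating system closes the socket even when the process gets no
chance to run its own cleanup code.&lt;/p&gt;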

&lt;h2 id=&quot;what-this-enables&quot;&gt;What This Enables&lt;/h2&gt;

&lt;p&gt;This approach enables the &lt;strong&gt;Agentic Loop&lt;/strong&gt;: a way for your coding
agents to receive immediate feedback from the tests. My AI Agents broke these
tests multiple times, and thanks to the tests, we were able to recover or roll back
without breaking production. Integration tests, and testing in general,
are a vital building block, especially when AI Agents write most of the code.&lt;/p&gt;

&lt;p&gt;The test harness is part of &lt;a href=&quot;https://mcp-steroid.jonnyzzz.com&quot;&gt;mcp-steroid&lt;/a&gt;. If you want to build
your own MCP server for IntelliJ, the same infrastructure is available.
You can write a test that starts the real IDE, registers your server,
and runs a real AI Agent against it – all from a single &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./gradlew test&lt;/code&gt; command.&lt;/p&gt;

&lt;p&gt;We’ve used this to test over 20 distinct tools across three agent CLIs,
catching dozens of issues that would have been invisible in unit tests:
tools with ambiguous descriptions that agents mis-call, JSON response shapes
one CLI handles but another rejects, and timeout boundaries that only show up
under real network latency between host and container.&lt;/p&gt;

&lt;p&gt;While writing tests for every tool, we also noticed something else. Some tools were
so simple that the agents could have called them just as well through a shell command.
That observation turned into a broader question.&lt;/p&gt;

&lt;h2 id=&quot;mcp-is-one-way&quot;&gt;MCP Is One Way&lt;/h2&gt;

&lt;p&gt;An MCP server is not the only way to give AI Agents access to your tools.&lt;/p&gt;

&lt;p&gt;In a &lt;a href=&quot;/blog/2026/02/20/cli-tools-for-ai-agents/&quot;&gt;recent post about CLI tools for AI Agents&lt;/a&gt;, I wrote about how well-designed
command-line interfaces are often sufficient. Agents already know how to run shell commands,
parse text output, and chain calls. If your tool has a good CLI, you may not need
an MCP server at all.&lt;/p&gt;

&lt;p&gt;For &lt;a href=&quot;https://mcp-steroid.jonnyzzz.com&quot;&gt;mcp-steroid&lt;/a&gt;, MCP made sense: we need structured tool calls,
streaming data, and tight IDE integration that a CLI cannot provide. But for many tools –
build systems, package managers, code search, deployment pipelines – a CLI
is simpler to build, simpler to test, and works with every agent out of the box.&lt;/p&gt;

&lt;p&gt;The question to ask is not “should I build an MCP server?” but “what interface
does an AI Agent need to use this tool reliably?” Sometimes that is MCP.
Sometimes it is a shell command. And sometimes it is both.&lt;/p&gt;

&lt;p&gt;If you are building an MCP server and want to share how you tested it, I’d like to hear
about it – reach out on &lt;a href=&quot;https://www.linkedin.com/in/jonnyzzz/&quot;&gt;LinkedIn&lt;/a&gt; or &lt;a href=&quot;https://twitter.com/jonnyzzz&quot;&gt;Twitter&lt;/a&gt;.&lt;/p&gt;</content>

  
  
  
  
  

    <author>
      <name>Eugene Petrenko</name>
    </author>

  
    <category term="ai-agents" />
  
    <category term="ai-coding" />
  
    <category term="mcp" />
  
    <category term="mcp-steroid" />
  
    <category term="testing" />
  
    <category term="developer-experience" />
  
    <category term="docker" />
  
    <summary type="html">The only way to know your MCP server actually works is to run a real AI Agent against it. Here is how we built integration tests that spin up Claude Code, Codex, and Gemini CLI in Docker containers -- all talking to the same live MCP server during the test run.</summary>
  
  </entry>
  
</feed>
