<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[The Pragmatic Engineer]]></title><description><![CDATA[Observations across the software engineering industry.]]></description><link>https://blog.pragmaticengineer.com/</link><image><url>https://blog.pragmaticengineer.com/favicon.png</url><title>The Pragmatic Engineer</title><link>https://blog.pragmaticengineer.com/</link></image><generator>Ghost 6.30</generator><lastBuildDate>Tue, 14 Apr 2026 10:51:59 GMT</lastBuildDate><atom:link href="https://blog.pragmaticengineer.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[The Pulse: is GitHub still best for AI-native development?]]></title><description><![CDATA[Availability has dropped to one nine (~90% – !!), partly due to not being able to handle increased traffic from AI coding agents. There’s also no CEO and an apparent lack of direction.]]></description><link>https://blog.pragmaticengineer.com/the-pulse-is-github-still-best-for-ai-native-development/</link><guid isPermaLink="false">69cfc72da95ab10001e47e0c</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Fri, 03 Apr 2026 15:03:38 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-is-github-still-best-for?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the article below eight days ago. 
If you&#x2019;ve been forwarded this email, you can</em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em> subscribe here</em></a><em>.</em></p><p>We&#x2019;re used to highly reliable systems which target four nines of availability (99.99%, meaning about 52 minutes of downtime per year), and we consider it embarrassing to barely hit three nines (around 9 hours of downtime per year). And yet, in the past month, GitHub&#x2019;s reliability has dropped to one nine!</p><p>Here&#x2019;s data from the third-party &#x201C;<a href="https://mrshu.github.io/github-statuses/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">missing GitHub status page</a>&#x201D;, which was built after GitHub stopped updating its own status page due to terrible availability. Recently, things have looked poor:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/04/image.png" class="kg-image" alt loading="lazy" width="1456" height="399" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/04/image.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/04/image.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/04/image.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">GitHub down at one nine. 
Source: </em></i><a href="https://mrshu.github.io/github-statuses/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">The Missing GitHub Status Page</em></i></a></figcaption></figure><p>This means that for every 30 days, GitHub had issues on 3 days, or issues/degradations for 2.5 hours daily (around 10% of the time.)</p><p><strong>GitHub seems unable to keep up with the massive increase in infra load from agents. </strong>One software engineer built a clever website called &#x201C;<a href="https://www.claudescode.dev/?window=90d&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Claude&#x2019;s Code</a>&#x201D; that tracks Claude Code bot contributions across GitHub. Growth in the past three months has been enormous:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/04/image-1.png" class="kg-image" alt loading="lazy" width="1456" height="909" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/04/image-1.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/04/image-1.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/04/image-1.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Load from Claude Code has 6x&#x2019;d in 3 months. 
Source: </em></i><a href="https://www.claudescode.dev/?window=90d&amp;ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Claude&#x2019;s Code</em></i></a></figcaption></figure><h3 id="stream-of-github-outages-from-infra-overload">Stream of GitHub outages from infra overload</h3><p>GitHub&#x2019;s CTO, Vladimir Fedorov, addressed availability issues <a href="https://github.blog/news-insights/company-news/addressing-githubs-recent-availability-issues-2/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">in a blog post</a> and covered three major incidents:</p><ul><li><a href="https://www.githubstatus.com/incidents/xwn6hjps36ty?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">2 February</a>: security policies unintentionally blocked access to virtual machine metadata</li><li><a href="https://www.githubstatus.com/incidents/lcw3tg2f6zsd?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">9 February</a>: a database cluster got overloaded</li><li><a href="https://www.githubstatus.com/incidents/g5gnt5l5hf56?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">5 March</a>: writes failed on a Redis cluster</li></ul><p>Software engineer Lorin Hochstein did <a href="https://surfingcomplexity.blog/2026/03/12/quick-thoughts-on-github-ctos-post-on-availability/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">a helpful analysis</a> of these outages and the CTO&#x2019;s response, and made some interesting observations:</p><ul><li><strong>Saturation</strong>: the database cluster incident (9 Feb) was a case of the database getting saturated, due to higher-than-expected usage. Databases are harder to scale up than stateless services. 
GitHub also underestimated how much additional traffic there would be.</li><li><strong>Failover + telemetry gap</strong>: the 2 Feb incident was a combination of an infra issue in one region failing over to a healthy region, and making things worse with a telemetry gap (incorrect security policies were applied in the new regions which blocked access to VM metadata)</li><li><strong>Failover + configuration issue</strong>: the 5 March incident was uncannily similar: after a failover, a configuration issue blocked writes on a Redis cluster</li></ul><p>It is certainly nice to get details from GitHub on these outages. It feels to me that infra strains are causing more infra issues &#x2192; they trigger constraints faster &#x2192; failovers are not as smooth as they should be. Could it be because GitHub keeps changing their existing systems?</p><h3 id="startup-shows-github-how-it%E2%80%99s-done">Startup shows GitHub how it&#x2019;s done</h3><p>While GitHub struggles to keep up with the increase in load from AI agents generating more code and pull requests, a new startup called Pierre Computer claims to have built an &#x201C;AI-native&#x201D; solution for AI agents pushing code, which scales far beyond what GitHub can do. 
Pierre was founded by <a href="https://www.linkedin.com/in/jacob-thornton-13a6a5162/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Jacob Thornton</a>: formerly an engineer at Coinbase, Medium, and Twitter, and also the creator of the once very popular <a href="https://getbootstrap.com/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Bootstrap</a> CSS library.</p><p>Here&#x2019;s what Pierre says it supports, which GitHub does not:</p><blockquote>&#x201C;In October [2025], Github shared they were averaging ~230 new repos per minute.<br><br>Last week we [at Pierre Computer] hit a sustained peak of &gt; 15,000 repos per minute for 3 hours.<br><br>And in the last 30 days customers have created &gt; 9M repos&#x201D;</blockquote><p>These are incredible numbers &#x2013; albeit self-reported &#x2013; and something that GitHub clearly cannot get close to, at least not today! There are few details about customers, while the product &#x2013; called <a href="https://code.storage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Code.storage</a> &#x2013; seems to be in closed beta.</p><p>Still, this is the type of &#x201C;git for AI agents&#x201D; that GitHub has failed to build, and the type of infrastructure it needs badly.</p><h3 id="has-github-lost-focus-and-purpose">Has GitHub lost focus and purpose?</h3><p>GitHub&#x2019;s reliability issues are acute enough that, if they persist, teams will start giving alternatives from small startups such as Pierre a try, or perhaps even consider self-hosting Git. But how did the largest Git host in the world neglect its customers, and fail to prepare its infra for an increase in code commits and pull requests?</p><p>Mitchell Hashimoto, founder of Ghostty, and a heavy user of GitHub himself, had advice on what he would do if he was in charge of GitHub, after growing frustrated with the state of its core offering. 
He <a href="https://x.com/mitchellh/status/2036866220449030168?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">writes</a> (emphasis mine)</p><blockquote>&#x201C;Here&#x2019;s what I&#x2019;d do if I was in charge of GitHub, in order:<br><br><strong>1. Establish a North Star plan around being critical infrastructure for agentic code</strong> lifecycles and determine a set of ways to measure that.<br><br><strong>2. Fire everyone who works on or advocates for Copilot and shut it down.</strong> It&#x2019;s not about the people, I&#x2019;m sure there&#x2019;s many talented people; you&#x2019;re just working at the wrong company.<br><br><strong>3. Buy Pierre and launch agentic repo hosting as the first agentic product.</strong> Repos would be separate from the legacy web product to start, since they&#x2019;re likely burdened with legacy cross product interactions.<br><br><strong>4. Re-evaluate all product lines and initiatives against the new North Star. </strong>I suspect 50% get cut (to make room for different ones).<br><br>The big idea is all agentic interactions should critically rely on GitHub APIs. Code review should be agentic but the labs should be building that into GH (not bolted in through GHA like today, real first class platform primitives). GH should absolutely launch an agent chat primitive, agent mailboxes are obviously good. GH should be a platform and not an agent itself.<br><br>This is going to be very obviously lacking since I only have external ideas to work off of and have no idea how GitHub internals are working, what their KPIs are or what North Star they define, etc.<br><br>But, with imperfect information, this is what I&#x2019;d do.&#x201D;</blockquote><p>My sense is that GitHub has three concurrent problems:</p><ul><li><strong>GitHub and Copilot are entangled with Microsoft&#x2019;s internal politics. 
</strong>GitHub&#x2019;s Copilot in 2021 was the first massively successful &#x201C;AI product.&#x201D; Microsoft took the &#x201C;Copilot&#x201D; brand and used it across all of their product lines, creating low-quality AI integrations. Simultaneously, internal Microsoft orgs like Azure and Microsoft AI were trying to get their hands on GitHub, which is one of the most positive developer brands at Microsoft.</li><li><strong>GitHub has no leader, seemingly by design. </strong>GitHub&#x2019;s last CEO was Thomas Dohmke, who stepped down voluntarily, and Microsoft never backfilled the CEO role, instead carrying out a reorg to make GitHub part of Microsoft&#x2019;s AI group and stripping it of its independence. It seems the &#x201C;Microsoft AI&#x201D; side won that battle.</li><li><strong>GitHub has no focus, and is stuck chasing Copilot as a revenue source. </strong>GitHub has no CEO and is caught up in internal politics, so what can GitHub teams do? The safest bet is to increase revenue, and the best way to do that is by investing more in GitHub Copilot while ignoring long-term issues like reliability.</li></ul><p>I agree with Mitchell: GitHub has no &#x201C;North Star&#x201D; and we see a large org being dysfunctional. That lack of vision &#x2013; and CEO &#x2013; is hitting hard:</p><ul><li>GitHub Copilot went from being the most-used AI agent in 2021 to being overtaken by Claude Code, and looks set to be overtaken by Cursor soon.</li><li>As a platform, GitHub has no vision for how to evolve to support AI agents. Sure, GitHub has an MCP server, but it has no &#x201C;AI-native git platform&#x201D; that can handle the massive load AI agents generate.</li><li>GitHub keeps shipping small features and improvements without direction. For example, in October 2025, they <a href="https://x.com/jaredpalmer/status/1980619222918262842?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">started to work on</a> stacked diffs. 
However, when it ships, the stacked diffs workflow might be mostly obsolete &#x2013; at least with AI agents!</li></ul><p>It&#x2019;s easy to win a market when you do one thing better than anyone else in the world. Right now, GitHub is doing too many things and doing a subpar job with Copilot, its platform, and AI infra.</p><hr><p>Read the full issue of <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-is-github-still-best-for?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">last week&#x2019;s The Pulse</a>, or check out <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-industry-leaders-return?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">this week&#x2019;s The Pulse</a>.</p><p>Catch up with recent The Pragmatic Engineer issues:</p><ul><li><a href="https://newsletter.pragmaticengineer.com/p/scaling-uber-with-thuan-pham-ubers?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>Scaling Uber with Thuan Pham</strong></a> (Uber&#x2019;s first CTO &#x2014; podcast). We went into topics like scaling Uber from constant outages to global infrastructure, the shift to microservices and platform teams, and how AI is reshaping engineering.</li><li><a href="https://newsletter.pragmaticengineer.com/p/building-whatsapp-with-jean-lee?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>Building WhatsApp with Jean Lee</strong></a> (podcast): Jean Lee, engineer #19 at WhatsApp, on scaling the app with a tiny team, the Facebook acquisition, and what it reveals about the future of engineering.</li><li><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-what-will-the-staff-engineer?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>What will the Staff Engineer role look like in 2027 and beyond</strong></a><strong>?</strong> What happens to the Staff engineer role when agents write more code? 
Actually, they could be more in demand than ever!</li></ul>]]></content:encoded></item><item><title><![CDATA[Is the FDE role becoming less desirable?]]></title><description><![CDATA[Job postings for Forward Deployed Engineers (FDEs) have surged, but many professionals don’t want the role because it’s more like solutions engineering than software development.]]></description><link>https://blog.pragmaticengineer.com/is-the-fde-role-becoming-less-desirable/</link><guid isPermaLink="false">69c5918c3f13830001776a97</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Fri, 27 Mar 2026 10:29:33 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-is-the-fde-role-becoming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the article below seven days ago. If you&#x2019;ve been forwarded this email, you can</em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em> subscribe here</em></a><em>.</em></p><p>An interesting trend highlighted <a href="https://www.wsj.com/cio-journal/the-hottest-job-in-tech-isnt-very-glamorous-dc29ab3e?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">by The Wall Street Journal</a>: companies want to hire for FDE roles, but devs are just not that interested:</p><blockquote>&#x201C;Job postings on Indeed grew more than 10-fold in 2025 compared with 2024. The number of public company transcripts mentioning the role jumped to 50 from eight over the same period, according to data from AlphaSense.<br><br>The only problem? 
Few engineers want the job, which has historically been seen as demanding, undesirable, and less prestigious than product-focused engineering roles.<br><br>&#x201C;Everyone wants them and there&#x2019;s only maybe 10% of the market that wants that role,&#x201D; said Patrick Kellenberger, president and chief operating officer at Betts Recruiting.&#x201D;</blockquote><p>Last summer, we covered <a href="https://newsletter.pragmaticengineer.com/p/forward-deployed-engineers?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">the rise of the FDE role</a>, and looked into what it&#x2019;s like. This is how I visualized what was then a very hot role:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-3.png" class="kg-image" alt loading="lazy" width="1280" height="798" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/03/image-3.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/03/image-3.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-3.png 1280w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">My 2025 visualization of the FDE role</em></i></figcaption></figure><p>At the companies where I interviewed FDE folks &#x2013; OpenAI and Ramp &#x2013; the role seemed to live up to this visualization. However, I&#x2019;ve since talked with two engineers who took FDE roles and were disappointed. 
This is how they saw it, in practice:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-4.png" class="kg-image" alt loading="lazy" width="1400" height="1094" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/03/image-4.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/03/image-4.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-4.png 1400w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Reality of the FDE role: less software engineering, and even less platform engineering</em></i></figcaption></figure><p>The role seems akin to a &#x201C;sales engineer&#x201D; where FDEs help close deals, or a solutions engineer (or even consultant), where FDEs deploy to a customer to build them a solution. They don&#x2019;t contribute back to the platform, and don&#x2019;t do much that&#x2019;s considered &#x201C;software engineering&#x201D; beyond integrating software which the product team built.</p><p>Some engineers figure out the nature of the role during the interview process and pass on it. Others take the job and later quit. Here&#x2019;s what I heard from a dev who accepted an FDE role at a company, but didn&#x2019;t find what they expected:</p><blockquote>&#x201C;This FDE job was a typical IT services mindset. The company wanted to use me more on the engagement lead side, and nothing on software development. It&#x2019;s not what I signed up for, and I didn&#x2019;t like the vibe and culture. 
I quit 4 weeks later.&#x201D;</blockquote><p>In today&#x2019;s job market, if there&#x2019;s high demand for a role which pays decently but attracts little interest from engineers, there&#x2019;s always a reason!</p><hr><p>Read the full issue of <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-is-the-fde-role-becoming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">last week&#x2019;s The Pulse</a>, or check out <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-is-github-still-best-for?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">this week&#x2019;s The Pulse</a>.</p><p>Catch up with recent The Pragmatic Engineer issues:</p><ul><li><a href="https://newsletter.pragmaticengineer.com/p/building-whatsapp-with-jean-lee?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>Building WhatsApp with Jean Lee</strong></a> (podcast): Jean Lee, engineer #19 at WhatsApp, on scaling the app with a tiny team, the Facebook acquisition, and what it reveals about the future of engineering.</li><li><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-what-will-the-staff-engineer?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>The Pulse: What will the Staff Engineer role look like in 2027 and beyond?</strong></a><strong> </strong>What happens to the Staff engineer role when agents write more code? 
Actually, they could be more in demand than ever!</li><li><a href="https://newsletter.pragmaticengineer.com/p/from-ides-to-ai-agents-with-steve?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>From IDEs to AI Agents with Steve Yegge (podcast):</strong></a> Steve Yegge on how AI is reshaping software engineering, the rise of &#x201C;vibe coding,&#x201D; and why developers must adapt to a rapidly changing craft.</li></ul>]]></content:encoded></item><item><title><![CDATA[The Pulse: Cloudflare rewrites Next.js as AI rewrites commercial open source]]></title><description><![CDATA[An engineer at Cloudflare rewrote most of Vercel’s Next.js in one week with AI agents. It looks like a sign of how AI will disrupt existing moats and business models. Analysis]]></description><link>https://blog.pragmaticengineer.com/the-pulse-cloudflare-rewrites-next-js-as-ai-rewrites-commercial-open-source/</link><guid isPermaLink="false">69a9c3bb4c4eb80001b25ced</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 05 Mar 2026 18:03:16 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>Pragmatic Engineer</em></a><em>. This issue is the </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-164-nextjs?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>entire The Pulse issue</em></a><em> from the past week, which paying subscribers received seven days ago. 
This piece generated </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-164-nextjs/comments?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>quite a few comments across subscribers</em></a><em>, and so I&apos;m sharing it more broadly, especially as it raises questions on what is defensible and what is not with open source.</em></p><p><em>If you&#x2019;ve been forwarded this email, you can</em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em> <u>subscribe here</u></em></a><em> to get issues like this in your inbox.</em></p><p>Today&#x2019;s issue of The Pulse focuses on a single event because it&#x2019;s a significant one with major potential ripple effects. On Tuesday, Cloudflare shocked the dev world by announcing that they have rewritten&#xA0;<a href="http://next.js/?ref=blog.pragmaticengineer.com">Next.js</a>&#xA0;in just one week, with a single developer who used only $1,100 in tokens:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image.png" class="kg-image" alt loading="lazy" width="1186" height="1342" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/03/image.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/03/image.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image.png 1186w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Cloudflare CTO Dane Knecht&#xA0;</em></i><a href="https://x.com/dok2001/status/2026386974580330830?s=20&amp;ref=blog.pragmaticengineer.com" rel><i><em class="italic" style="white-space: pre-wrap;">on X</em></i></a></figcaption></figure><p>There are several layers to dig into here:</p><ol><li><strong>The Next.js ecosystem: a 
recap</strong>. Close to half of React devs use Next.js, and the best place to deploy Next.js is on Vercel &#x2013; partly thanks to its proprietary build output.</li><li><strong>What Cloudflare did with Next.js</strong>. It replaced the build engine in Next.js with the more standard Vite, allowing Next.js apps to be easily deployed on Cloudflare.</li><li><strong>AI brings the impossible within reach</strong>. What would take years in engineering terms was executed in one week with some tokens.</li><li><strong>&#x201C;AI slop&#x201D; still an issue.</strong>&#xA0;Contrary to Cloudflare&#x2019;s claims, vinext is not production-ready, and will need plenty of cleanup and auditing to make it on par with Next.js.</li></ol><h2 id="1-the-nextjs-ecosystem-a-recap"><br>1. The Next.js ecosystem: a recap</h2><p>First, some background.&#xA0;<a href="https://nextjs.org/?ref=blog.pragmaticengineer.com">Next.js</a>&#xA0;is the most popular full-stack React framework and around half of all React devs use it, as per recent research such as the 2025 Stack Overflow developer survey. Next.js is an open source project, built and mostly maintained by Vercel, which is the preferred deployment target for Next.js applications for many reasons. One of them is that Next.js applications are built with Vercel&#x2019;s Turbopack build tool, and the output of a build is in a proprietary format. As Netlify engineer Eduardo Bou&#xE7;as&#xA0;<a href="https://eduardoboucas.com/posts/2025-03-25-you-should-know-this-before-choosing-nextjs/?ref=blog.pragmaticengineer.com">writes</a>:</p><blockquote>&#x201C;The output of a Next.js build has a proprietary and undocumented format that is used in Vercel deployments to provision the infrastructure needed to power the application.<br><br>This means that any hosting providers other than Vercel must build on top of undocumented APIs that can introduce unannounced breaking changes in minor or patch releases. 
(And they have)&#x201D;.</blockquote><p>Next.js is an interestingly built project, where everything is open source, and the best place to deploy a Next.js application is on Vercel, as it&#x2019;s optimized to run undocumented build artifacts the most efficiently. This is a smart strategy from Vercel which competitors will dislike, as any hosting provider would prefer Next.js to produce a standard build format. To do this, the build engine, Turbopack, would need to be replaced with something more standard.</p><p><strong>Let&#x2019;s talk about build tools for web development.&#xA0;</strong>According to the&#xA0;<a href="https://2025.stateofjs.com/en-US/libraries/?ref=blog.pragmaticengineer.com">State of JS 2025 survey</a>, the most popular in the web ecosystem are:</p><ol><li><a href="https://vite.dev/?ref=blog.pragmaticengineer.com"><strong>Vite</strong></a>: the most popular choice for new projects due to its speed and developer experience. Uses projects like&#xA0;<a href="https://esbuild.github.io/?ref=blog.pragmaticengineer.com">esbuild</a>&#xA0;and&#xA0;<a href="https://rollupjs.org/?ref=blog.pragmaticengineer.com">Rollup</a>&#xA0;under the hood</li><li><a href="https://webpack.js.org/?ref=blog.pragmaticengineer.com"><strong>Webpack</strong></a>: a legacy tool that&#x2019;s not very performant, but still widely deployed in older projects</li><li><a href="https://nextjs.org/docs/app/api-reference/turbopack?ref=blog.pragmaticengineer.com"><strong>Turbopack</strong></a>: Created by Vercel and optimized for larger&#xA0;<a href="http://next.js/?ref=blog.pragmaticengineer.com">Next.js</a>&#xA0;applications. Built in Rust and intended to be more performant</li><li><a href="https://bun.com/?ref=blog.pragmaticengineer.com"><strong>Bun</strong></a>: a relatively new, all-in-one runtime and bundler. 
Anthropic acquired the team&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/180722007/anthropic-acquires-javascript-runtime-bun?ref=blog.pragmaticengineer.com">in December</a>, and some Bun folks are now focused on improving Claude Code&#x2019;s performance.</li></ol><p>So, most of the web ecosystem uses Vite as a build tool; Next.js uses Turbopack, and the majority of React applications with a full-stack React framework use Next.js. Basically, devs outside of Next.js are likely to use Vite as their build tool, while those on Next.js are tied to Turbopack.</p><h2 id="2-what-cloudflare-did-with-nextjs"><br>2. What Cloudflare did with Next.js</h2><p>Here&#x2019;s a naive idea: what if Next.js used Vite to generate build outputs? In that case, build outputs would be standardized and would run equally well on any cloud provider, as there would be nothing undocumented or proprietary to Vercel.</p><p>And this is what Cloudflare did: replace Turbopack with Vite and call the new package &#x2018;vinext&#x2019;:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-1.png" class="kg-image" alt loading="lazy" width="1442" height="1024" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/03/image-1.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/03/image-1.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-1.png 1442w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Cloudflare replaced the Turbopack build dependency with Vite to create vinext</em></i></figcaption></figure><p>Buried midway in the announcement is how this project&#xA0;<a href="https://blog.cloudflare.com/vinext/?ref=blog.pragmaticengineer.com#status-experimental">is experimental</a>&#xA0;and not at all guaranteed 
to work okay: it&#x2019;s a &#x2018;use-at-own-risk&#x2019; project. Still, the mere fact of this development feels like an earthquake in the tech world because of&#xA0;<em>how</em>&#xA0;it was pulled off.</p><h2 id="3-ai-brings-the-impossible-within-reach"><br>3. AI brings the impossible within reach</h2><p>In a blog post announcing the project, Cloudflare claims only one engineer &#x201C;rebuilt&#x201D; the whole thing in a way that&#x2019;s trivial to deploy to Cloudflare&#x2019;s own infrastructure, and only cost $1,100 in tokens. From Cloudflare&#x2019;s&#xA0;<a href="https://blog.cloudflare.com/vinext/?ref=blog.pragmaticengineer.com">statement</a>:</p><blockquote>&#x201C;Last week, one engineer and an AI model rebuilt the most popular front-end framework from scratch. The result, vinext (pronounced &#x201C;vee-next&#x201D;), is a drop-in replacement for Next.js, built on Vite, that deploys to Cloudflare Workers with a single command. In early benchmarks, it builds production apps up to 4x faster and produces client bundles up to 57% smaller. And we already have customers running it in production.<br><br>The whole thing cost about $1,100 in tokens&#x201D;.</blockquote><p>What Cloudflare did:</p><ul><li>Took the Next.js public API</li><li>Reimplemented behaviour using Vite</li><li>Created build output whose behaviour matches the &#x201C;original&#x201D; Next.js implementation</li></ul><p>After 10 years, the core of Next has around 194,000 lines of code (LOC)**. 
Meanwhile,&#xA0;<a href="https://github.com/cloudflare/vinext?ref=blog.pragmaticengineer.com">vinext</a>&#xA0;is about 67,000 lines of code, which suggests a much leaner implementation: for example, vinext does not need to support legacy Next APIs, and vinext currently supports 94% of the Next.js API (and it&#x2019;s safe to assume they left complex edge cases in the remaining 6%).<br><br>** the Next.js repository is closer to 2M lines of code: 1M is bundled dependencies (e.g., React bundles, CSS build, etc.), tests are 308,000 LOC, and Turbopack is 311,000 LOC.</p><p><strong>Pre-AI, this reimplementation would have taken years of engineering time to complete.&#xA0;</strong>Doing what Cloudflare did was always possible<em>&#xA0;in theory</em>, but never seemed practical. I mean, why have a team of engineers spend potentially years on generating a standardized build output for Next.js apps? Even if they did, the dev community would have doubts about whether Cloudflare would maintain the project.</p><p>This is the thing with forking or rewriting open source projects: a major value proposition of commercial open source is knowing that a project will be&#xA0;<em>maintained</em>. Vercel has proved to be a reliable custodian of Next.js over the past 10 years. Without AI, it could be assumed that any new reimplementation would eventually run out of steam.</p><p><strong>Separately but relatedly, Cloudflare has now proved that rewriting&#xA0;<em>existing</em>&#xA0;software has become ~100x cheaper, thanks to AI, and this saving is likely to apply to maintenance, too.&#xA0;</strong>Considering how trivial it was to rebuild one of the more complex open source projects, this augurs well for it being trivial and much cheaper to maintain in the future. 
Potentially, Cloudflare no longer needs to budget an engineering team only for maintenance, if a single engineer could maintain the project, part-time!</p><p>Cloudflare had a project measured in engineering years, and completed it in&#xA0;<em>one engineering week</em>! It just took a single engineer using&#xA0;<a href="https://opencode.ai/?ref=blog.pragmaticengineer.com">OpenCode</a>&#xA0;(open source coding agent), Opus 4.5, and a bunch of tokens, then: &#x2018;<em>boom&#x2019;</em>,&#xA0;<em>vinext</em>&#xA0;was born.</p><h2 id="4-%E2%80%9Cai-slop%E2%80%9D-still-an-issue">4. &#x201C;AI slop&#x201D; still an issue</h2><p>There are questions about the quality of vinext, though.<strong>&#xA0;</strong>Vercel, naturally, is unhappy and hit out at the obvious weakness that vinext is unfit for production usage because it&#x2019;s insecure. Vercel CEO, Guillermo Rauch, did not miss a beat by tying Cloudflare&#x2019;s effort to the &#x201C;vibe coding&#x201D; stereotype of sloppy work executed with a lack of understanding:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-2.png" class="kg-image" alt loading="lazy" width="1194" height="794" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/03/image-2.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/03/image-2.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-2.png 1194w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Guillermo Rauch&#xA0;</em></i><a href="https://x.com/rauchg/status/2026864132423823499?s=20&amp;ref=blog.pragmaticengineer.com" rel><i><em class="italic" style="white-space: pre-wrap;">on X</em></i></a></figcaption></figure><p>Guillermo has a point: anyone who 
stopped reading&#xA0;<a href="https://blog.cloudflare.com/vinext/?ref=blog.pragmaticengineer.com">Cloudflare&#x2019;s launch announcement</a>&#xA0;after the first few sentences would assume it&#x2019;s production-ready, with the first paragraph of this announcement closing with:</p><p>&#x201C;And we already have customers running it in production.&#x201D;</p><p>However, Cloudflare doesn&#x2019;t&#xA0;<a href="https://blog.cloudflare.com/vinext/?ref=blog.pragmaticengineer.com#status-experimental">share</a>&#xA0;the rather crucial detail that &#x201C;running in production&#x201D; means that vinext has been deployed onto a beta site, until more than 1,000 words (around 2&#x2013;3 pages) into the announcement:</p><blockquote>&#x201C;We want to be clear: vinext is experimental. It&#x2019;s not even one week old, and it has not yet been battle-tested with any meaningful traffic at scale. (...)<br><br>We&#x2019;ve been working with National Design Studio, a team that&#x2019;s aiming to modernize every government interface,&#xA0;<strong>on one of their beta sites</strong>, CIO.gov.</blockquote><p>Oh. So, &#x201C;customers running it in production&#x201D; at Cloudflare apparently means &#x201C;customer running a beta site in production without meaningful traffic.&#x201D; This is a first from the infrastructure giant, which usually prides itself on accurate statements!</p><p>This detail was also absent when Cloudflare&#x2019;s CEO and CTO&#xA0;<a href="https://x.com/eastdakota/status/2026389179345916255?s=20&amp;ref=blog.pragmaticengineer.com">were boosting</a>&#xA0;vinext like it was a mature, battle-tested product. 
In that context, Vercel&#x2019;s raising of the issue of security vulnerabilities is more than fair game, in my view.</p><p>Still, all that doesn&#x2019;t alter the core learning from this project: that AI has the power to drastically reduce engineering time by up to ~100x and deliver&#xA0;<em>usable-enough</em>&#xA0;output, for relatively negligible financial cost.&#xA0;<em>Just keep in mind that security and reliability issues will probably take plenty of extra time and effort to address.</em></p><h2 id="5-new-attack-vector-on-commercial-open-source">5. New attack vector on commercial open source?</h2><p>If arch-rivalries exist in tech, then Cloudflare and Vercel are a prime example. Both are gunning to become the most popular platform for developers to deploy their code, and the CEOs are regularly seen in public taking shots at the other side. One such spat happened&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/160004343/ceos-scrap?ref=blog.pragmaticengineer.com">in March</a>, as covered at the time:</p><blockquote>&#x201C;Things kicked off on social media, with developers confused about the severity of the incident, and about why Next.js seemed silent, and also why Cloudflare sites were breaking due to its fix for the CVE causing its own issues. It was at that point that Cloudflare&#x2019;s CEO, Matthew Prince, entered the chat to accuse Vercel of&#xA0;<a href="https://x.com/rauchg/status/1903590962498326771?ref=blog.pragmaticengineer.com">not caring about security</a>:<br><br>Given the security incident was ongoing, this felt a bit &#x201C;below the belt&#x201D; by the Cloudflare chief. Criticizing rivals is fair game, but why not wait until the incident is over? 
The punch landed, and Vercel&#x2019;s CEO Guillermo Rauch is not someone to take it lying down, so he&#xA0;<a href="https://x.com/rauchg/status/1903590962498326771?ref=blog.pragmaticengineer.com">hit back</a>.<br><br>Cloudflare&#x2019;s CEO then responded with a cartoon&#xA0;<a href="https://x.com/eastdakota/status/1903690805576909227?ref=blog.pragmaticengineer.com">implying</a>&#xA0;that although Vercel is much larger than its competitor Netlify, Cloudflare is 100x bigger than both, and could stomp them into the ground at will.&#x201D;</blockquote><p>Serving the public interest wasn&#x2019;t why Cloudflare rewrote&#xA0;<a href="http://next.js/?ref=blog.pragmaticengineer.com">Next.js</a>: they did it because they want Next.js sites to be deployed onto Cloudflare, but doing so made little sense until now because Next.js produced bespoke build output optimized for Vercel&#x2019;s infrastructure. With this change, Cloudflare&#xA0;<a href="https://blog.cloudflare.com/vinext/?ref=blog.pragmaticengineer.com">claims</a>&#xA0;it provides&#xA0;<em>superior&#xA0;</em>performance when hosting Next.js apps, according to their own measurements.</p><p><em>I&#x2019;d just add that performance is important for developers, but other things matter, too. Cost, reliability, developer experience, and how much devs like a company, are all factors in choosing between vendors. 
Also, performance measurements from a vendor about its own service must be taken with a large pinch of salt.</em></p><p><strong>Zooming out from this episode, it seems that AI is bringing the value of existing commercial open source moats into question.&#xA0;</strong>Vercel carved out a clever open source strategy that helped turn its open source investment into business revenue:</p><ol><li>Build and maintain Next.js, delivering the best developer experience (DX).</li><li>Optimize Vercel to serve the specific (and undocumented) build output of Next.js.</li><li>Most developers onboarding to Next.js will decide to deploy on Vercel to get the most benefit, in terms of DX and performance.</li><li>&#x2026; repeat for years while the business becomes worth billions! (Vercel was&#xA0;<a href="https://startupwired.com/2025/10/01/vercel-raises-300-million-reaches-9-3-billion-valuation/?ref=blog.pragmaticengineer.com">valued</a>&#xA0;at $9B last October).</li></ol><p>Underpinning this success are some assumptions:</p><ol><li>Next.js will remain the #1 choice for developers to build React applications, thanks to ongoing investment.</li><li>It is expensive to rewrite Next.js to be deployable and performant on another cloud vendor.</li><li>Even if someone did #2, developers would be skeptical and not switch over.</li></ol><p>Vercel can invest in #1 to keep Next as best-in-class, while knowing that the risk of #2 occurring is minor. 
However, Cloudflare has now &#x201C;cloned&#x201D; Next, and can easily keep up with all changes in the future, and port them over to vinext.</p><p><strong>AI makes it trivial to &#x201C;piggyback&#x201D; off any commercial open source project, which is a massive problem for commercial open source startups.&#xA0;</strong>Vercel puts all the effort and investment into building and maintaining&#xA0;<a href="http://next.js/?ref=blog.pragmaticengineer.com">Next.js</a>, while Cloudflare enjoys the benefit of this hard work (the Next.js public API), which is now easily deployable to Cloudflare, and it can undercut Vercel on price. For all future Next.js changes, Cloudflare will just sync them to vinext, using AI!</p><p>WordPress had&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/149770356/2-open-source-business-model-struggles-wordpress?ref=blog.pragmaticengineer.com">a similar problem</a>, with WP Engine &#x201C;piggybacking&#x201D; off its work and undercutting its pricing in 2024. As I analyzed at the time:</p><blockquote>&#x201C;Free-riding on permissive open source is too tempting to pass on for other vendors. WP Engine uses a common loophole of contributing almost nothing in R&amp;D to WordPress, while selling it as a managed service. This means that they could either easily undercut the pricing of larger players like Automattic which do spend on WordPress&#x2019;s R&amp;D. Alternatively, a company like WP Engine could charge as much, or more, as Automattic, but be able to spend a lot more on marketing, while being similarly profitable. &#x201C;Saving&#x201D; on R&amp;D gives the &#x201C;free-riders&#x201D; plenty of options to grow their businesses: options not necessarily open to Automattic while they invest as much into R&amp;D as they do.<br><br>Commercial open source vendors pressure to end &#x201C;freeriding&#x201D;. 
Automattic is likely facing lower revenue growth, with customers choosing vendors like WP Engine which offer a similar service &#x2014; getting these customers either via a cheaper price or thanks to more marketing spend. This legal fight could be an effort to force WP Engine to stop eating Automattic&#x2019;s lunch, or perhaps get WP Engine to sell to Automattic, which would cement its leading status in managed Wordpress, while also boosting revenue by $400M a year &#x2013; according to its own figures&#x201D;.</blockquote><p>Vercel managed to avoid the &#x201C;free-riding&#x201D; problem with&#xA0;<a href="http://next.js/?ref=blog.pragmaticengineer.com">Next.js</a>, but that&#x2019;s no longer possible now that AI makes it trivial to rewrite.</p><h2 id="6-defense-or-offense"><br>6. Defense or offense?</h2><p>How should commercial open source companies respond to the threat that a competitor can easily rewrite the software behind the managed solutions which they sell as services?</p><p><strong>One obvious response is to make tests private, so that replication is harder for AI.&#xA0;</strong>One thing that made it so easy for Cloudflare to rewrite Next was the project&#x2019;s comprehensive test suite. From&#xA0;<a href="https://blog.cloudflare.com/vinext/?ref=blog.pragmaticengineer.com">their announcement<u>&#xA0;</u></a>(emphasis mine):</p><blockquote>&#x201C;We also want to acknowledge the Next.js team. They&#x2019;ve spent years building a framework that raised the bar for what React development could look like.&#xA0;<strong>The fact that their</strong>&#xA0;API surface is so well-documented and their&#xA0;<strong>test suite so comprehensive</strong>&#xA0;is a big part of what made this project possible.&#x201D;</blockquote><p>Database solution SQLite is famous for its incredible test suite. 
What some people don&#x2019;t know is that while core&#xA0;<a href="https://sqlite.org/?ref=blog.pragmaticengineer.com">SQLite</a>&#xA0;tests are open source, its most comprehensive test suite &#x2013;&#xA0;<a href="https://sqlite.org/testing.html?ref=blog.pragmaticengineer.com">TH3</a>&#xA0;&#x2013; is closed source. SQLite monetizes its advanced infrastructure as a&#xA0;<a href="https://sqlite.org/prosupport.html?ref=blog.pragmaticengineer.com">service</a>&#xA0;for purchase. This is a fair tradeoff: for most contributors, the basic open source tests work well enough. For enterprise users or customers who really care about correctness, it makes sense to purchase advanced testing services from the service&#x2019;s creator.</p><p>Open source canvas project, tldraw,&#xA0;<a href="https://github.com/tldraw/tldraw/issues/8082?ref=blog.pragmaticengineer.com">announced</a>&#xA0;it will relocate its test suite to a closed source repository; a move which makes plenty of sense. Here&#x2019;s commentary from Simon Willison:</p><blockquote>&#x201C;It&#x2019;s become very apparent over the past few months that a comprehensive test suite is enough to build a completely fresh implementation of any open source library from scratch, potentially in a different language.&#x201D;</blockquote><p>In the event, tldraw&#x2019;s announcement turned out&#xA0;<a href="https://github.com/tldraw/tldraw/issues/8082?ref=blog.pragmaticengineer.com#issuecomment-3964650501">to be a joke</a>, but who&#x2019;s laughing now? An open source project with excellent tests is an easy target for an AI agent to execute a full rewrite of it.</p><p><strong>Could new licenses be created for the AI era?&#xA0;</strong>Existing open source licenses were created on the assumption that humans read open source code, and humans modify it. Agents break that assumption.</p><p>Could we see new license types emerge to ban AI agents from modifying projects&#x2019; source code? 
It seems pretty far-fetched and hard to implement, but not beyond the realms of possibility.</p><p>AI agents are still very new, and are going mainstream in tech. Once they break into other industries, I wouldn&#x2019;t be surprised if legal frameworks are reworded to also apply to AI agents. If and when this happens, it would open the path for open source licenses to distinguish between agents and humans.</p><p><strong>What is a moat, if code can be trivially ported?&#xA0;</strong>A team operating a popular open source project can no longer assume the project is expensive to fork or to completely rewrite, meaning it makes sense to focus on other moats, such as:</p><ul><li><strong>Outstanding (paid) support.</strong>&#xA0;AI could make this much easier to deliver at a higher quality, if done right.</li><li><strong>Smaller open core, larger closed source part.&#xA0;</strong>&#x201C;Open core&#x201D; as a business model has been dominant for commercial open source: keep the core of the software open source, while advanced enterprise features are source available or closed source. I would expect more companies to move their additional services to closed source, not source available.</li><li><strong>In-person connection and community.</strong>&#xA0;Projects with a real-world community will form a sense of connection that goes beyond code. For example, it&#x2019;s hard to imagine vinext meetups popping up &#x2013; whereas there are many Next.js communities.</li><li><strong>Infrastructure and hardware remain a massive moat.&#xA0;</strong>In a world where software is trivial to copy, infrastructure remains a moat. Commercial open source might make most sense for players that own and operate infrastructure layers superior to those of their rivals, and that can offer lower cost, higher reliability, lower latency, higher performance, or a combination of these.</li></ul><h2 id="7-ai-world-reality"><br>7. 
AI-world reality</h2><p><strong>One of the best AI use cases is the full-on rewrite of a well-tested product.&#xA0;</strong>I estimate that AI sped up the creation of vinext by at least 100x, which is massive. But we don&#x2019;t really see efficiency boosts of anything like that with AI tools, in general. As Laura Tacho&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/189035949/1-data-vs-hype-how-orgs-actually-win-with-ai?ref=blog.pragmaticengineer.com">shared</a>&#xA0;at The Pragmatic Summit in San Francisco, the average self-reported efficiency &#x2018;AI gain&#x2019; seems to be circa 10%.</p><p>I suspect this vast chasm in efficiency boosts exists because AI is many times more efficient at &#x201C;no-brainer tasks&#x201D; where correctness can be verified with tests, than at tasks which are more open ended or involve more creativity.</p><p><strong>In general, tests are incredibly important for efficient AI usage.&#xA0;</strong>On The Pragmatic Engineer Podcast, Peter Steinberger stressed how important it is to &#x201C;close the loop&#x201D; in his developer flow by instructing the AI to test itself, and by ensuring the AI has tests to run that verify correctness.</p><p>Automated tests were always considered a best practice for creating maintainable code. Now, having a codebase with extensive tests is the baseline to make AI agents work productively for refactors, rewrites &#x2013; or even adding new features and verifying that things did not break!</p><p><strong>Vendors will start to deploy &#x201C;migration AI agents&#x201D; to move customers over to their own stacks.&#xA0;</strong>This got lost in Cloudflare&#x2019;s announcement, but it&#x2019;s&#xA0;<a href="https://github.com/cloudflare/vinext?ref=blog.pragmaticengineer.com">important</a>:</p><blockquote>vinext includes an Agent Skill that handles migration for you. It works with Claude Code, OpenCode, Cursor, Codex, and dozens of other AI coding tools. 
Install it, open your Next.js project, and tell the AI to migrate:<br><br><em>&gt; npx skills add cloudflare/vinext</em><br><br>Then open your Next.js project in any supported tool and say:<br><br><em>&gt; migrate this project to vinext</em><br><br>The skill handles compatibility checking, dependency installation, config generation, and dev server startup. It knows what vinext supports and will flag anything that needs manual attention.</blockquote><p>This is very clever from Cloudflare, and a true &#x201C;AI-native&#x201D; move. They have not only used AI to rewrite Next.js, but also built an &#x201C;AI plugin&#x201D; (a skill) to help customers migrate their existing codebases over to vinext &#x2013; and deploy on Cloudflare!</p><p>This move will surely be copied by other vendors, since migrations which are tedious for humans are much less effort with agents.</p><p><strong>AI is making the tech industry more ruthless when it comes to business practices.&#xA0;</strong>Laura Tacho said something interesting at The Pragmatic Summit:</p><blockquote>&#x201C;AI is an accelerator, it&#x2019;s a multiplier, and it is moving organizations in different directions.&#x201D;</blockquote><p>AI seems to be increasing both the ruthlessness of competition for customers and the speed at which it plays out. In one week, Cloudflare rebuilt Next.js, and it&#x2019;s attacking Vercel full-on: claiming its &#x201C;vibe coded&#x201D; alternative is more performant and production-ready, and burying more than 1,000 words into the launch announcement the crucial information that vinext is very much experimental.</p><p>I sense vendors are realizing that there&#x2019;s a limited amount of time in which to use AI to their advantage, and some will decide to use it like Cloudflare has.</p><p><strong>On the other hand, AI could be great news for non-commercial open source.&#xA0;</strong>AI presents a threat to commercial open source because it removes existing moats which make code hard to fully rewrite. 
However, beyond that, AI could help non-commercial open source to thrive:</p><ul><li>With AI, it&#x2019;s easy to fork an open source project and keep the fork in sync with the original.</li><li>It&#x2019;s trivial to instruct AI to rewrite an open source project in another language or framework.</li><li>&#x2026;and it&#x2019;s equally trivial for AI to add features to a fork.</li></ul><p>For these reasons, I believe there could be a lot more forks and rewrites to come, and more open source projects and code, in general.</p><h2 id="takeaways"><br>Takeaways</h2><p>Personally, I could not have imagined things changing this quickly in software. Rewriting Next.js in a single week, even to a version that is not quite there &#x2013; but mostly works? This was out of the question as recently as a few months ago.</p><p>Things changed around last December, when Opus 4.5 and GPT-5.2 came out and proved capable&#xA0;<a href="https://newsletter.pragmaticengineer.com/p/when-ai-writes-almost-all-code-what?ref=blog.pragmaticengineer.com">of writing most of the code</a>. What used to be expensive is now cheap &#x2013; like rewriting complete projects &#x2013; and we still need to learn what the &#x201C;new&#x201D; expensive parts of software engineering are.</p><p>All this is new territory for everyone. To succeed in the tech industry, you need to be able to capitalize upon change, as Cloudflare has clearly done in this case by making the most of an opportunity created by new technology. 
It&#x2019;s unclear how popular vinext will become, and how much of a moat Vercel has around the broader Next.js ecosystem, but I suspect that it&#x2019;d take more than a Next rewrite to make Cloudflare into a viable Next.js platform-as-a-service provider.</p>]]></content:encoded></item><item><title><![CDATA[I replaced a $120/year micro-SaaS in 20 minutes with LLM-generated code]]></title><description><![CDATA[ I used to pay $120/year for a SaaS that hasn’t added new features in four years, and didn’t fix its broken billing system for three years. Using an LLM, I managed to rewrite all the functionality I used to pay for in 20 minutes. Is this bad news for “write once, don’t update later” SaaS?]]></description><link>https://blog.pragmaticengineer.com/i-replaced-a-120-year-micro-saas-in-20-minutes-with-llm-generated-code/</link><guid isPermaLink="false">697ba13c7779050001e3775d</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 29 Jan 2026 18:41:45 GMT</pubDate><content:encoded><![CDATA[<p>I have been sceptical of the manifold claims that software-as-a-service (SaaS) will be killed by LLMs. The theory behind this idea is:</p><ol><li>SaaS is a pure software product. People who pay SaaS vendors do so because it&#x2019;s cheaper to buy this software than build it.</li><li>LLMs dramatically reduce the time and cost of building custom software.</li><li>Therefore, most SaaS vendors will go out of business because most companies/teams will prompt an LLM to write the software they need, such as for ticketing, meetings, customer relationship management, etc.</li></ol><p>The reason for my scepticism has been that SaaS such as HR software Workday is&#xA0;<em>more</em>&#xA0;than just software. 
Workday, for example, keeps up with compliance requirements (e.g., for holiday pay in different countries), guarantees correctness (e.g., payslips that comply with local regulations), and over time the software keeps up to date with changes in the external and internal environments.</p><p><strong>However, this week I had first-hand experience of how ridiculously easy it is now to replace SaaS with LLMs.&#xA0;</strong>On my website &#x2013;&#xA0;<a href="http://pragmaticengineer.com/?ref=blog.pragmaticengineer.com">pragmaticengineer.com</a>&#xA0;&#x2013; I have a testimonials section, which displays real LinkedIn and X posts about this publication. It cost $120/year for a small service called&#xA0;<a href="https://shoutout.io/?ref=blog.pragmaticengineer.com">Shoutout.io</a>, and looked like this:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/01/image.png" class="kg-image" alt loading="lazy" width="1390" height="1120" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/01/image.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/01/image.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/01/image.png 1390w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Testimonials, nicely collected and rendered by Shoutout</em></i></figcaption></figure><p>And this is the backend: nothing fancy, just a way to add, edit, reorganize, and delete testimonials.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/01/image-1.png" class="kg-image" alt loading="lazy" width="1456" height="922" 
srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/01/image-1.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/01/image-1.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/01/image-1.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Shoutout&#x2019;s admin interface</span></figcaption></figure><p>I was a customer for four years and logged in perhaps once a year. My latest login was to get an annual invoice for my expenses. Unfortunately, the billing section was broken, so I emailed support and they sent me a broken link instead of the invoice. This was frustrating: why pay for a SaaS with broken billing? I couldn&#x2019;t even tell what they would charge me next year.</p><p><strong>So I asked myself if I could rebuild my own use case with an LLM, and do it rapidly.&#xA0;</strong>My use case was much simpler than the SaaS itself:</p><ul><li>Display existing testimonials in a similar way</li><li>Make it easy to add new ones, e.g., store testimonials in some JSON format</li><li>Make it look good</li></ul><p>To my surprise, this whole effort from start to finish took exactly 20 minutes with Codex. 
The steps I took were straightforward enough:</p><ul><li>Asked Codex to make a plan for removing this third-party dependency and hosting all testimonials in my codebase (a GitHub repo, deployed onto Netlify)</li><li>Tweaked the plan: I pushed for a modular approach where testimonials are in a separate JSON file, and they get generated into HTML with a compile-time build step</li><li>Added this build step both locally and as a build trigger on Netlify</li><li>Tested the solution</li><li>Tweaked the UX and generated a schema</li><li>Deployed it</li></ul><p>The end result is visually the same as before, except I no longer have a third-party dependency rendering all of this!</p><h3 id="what-does-this-mean-for-saas-products-and-software-engineers">What does this mean for SaaS products and software engineers?</h3><p>What it means for software engineers:</p><ul><li><strong>Devs are (probably) a lot more comfortable using the command line for future updates than regular users are.&#xA0;</strong>To add a future testimonial, I&#x2019;ll need to turn to my AI agent to insert it in my codebase, and I&#x2019;ll then need to verify that it looks good. This is not a big deal for me, but it might be a dealbreaker for someone not comfortable with verifying the code output of an LLM.</li><li><strong>It&#x2019;s a lot faster for a dev to &#x201C;port&#x201D; a SaaS than for anyone else.&#xA0;</strong>I first told Codex to copy the UI and it got things wrong because it tried to use a flexbox model. I had to tell it that this UI layout was not what I wanted, and then make the decision on which framework to use for the UI layout. A non-developer could probably figure all this out, but it would take longer.</li><li><strong>Honestly, it&#x2019;s fun and interesting to rewrite a third-party feature. I recommend it.&#xA0;</strong>Part of why I took on this project is that I expected it to be an interesting challenge. 
I thought the effort would be more than what it was, and I&#x2019;ve learned more about how well these tools work. I also used Codex in order to experience it more.</li></ul><p>What this could mean for SaaS software:</p><ul><li><strong>Rebuilding a SaaS still feels much harder than rebuilding&#xA0;<em>your specific</em>&#xA0;use case.&#xA0;</strong>I did not &#x201C;rebuild&#x201D; Shoutout in any way. Shoutout has 10x or more features, like adding quotes from 10 different platforms, authentication, billing (which didn&#x2019;t work for me), and more.</li><li><strong>A SaaS that doesn&#x2019;t give ongoing value is at risk of being replaced by customers.&#xA0;</strong>Shoutout doesn&#x2019;t provide ongoing value after it displays my testimonials, and this static nature means it&#x2019;s easy to replace. In contrast, it would be harder to rebuild if I paid for the platform to stay compliant, provide analytics or alerting, and do other real-time things that helped my business.</li><li><strong>Buying and selling SaaS businesses could become less profitable.&#xA0;</strong>The original version of Shoutout that I signed up for in 2021 was built in 2020 by an independent developer. In 2022, this developer&#xA0;<a href="https://www.indiehackers.com/post/my-startup-shoutout-has-been-acquired-0350ae659c?ref=blog.pragmaticengineer.com">sold this micro-SaaS</a>&#xA0;to a product studio. Then, in 2025, Shoutout&#xA0;<a href="https://x.com/davidsonkyle/status/1942207611006542317?s=20&amp;ref=blog.pragmaticengineer.com">was sold</a>&#xA0;again to new developers. From my point of view, nothing changed except that the billing system broke. I assume the buyers of this SaaS figured that revenue could keep rising with zero investment. 
But perhaps at some point that ceases to be true when people get fed up with a broken product and quit &#x2013; especially when doing so is cheaper.</li></ul><p><strong>&#x201C;Broken windows&#x201D; not being fixed is less acceptable than it used to be.&#xA0;</strong>My journey away from Shoutout began with its billing system being broken. For example, below is what I saw when I went to my billing section to see the invoices:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/01/image-2.png" class="kg-image" alt loading="lazy" width="1220" height="428" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/01/image-2.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/01/image-2.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/01/image-2.png 1220w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">A trigger to quit: Billing had been broken since 2023 and was never fixed</span></figcaption></figure><p>As well as this, the customer support sent me a broken link in response to my email. That was enough for me to decide to replace this dependency, and I was surprised by how easy this was with an LLM and knowing what I wanted it to build.&#xA0;<em>By the time customer support sent me a working link two hours later, I had finished migrating off the SaaS.</em></p>]]></content:encoded></item><item><title><![CDATA[The grief when AI writes most of the code]]></title><description><![CDATA[When AI writes almost all code, what happens to software engineering? 
There is grief involved for us developers, that's for sure.]]></description><link>https://blog.pragmaticengineer.com/the-grief-when-ai-writes-most-of-the-code/</link><guid isPermaLink="false">695eab59af96490001536b9c</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Wed, 07 Jan 2026 18:53:57 GMT</pubDate><content:encoded><![CDATA[<p>I&#x2019;m coming to terms with the high probability that AI will write most of&#xA0;<em>my</em>&#xA0;code which I ship to prod, going forward. It already does it faster, and with similar results to if I&#x2019;d typed it out. For languages/frameworks I&#x2019;m less familiar with, it does a better job than me.</p><p>It feels like something valuable is being taken away, and suddenly. It took a&#xA0;<em>lot</em>&#xA0;of effort to get good at coding and to learn how to write code that works, to read and understand complex code, and to debug and fix when code doesn&#x2019;t work as it should. I still remember how daunting my first &#x201C;real&#x201D; programming class was at university (learning C), how lost I felt on my first job with a complex codebase, and how it took years of practice, learning from other devs, books, and blogs, to get better at the craft. Once you&#x2019;re pretty good, you have something that&#x2019;s valuable and easy to validate by writing code that works!</p><p>Some of my best memories of building software are about coding. Being &#x201C;locked in&#x201D; and balancing several ideas while typing them out, of being in the zone, then compiling the code, running it and seeing that &#x201C;<em>YES&#x201D;,</em>&#xA0;it worked as expected!</p><p>It&#x2019;s been a love-hate relationship, to be fair, based on the amount of focus needed to write complex code. 
Then there&#x2019;s all the conflicts that time estimates caused: time passes differently when you&#x2019;re locked in and working on a hard problem.</p><p>Now, all that looks like it will be history.</p><p>I wonder if I&#x2019;ll still get the same sense of satisfaction from the fact that writing complicated code is&#xA0;<em>hard</em>? Yes, AI is convenient, but there&#x2019;s also a loss.</p><p>Or perhaps with AI agents, being &#x201C;in the zone&#x201D; will shift to thinking about higher-level problems, while instructing more complex code to be written?</p><hr><p>This was a section from my analysis piece <a href="https://newsletter.pragmaticengineer.com/p/when-ai-writes-almost-all-code-what?ref=blog.pragmaticengineer.com" rel="noreferrer">When AI writes almost all code, what happens to software engineering?</a>. Read the full one <a href="https://newsletter.pragmaticengineer.com/p/when-ai-writes-almost-all-code-what?ref=blog.pragmaticengineer.com" rel="noreferrer">here</a>.</p>]]></content:encoded></item><item><title><![CDATA[The Pulse: Cloudflare’s latest outage proves dangers of global configuration changes (again)]]></title><description><![CDATA[Deja vu: a large Cloudflare outage caused by an instantly rolled-out global config change – two weeks after a similar problem]]></description><link>https://blog.pragmaticengineer.com/the-pulse-cloudflares-latest-outage/</link><guid isPermaLink="false">69443c5d272393000120055e</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 18 Dec 2025 17:44:21 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-156?ref=blog.pragmaticengineer.com" rel="noreferrer"><em><u>last week&#x2019;s The Pulse</u></em></a><em> issue. 
Full subscribers received the below article seven days ago. If you&#x2019;ve been forwarded this email, you can</em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em> <u>subscribe here</u></em></a><em>.</em></p><p>A mere two weeks after <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Cloudflare suffered a major outage</a> and took down half the internet, the same thing has happened again. Last Friday, 5th December, thousands of sites went down or partially down once more, in a global Cloudflare outage lasting 25 minutes.</p><p>As per last time, Cloudflare was speedy to share <a href="https://blog.cloudflare.com/5-december-2025-outage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">a full postmortem</a> on the same day. It estimated that 28% of Cloudflare&#x2019;s HTTP traffic was impacted. The cause of this latest outage was Cloudflare making a seemingly innocent &#x2013; but <em>global</em> &#x2013; configuration change that went on to take out a good portion of Cloudflare, <em>globally</em>, until being reverted. Here&#x2019;s what happened:</p><ul><li>Cloudflare was rolling out a fix for a nasty React security vulnerability</li><li>The fix caused an error in an internal testing tool</li><li>The Cloudflare team disabled the testing tool with a global killswitch</li><li>As this global configuration change was made, the killswitch unexpectedly caused a bug that resulted in HTTP 500 errors across Cloudflare&#x2019;s network</li></ul><p><strong>In this latest outage, Cloudflare was burnt by yet another global configuration change. </strong>The previous outage <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">in November</a> happened thanks to a global database permissions change. 
In the postmortem of that incident, the Cloudflare team closed with this action item:</p><blockquote>&#x201C;Hardening ingestion of Cloudflare-generated configuration files in the same way we would for user-generated input&#x201D;</blockquote><p>This change would make it so that Cloudflare&#x2019;s configuration files do not propagate immediately to the full network, as they still do now. But making <em>all</em> global configuration files have staged rollouts is a large implementation that could take months. Evidently, there wasn&#x2019;t time to implement it yet, and it has come back to bite Cloudflare.</p><p>Unfortunately for Cloudflare, customers are likely to find it unacceptable to suffer a second outage with similar causes to one only weeks earlier. If Cloudflare proves unreliable, customers should plan to onboard to <em>backup</em> CDNs at the very least, and a backup CDN vendor will do its best to convince new customers to use it as the primary CDN.</p><p>Cloudflare&#x2019;s value-add rests on rock-solid reliability without customers needing to budget for a backup CDN. Yes, publishing postmortems on the same day as an outage occurs helps restore trust, but that trust will crumble anyway with repeated large outages.</p><p><strong>To be fair, the company is doubling down on implementing staged configuration rollouts. </strong>In its postmortem, Cloudflare is its own biggest critic. CTO Dane Knecht <a href="https://blog.cloudflare.com/5-december-2025-outage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">reflected</a>:</p><blockquote>&#x201C;[Global configuration changes rolling out globally] remains our first priority across the organization. 
In particular, the projects outlined below should help contain the impact of these kinds of changes:<br><strong>Enhanced Rollouts &amp; Versioning:</strong> Similar to how we slowly deploy software with strict health validation, data used for rapid threat response and general configuration needs to have the same safety and blast mitigation features. This includes health validation and quick rollback capabilities among other things.<br><strong>Streamlined break glass capabilities: </strong>Ensure that critical operations can still be achieved in the face of additional types of failures. This applies to internal services as well as all standard methods of interaction with the Cloudflare control plane used by all Cloudflare customers.<br><strong>&#x201C;Fail-Open&#x201D; Error Handling: </strong>As part of the resilience effort, we are replacing the incorrectly applied hard-fail logic across all critical Cloudflare data-plane components. If a configuration file is corrupt or out-of-range (e.g., exceeding feature caps), the system will log the error and default to a known-good state or pass traffic without scoring, rather than dropping requests. Some services will likely give the customer the option to fail open or closed in certain scenarios. This will include drift-prevention capabilities to ensure this is enforced continuously.<br>These kinds of incidents, and how closely they are clustered together, are not acceptable for a network like ours&#x201D;.</blockquote><h3 id="global-configuration-errors-often-trigger-large-outages">Global configuration errors often trigger large outages</h3><p>There&#x2019;s a pattern of implicit or explicit global configuration errors causing large outages, and some of the biggest ones in recent years were caused by a single change being rolled out to a whole network of machines:</p><ul><li><strong>DNS and DNS-related systems like BGP:</strong> DNS changes are global by default, so it&#x2019;s no wonder that DNS changes can cause global outages. 
Meta&#x2019;s <a href="https://en.wikipedia.org/wiki/2021_Facebook_outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">7-hour outage in 2021</a> was related to DNS changes (more specifically, Border Gateway Protocol changes). Meanwhile, the AWS outage in October <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-aws-takes-down-a-good-part?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">started with</a> the internal DNS system.</li><li><strong>OS updates happening at the same time, globally: </strong>Datadog&#x2019;s <a href="https://newsletter.pragmaticengineer.com/p/inside-the-datadog-outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">2023 outage</a> cost the company $5M and was caused by Datadog&#x2019;s Ubuntu machines executing an OS update within the same time window, globally. It caused issues with networking, and it didn&#x2019;t help that Datadog ran its infra on 3 different cloud providers across 3 networks. The same kind of Ubuntu update also <a href="https://newsletter.pragmaticengineer.com/p/why-reliability-is-hard-at-scale?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">caused a global outage</a> for Heroku in 2024.</li></ul><p><strong>Globally replicating configs: </strong><a href="https://newsletter.pragmaticengineer.com/i/168964142/google-cloud-globally-replicating-a-config-triggers-worldwide-outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">in 2024</a>, a configuration policy change at Google Cloud was rolled out globally and crashed every Spanner database node straight away. 
As Google concluded in <a href="https://newsletter.pragmaticengineer.com/i/168964142/google-cloud-globally-replicating-a-config-triggers-worldwide-outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">its postmortem</a>: &#x201C;Given the global nature of quota management, this metadata was replicated globally within seconds&#x201D;.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/12/image.png" class="kg-image" alt loading="lazy" width="1456" height="970" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/12/image.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/12/image.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/12/image.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Step 2 &#x2013; replicating a configuration file globally across GCP &#x2013; </em></i><a href="https://newsletter.pragmaticengineer.com/i/168964142/google-cloud-globally-replicating-a-config-triggers-worldwide-outage?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">caused a global outage</em></i></a><i><em class="italic" style="white-space: pre-wrap;"> in 2024</em></i></figcaption></figure><p>Implementing gradual rollouts for <em>all</em> configuration files is a <em>lot</em> of work. It&#x2019;s also invisible labor: when done well, its benefits are undetectable, showing up only as an absence of incidents thanks to better infrastructure!</p><p><strong>The largest systems in the world will likely have to implement safer ways to roll out configs &#x2013; but not everybody needs to. 
</strong>Staged configuration rollout doesn&#x2019;t make much sense for smaller companies and products because this infra work slows down product development.</p><p>It doesn&#x2019;t just slow down building, but every deployment, too, and this friction is designed to make everything slower. As such, staged rollouts don&#x2019;t make much sense unless the stability of mature systems is more important than fast iteration.</p><p>Software engineering is a field where tradeoffs are a fact of life, and universal solutions don&#x2019;t exist. The approach which worked for a system with 1/100th of the load and users a year ago may not make sense today.</p><p><em>This was one out of the four topics covered in this week&#x2019;s The Pulse. </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-156?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>The full edition</em></a><em> additionally covers:</em></p><ol><li><strong>Industry Pulse.&#xA0;</strong>Poor capacity planning at AWS, Meta moves to a &#x201C;closed AI&#x201D; approach, a looming RAM shortage, early-stage startups hiring slower than before, how long it takes to earn $600K at Amazon and Meta, Apple loses execs to Meta, and more</li><li><strong>How the engineering team at Oxide uses LLMs.&#xA0;</strong>They find LLMs great for reading documents and lightweight research, mixed for coding and code review, and a poor choice for writing documents &#x2013; or any kind of writing, really!</li><li><strong>Linux officially supports Rust in the kernel.&#xA0;</strong>Rust is now a first-class language inside the Linux kernel, eight months after a Linux Foundation Fellow&#xA0;<a href="https://newsletter.pragmaticengineer.com/p/how-linux-is-built-with-greg-kroah?ref=blog.pragmaticengineer.com">predicted</a>&#xA0;more support for Rust. 
A summary of the pros and cons of Rust support for Linux</li></ol><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-156?ref=blog.pragmaticengineer.com" rel="noreferrer"><strong>Read the full The Pulse issue</strong></a><strong>.</strong></p>]]></content:encoded></item><item><title><![CDATA[The Pulse: Could a 5-day RTO be around the corner for Big Tech?]]></title><description><![CDATA[From next February, workers at Instagram must be in the office, five days a week. This makes Meta the second tech giant after Amazon to mandate a 5-day RTO. Will more big companies do the same?]]></description><link>https://blog.pragmaticengineer.com/the-pulse-could-a-5-day-rto-be-around-the-corner-for-big-tech/</link><guid isPermaLink="false">693b1247dd0e8a0001c79f46</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Sat, 13 Dec 2025 15:21:25 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-155?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the below article seven days ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><hr><p>A year ago, Amazon became the first tech giant to bring staff back into the office for the full five days per week. 
Back then, I&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/149104874/what-does-amazons-day-rto-mean-for-tech?ref=blog.pragmaticengineer.com">analyzed</a>&#xA0;the reasons for the change, and whether other workplaces would follow suit by dropping the widespread hybrid policy of 2-3 days/week in the office.</p><p>Now, Meta employees in the Instagram division have become the latest subjects of a full return to the office, following an announcement by the social media platform this week.</p><h3 id="instagram%E2%80%99s-5-day-return-to-office">Instagram&#x2019;s 5-day return to office</h3><p>Instagram employees&#xA0;<a href="https://sources.news/p/instagrams-return-to-office-mandate?ref=blog.pragmaticengineer.com">received the unexpected email on Monday</a>, reports fellow Substacker, Alex Heath, who acquired a copy of the message. It was sent internally by Instagram CEO Adam Mosseri, who wrote:</p><blockquote>&#x201C;<strong>1. Back to the office:</strong>&#xA0;I believe that we are more creative and collaborative when we are together in-person. (...)<br><br><strong>2. Fewer meetings:</strong>&#xA0;We all spend too much time in meetings that are not effective, and it&#x2019;s slowing us down. Every six months, we&#x2019;ll cancel all recurring meetings and only re-add the ones that are absolutely necessary (...)<br><br><strong>3. More demos, less decks:</strong>&#xA0;Most product overviews should be prototypes instead of decks.<br><br><strong>4. 
Faster decision-making:</strong>&#xA0;We&#x2019;re going to have a more formalized unblocking process with DRIs, and I&#x2019;ll be at the priorities progress unblocking meeting every week.&#x201D;</blockquote><p>This decision by Meta affects around a quarter of company staff, and it&#x2019;s hard to imagine other divisions not following Instagram&#x2019;s lead; after all, everything in Mosseri&#x2019;s memo likely applies across the business.</p><p>Five years ago, CEO Mark Zuckerberg predicted 50% of Meta staff would work remotely by now, which didn&#x2019;t happen. Indeed, with Instagram&#x2019;s new 5-day RTO, I&#x2019;d be surprised if 5% of Meta folks work remotely in two years&#x2019; time.</p><p><strong>The reason for Insta&#x2019;s RTO seems rooted in the leadership&#x2019;s belief that in-office is more productive,&#xA0;</strong>as indicated by the top bullet point of Mosseri&#x2019;s message. That message in full:</p><p>&#x201C;I believe that we are more creative and collaborative when we are together in-person. I felt this pre-COVID and I feel it any time I go to our New York office where the in-person culture is strong.</p><p>Starting February 2, I&#x2019;m asking everyone in my rollup based in a US office with assigned desks to come back full time (five days a week). The specifics:</p><ul><li>You&#x2019;ll still have the flexibility to work from home when you need to, since I recognize there will be times you won&#x2019;t be able to come into the office. I trust you all to use your best judgment in figuring out how to adapt to this schedule.</li><li>In the NY office, we won&#x2019;t expect you to come back full time until we&#x2019;ve alleviated the space constraints. We&#x2019;ll share more once we have a better sense of timeline.</li><li>In MPK [Menlo Park, the HQ], we&#x2019;ll move from MPK21 to MPK22 on January 26 so everyone has an assigned desk. 
We&#x2019;re also offering the option to transfer from the MPK to SF office for those people whose commute would be the same or better with that change. We&#x2019;ll reach out directly to those people with more info.</li><li>XFN [cross-functional] partners will continue to follow their own org norms.</li><li>There is no change for employees who are currently remote&#x201D;.</li></ul><p>From what I&#x2019;ve seen of Mosseri from afar, he seems like a pretty straight shooter. It&#x2019;s clear that he feels in-office creates more energy, and in Mosseri&#x2019;s defense, I hear similar from many startup founders and leaders who say remote work causes a bunch of headaches: it&#x2019;s harder to spot motivational problems and performance issues, information travels more slowly, and rallying teams is harder.</p><p><strong>There&#x2019;s no doubt that running a full-remote company is a lot of effort.&#xA0;</strong>There&#x2019;s often-overlooked labor involved in hiring, onboarding, performance management, team celebrations, and even company-wide meetings &#x2013; none of it is easy.</p><p>Linear is a full-remote company with nearly 50 people working there, which&#xA0;<a href="https://linear.app/now/designing-remote-work-at-linear?ref=blog.pragmaticengineer.com">recently published details about how it operates</a>. They&#x2019;re introducing the concept of &#x201C;coworking hubs&#x201D;, flying in teams for in-person events, and holding regular off-sites, while being careful to hire people who fit the culture.</p><p><strong>My feeling is that remote work policies at tech companies are going to become questions of their leaders&#x2019; preferences.&#xA0;</strong>Many devs prefer remote work: there&#x2019;s fewer interruptions, more deep focus, and less commuting. 
Most of us would probably be just as productive at home &#x2013; and probably more so &#x2013; as we are when being interrupted in-office.</p><p>Leaders who prefer full-remote can cite flexibility and easier hiring from a larger pool of candidates as clear benefits. Meanwhile, those most comfortable with in-person will always have enough reasons to justify a 5-day RTO, along the lines of Mosseri&#x2019;s reasoning. Advocates of hybrid setups cite balancing of focus time and efficiency.</p><p>In today&#x2019;s job market, any company that pays close to the top of the market can probably get away with five-days-a-week RTO. Meta is in this space, and although I&#x2019;m sure plenty of devs will dislike the change, the alternative is to go out on the job market, accept a pay cut to join a new company, and start rebuilding your internal network.</p><p>Since we&#x2019;re in the&#xA0;<a href="https://newsletter.pragmaticengineer.com/p/state-of-the-tech-market-in-2025?ref=blog.pragmaticengineer.com">midst of a weird job market</a>, switching jobs is more difficult than before, when the job market was very hot. In this respect, Instagram has external conditions on its side. For devs at Meta, one upside is that Big Tech experience&#xA0;<a href="https://newsletter.pragmaticengineer.com/p/tech-jobs-market-2025-part-3?ref=blog.pragmaticengineer.com">opens more doors</a>, even in this tough job market.</p><p>One caveat is that a 5-day RTO is unlikely in places where it&#x2019;s hard to hire the right people. So, AI engineers and those working on AI products should be pretty safe, for instance, because those roles are&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/172584839/ai-engineering-trends?ref=blog.pragmaticengineer.com">incredibly in demand</a>, as indicated by the&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/165280420/new-trend-higher-base-salaries-for-ai-engineers?ref=blog.pragmaticengineer.com">trend of higher base salaries for AI engineers</a>. 
Based on that, few companies should want to push those workers to quit to join competitors.</p><p></p><p><em>Many subscribers expense this newsletter to their learning and development budget. If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/" rel="noopener noreferrer nofollow"><em> an email you could send to your manager</em></a><em>.</em></p>]]></content:encoded></item><item><title><![CDATA[Downdetector and the real cost of no upstream dependencies]]></title><description><![CDATA[During the Cloudflare outage, Downdetector was also unavailable. I got details from the team about why they have a hard dependency on Cloudflare, and why that won’t change anytime soon.]]></description><link>https://blog.pragmaticengineer.com/downdetector-and-the-real-cost-of-no-upstream-dependencies/</link><guid isPermaLink="false">6932a20b097ffa00013da35c</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Fri, 05 Dec 2025 09:14:50 GMT</pubDate><content:encoded><![CDATA[<p><em>The below is one out of five topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>The Pulse #154.</em></a><em> Full subscribers received the below article two weeks ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em><u>subscribe here</u></em></a><em>.</em></p><p><em>Many subscribers expense The Pragmatic Engineer Newsletter to their learning and development budget. 
If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/"><em><u> an email you could send to your manager</u></em></a><em>.</em></p><hr><p>One amusing detail of the <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noreferrer">November 2025 Cloudflare outage</a> is that the realtime outage and monitoring service, Downdetector, went down, revealing a key dependency on Cloudflare. At first, this looks odd; after all, Downdetector is about monitoring uptime, so why would it take on a key dependency like Cloudflare if it means this can happen?</p><p><strong>Downdetector was built multi-region and multi-cloud,</strong>&#xA0;which<strong>&#xA0;</strong>I confirmed by talking with Senior Director of Engineering,&#xA0;<a href="https://x.com/damndhruv?ref=blog.pragmaticengineer.com">Dhruv Arora</a>, at Ookla, the company behind Downdetector. Multi-cloud resilience makes little sense for most products, but Downdetector was built to detect cloud provider outages, as well. And for this, they needed to be multi-cloud!</p><p>Still, Downdetector uses Cloudflare for DNS, Content Delivery (CDN), and Bot Protection. So, why would it take on this one key dependency, as opposed to hosting everything on its own servers?</p><p><strong>A CDN has advantages that are hard to ignore,&#xA0;</strong>such as:</p><ul><li>Drastically lower bandwidth costs &#x2013; assets cached on the CDN are much faster</li><li>Faster load times because assets on a CDN are served from Edge nodes nearer users</li><li>Protection from sudden traffic spikes, as would be common for Downdetector, especially during outages! 
Without a CDN, those spikes could overload their services</li><li>DDoS protection against bad actors trying to take the site offline with a distributed denial of service attack</li><li>Reduced infrastructure requirements, as Downdetector can run on fewer servers</li></ul><p>Downdetector&#x2019;s usage patterns reflect that it&#x2019;s a service very heavily used by consumers whom the business doesn&#x2019;t really monetize (Downdetector is free to use.) So, Downdetector could get rid of Cloudflare, but costs would surge, the site would become slower to load, and revenue wouldn&#x2019;t change.</p><p>In the end, Downdetector&#x2019;s dependence on Cloudflare could be a pragmatic choice based on the business model, given that removing its upstream dependency on Cloudflare could get very expensive!</p><p>Dhruv confirmed this and shared more about the design choices at Downdetector:</p><blockquote>&#x201C;<strong>Building redundancy at the DNS &amp; CDN layers would require enormous overhead.</strong>&#xA0;This is especially true as Cloudflare&#x2019;s Bot Protection is world-class, and building similar functionality would be a lot of effort. There are hyperscalers [cloud providers] that have this kind of redundancy built in. We will look into what we can do, but with a team size in the double digits, building up a core piece of infra like this is a pretty tall order: not just for us, but for any mid-sized team.<br><br>We&#x2019;ve learned that there are more things that we can improve, for the future. For example, during the outage, the Cloudflare control plane was down, but their API wasn&#x2019;t. So, us having more Infrastructure as Code could have helped bring back Downdetector sooner.<br><br>On our end, we also noticed that the outage wasn&#x2019;t global, so we were able to shift traffic around and reduce the impact.<br><br>One more interesting detail: Cloudflare&#x2019;s Bot Protection went haywire during the outage, and started to block legitimate traffic. 
So, our team had to turn that off temporarily&#x201D;.</blockquote><p>Thanks very much to Dhruv and the Downdetector team for sharing details.</p>]]></content:encoded></item><item><title><![CDATA[A startup in Mongolia translated my book]]></title><description><![CDATA[A 30-person startup called Nasha Tech translated The Software Engineer's Guidebook for the benefit of their company and the Mongolian tech ecosystem.]]></description><link>https://blog.pragmaticengineer.com/traveling-to-mongolia/</link><guid isPermaLink="false">69206cafc3b7150001d419bf</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Fri, 21 Nov 2025 13:47:17 GMT</pubDate><content:encoded><![CDATA[<p>I published <a href="https://www.engguidebook.com/?ref=blog.pragmaticengineer.com" rel="noreferrer">The Software Engineer&apos;s Guidebook</a> two years ago. <em> I shared more details on how I self-published the book, and the learnings from publishing </em><a href="https://newsletter.pragmaticengineer.com/p/the-software-engineers-guidebook?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>in this post.</em></a></p><p>An unexpected highlight of publishing the book was ending up in Mongolia in June of this year, at a small-but-mighty startup called <a href="https://nashatech.com/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Nasha Tech</a>. This was because the startup translated my book into Mongolian. 
Here&apos;s the completed book:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/Screenshot-2025-11-21-at-15.34.01.png" class="kg-image" alt loading="lazy" width="1078" height="1292" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/Screenshot-2025-11-21-at-15.34.01.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/Screenshot-2025-11-21-at-15.34.01.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/Screenshot-2025-11-21-at-15.34.01.png 1078w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The Software Engineer&apos;s Guidebook, in Mongolian. You can </span><a href="https://internom.mn/%D0%B1%D0%B0%D1%80%D0%B0%D0%B0/9789919053185-%D1%81%D0%BE%D1%84%D1%82%D0%B2%D1%8D%D0%B9%D1%80-%D0%B8%D0%BD%D0%B6%D0%B5%D0%BD%D0%B5%D1%80%D0%B8%D0%B9%D0%BD-%D1%85%D3%A9%D1%82%D3%A9%D1%87-%D0%BD%D0%BE%D0%BC?ref=blog.pragmaticengineer.com" rel="noreferrer"><span style="white-space: pre-wrap;">buy this translation here</span></a></figcaption></figure><p>Here&#x2019;s what happened:</p><p>A little over a year ago, a small startup from Mongolia reached out, asking if they could translate the book. I was skeptical it would happen because the unit economics appeared pretty unfavorable. Mongolia&#x2019;s population is 3.5 million; much smaller than other countries where professional publishers had offered to do a translation (Taiwan: 23M, South Korea: 51M, Germany: 84M, Japan: 122M, China: 1.43B people).</p><p>But I agreed to the initiative, and expected to hear nothing back. To my surprise, nine months later the translation was ready, and the startup printed 500 copies on the first run. 
They invited me to a book signing in the capital city of Ulaanbaatar, and soon I was on my way to meet the team, and to understand why a small tech company translated my book!</p><h3 id="japanese-startup-vibes-in-mongolia">Japanese startup vibes in Mongolia</h3><p>The startup behind the translation is called <a href="https://nashatech.com/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Nasha Tech</a>; a mix of a startup and a digital agency. Founded in 2018, its main business has been agency work, mainly for companies in Japan. They are a group of 30 people, mostly software engineers.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image-1.png" class="kg-image" alt loading="lazy" width="1086" height="1264" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/image-1.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/image-1.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image-1.png 1086w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Nasha Tech&#x2019;s offices in Ulaanbaatar, Mongolia</span></figcaption></figure><p>Their offices resembled a mansion more than a typical workplace, and everyone takes their shoes off when arriving at work and switches to &#x201C;office slippers&#x201D;. I encountered the same vibe later <a href="https://newsletter.pragmaticengineer.com/i/177384640/cursor-push-for-release?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">at Cursor&#x2019;s headquarters in San Francisco</a>, in the US.</p><p>Nasha Tech found a niche of working for Japanese companies thanks to one of its cofounders studying in Japan, and building up connections while there. 
Interestingly, another cofounder later moved to Silicon Valley, and advises the company from afar.</p><p><strong>The business builds the &#x201C;Uber Eats of Mongolia&#x201D;. </strong>Outside of working as an agency, Nasha Tech builds its own products. The most notable is called TokTok, the &#x201C;UberEats of Mongolia&#x201D;, which is the leading food delivery app in the capital city. The only difference between TokTok and other food delivery apps is scale: the local market is smaller than in some other cities. At a few thousand orders per day, it might not be worthwhile for an international player like Uber or Deliveroo to enter the market.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image-2.png" class="kg-image" alt loading="lazy" width="1456" height="646" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/image-2.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/image-2.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image-2.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">The </em></i><a href="https://www.toktok.mn/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">TokTok</em></i></a><i><em class="italic" style="white-space: pre-wrap;"> app: a customer base of 800K, 500 restaurants, and 400 delivery riders</em></i></figcaption></figure><p>The tech stack Nasha Tech typically uses:</p><ul><li>Frontend: React / Next, Vue / Nuxt, TypeScript, Electron, Tailwind, Element UI</li><li>Backend and API: NodeJS (Express, Hono, Deno, NestJS), Python (FastAPI, Flask), Ruby on Rails, PHP (Laravel), GraphQL, Socket, 
Recoil</li><li>Mobile: Flutter, React Native, Fastlane</li><li>Infra: AWS, GCP, Docker, Kubernetes, Terraform</li><li>AI &amp; ML: GCP Vertex, AWS Bedrock, Elasticsearch, LangChain, Langfuse</li></ul><p>AI tools are very much widespread, and today the team uses Cursor, GitHub Copilot, Claude Code, OpenAI Codex, and Junie by Jetbrains.</p><p><strong>I detected very few differences between Nasha Tech and other &#x201C;typical&#x201D; startups I&#x2019;ve visited, in terms of the vibe and tech stack. </strong>Devs working on TokTok were very passionate about how to improve the app and reduce the tech debt accumulated by prioritizing the launch. A difference for me was the language and target market: the main language in the office is, obviously, Mongolian, and the products they build like TokTok also target the Mongolian market, or the Japanese one when working with clients.</p><p>One thing I learned was that awareness about the latest tools has no borders: back in June, a dev at Nasha Tech was already telling me that Claude Code was their daily driver, even though the tool had been released for barely a month at that point!</p><h3 id="why-translate-the-book-into-mongolian">Why translate the book into Mongolian?</h3><p>Nasha Tech was the only non-book publisher to express interest in translating the book. But why did they do it?</p><p>I was told the idea came from software engineer <a href="https://x.com/ssuuribaatar?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Suuribaatar Sainjargal</a>, who bought and enjoyed the English-language version. He <a href="https://x.com/GergelyOrosz/status/1937160382600343964?s=20&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">suggested</a> translating the book so that everyone at the company could read it, not only those fluent in English.</p><p>Nasha Tech actually had some in-house experience of translation. 
A year earlier, in 2024, the company translated Matt Mochary&#x2019;s <a href="https://www.amazon.com/Great-CEO-Within-Tactical-Building-ebook/dp/B07ZLGQZYC?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">The Great CEO Within</a> as a way to uplevel their leadership team, and to help the broader Mongolian tech ecosystem.</p><p>Also, the company&#x2019;s General Manager, <a href="https://www.linkedin.com/in/battsengel/?originalSubdomain=mn&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Batutsengel Davaa</a>, happened to have been involved in translating more than 10 books in a previous role. He took the lead in organizing this work, and here&#x2019;s how the timelines played out:</p><ul><li>Professional translator: 3 months</li><li>Technical editor revising the draft translation: 1 month</li><li>Technical editing #2 by a Support Engineer in Japan: 2 months</li><li>Technical revision: 15 engineers at Nasha Tech revised the book, with a &#x201C;divide and conquer&#x201D; approach: 2 months</li><li>Final edit and print: 1 month</li></ul><p>This was a real team effort. Somehow, this startup managed to produce a high-quality translation in around the same time as it took professional book publishers in my part of the world to do the same!</p><p>A secondary goal that Nasha Tech had was to advance the tech ecosystem in Mongolia. There&#x2019;s understandably high demand for books in the mother tongue; I observed a number of book stands selling these books, and book fairs are also popular. 
The translation of my book has been selling well, and you can <a href="https://internom.mn/%D0%B1%D0%B0%D1%80%D0%B0%D0%B0/9789919053185-%D1%81%D0%BE%D1%84%D1%82%D0%B2%D1%8D%D0%B9%D1%80-%D0%B8%D0%BD%D0%B6%D0%B5%D0%BD%D0%B5%D1%80%D0%B8%D0%B9%D0%BD-%D1%85%D3%A9%D1%82%D3%A9%D1%87-%D0%BD%D0%BE%D0%BC?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">buy the book</a> for 70,000 MNT (~$19).</p><h3 id="book-signing-and-the-mongolian-startup-scene">Book signing and the Mongolian startup scene</h3><p>The book launch event was at Mongolia&#x2019;s startup hub, called <a href="https://digitalnomad.itpark.mn/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">IT Park</a>, which offers space for startups to operate in. I met a few working in the AI and fintech spaces &#x2013; and even one startup producing comics.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image-3.png" class="kg-image" alt loading="lazy" width="1378" height="1184" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/image-3.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/image-3.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image-3.png 1378w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Book launch event, and meeting startups inside Mongolia&#x2019;s IT Park</span></figcaption></figure><p>I had the impression that the government and private sector are investing heavily in startups, and want to help more companies to become breakout success stories:</p><ul><li><a href="https://digitalnomad.itpark.mn/ds_in_mongolia?ref=blog.pragmaticengineer.com#ds" rel="noopener noreferrer nofollow">IT Park report</a>: the country&#x2019;s tech sector is growing 
~20%, year-on-year. The <em>combined</em> valuation of all startups in Mongolia is at $130M, today.<em> It&#x2019;s worth remembering that location is important for startups: being in hubs like the US, UK, and India confers advantages that can be reflected in valuations.</em></li><li><a href="https://www.jica.go.jp/overseas/mongolia/sjp04ove1698/__icsFiles/afieldfile/2024/08/28/Summary.pdf?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Mongolian Startup Ecosystem Report 2023</a>: the average pre-seed valuation of a startup in Mongolia is $170K, seed valuation at $330K, and Series A valuation at $870K. The numbers reflect market size; for savvy investors, this could also be an opportunity to invest early. I met a Staff Software Engineer at the book signing event who is working in Silicon Valley at Google, and invests in and advises startups in Mongolia.</li><li><a href="https://drive.google.com/file/d/1Ath-eOMd4Kr924cq1AkgLekfeJlXCBfd/view?usp=sharing&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Mongolian startup ecosystem Map</a>: better-known startups in the country.</li></ul><p>Two promising startups from Mongolia: <a href="https://chimege.com/en/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Chimege</a> (an AI+voice startup) and <a href="https://and.global/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">AND Global</a> (fintech). Thanks very much to the <a href="https://nashatech.com/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Nasha Tech team</a> for translating the book &#x2013; keep up the great work!</p><h2 id></h2>]]></content:encoded></item><item><title><![CDATA[The Pulse: Cloudflare takes down half the internet – but shares a great postmortem]]></title><description><![CDATA[A database permissions change ended up knocking Cloudflare’s proxy offline. Pinpointing the root cause was tricky – but Cloudflare shared a detailed postmortem. 
Also: announcing The Pragmatic Summit]]></description><link>https://blog.pragmaticengineer.com/the-pulse-cloudflare-takes-down-half-the-internet/</link><guid isPermaLink="false">691f7b63e9904f00015006db</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 20 Nov 2025 20:36:19 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of five topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com"><em>this week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the below article seven days ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em>subscribe here</em></a><em>.</em></p><p><em>Many subscribers expense this newsletter to their learning and development budget. If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/"><em> an email you could send to your manager</em></a><em>.</em></p><hr><p>Before we start: I&#x2019;m excited to share something new: <strong>The Pragmatic Summit.</strong></p><p>Four years ago, The Pragmatic Engineer started as a small newsletter: me writing about topics relevant for engineers and engineering leaders at Big Tech and startups. 
Fast forward to today, and the newsletter <a href="https://newsletter.pragmaticengineer.com/p/one-million?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">crossed one million readers</a>, and the publication expanded with <a href="https://newsletter.pragmaticengineer.com/podcast?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">a podcast</a> as well.</p><p>One thing that was always missing: meeting in person. Engineers, leaders, founders&#x2014;people who want to meet others in this community, and learn from each other. Until now that is:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/TPS_Social_RegLive_1200x627_110625.png" class="kg-image" alt loading="lazy" width="1200" height="627" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/TPS_Social_RegLive_1200x627_110625.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/TPS_Social_RegLive_1200x627_110625.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/TPS_Social_RegLive_1200x627_110625.png 1200w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The Pragmatic Summit. 
</span><a href="https://www.pragmaticsummit.com/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><span style="white-space: pre-wrap;">See more details and apply to attend</span></a></figcaption></figure><p>In partnership with <a href="http://statsig.com/pragmatic?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Statsig</a>, I&#x2019;m hosting the first-ever <a href="https://www.pragmaticsummit.com/?utm_source=the-pragmatic-engineer&amp;utm_medium=newsletter&amp;utm_campaign=nov-20-paid-edition" rel="noopener noreferrer nofollow"><strong>Pragmatic Summit</strong></a>. Seats are limited, and tickets are priced at $499, covering the venue, meals, and production&#x2014;we&#x2019;re not aiming to make any profit from this event.</p><p><a href="https://www.pragmaticsummit.com/?ref=blog.pragmaticengineer.com">Apply to attend the Summit</a></p><p>I hope to see many of you there!</p><hr><h2 id="cloudflare-takes-down-half-the-internet-%E2%80%93-but-shares-a-great-postmortem">Cloudflare takes down half the internet &#x2013; but shares a great postmortem</h2><p>On Tuesday came another reminder about how much of the internet depends on Cloudflare&#x2019;s content delivery network (CDN), when thousands of sites went fully or partially offline in an outage that lasted 6 hours. Some of the higher-profile victims included:</p><ul><li>ChatGPT and Claude</li><li>Canva, Dropbox, Spotify,</li><li>Uber, Coinbase, Zoom</li><li>X and Reddit</li></ul><p>Separately, you may or may not recall that during a different recent outage caused by AWS, Elon Musk noted on his website, X, that AWS is a hard dependency for Signal, meaning an AWS outage could take down the secure messaging service at any moment. 
In response, a dev pointed out that it is the same for X with Cloudflare &#x2013; and so it proved earlier this week, when X was broken by the Cloudflare outage.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!IN2n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9cfc94-1792-4a5e-8fb6-c1815df54ff0_1072x898.png" class="kg-image" alt loading="lazy" width="1072" height="898"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Predicting the future. Source: Mehul Mohan </em></i><a href="https://x.com/mehulmpt/status/1980382080602370144?s=20&amp;ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">on X</em></i></a></figcaption></figure><p>That AWS outage was in the company&#x2019;s us-east-1 region and <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-aws-takes-down-a-good-part?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">took down a good part of the internet</a> last month. 
AWS released incident details three days later &#x2013; unusually speedy for the e-commerce giant &#x2013; although that postmortem was high-level and we never learned <em>exactly</em> what caused AWS&#x2019;s <a href="https://newsletter.pragmaticengineer.com/i/176934094/how-dynamodb-dns-management-happens?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">DNS Enactor</a> service to slow down, triggering an unexpected race condition that kicked off the outage.</p><h3 id="what-happened-this-time-with-cloudflare">What happened this time with Cloudflare?</h3><p>Within hours of mitigating the outage, Cloudflare&#x2019;s CEO Matthew Prince shared an <a href="https://blog.cloudflare.com/18-november-2025-outage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">unusually detailed report </a>of what exactly went wrong. The root cause was to do with propagating a configuration file to Cloudflare&#x2019;s Bot Management module. The file crashed Bot Management, which took Cloudflare&#x2019;s proxy functionality offline.</p><p>Here&#x2019;s a brief overview of how Cloudflare&#x2019;s proxy layer works at a high level. It&#x2019;s the layer that protects the &#x201C;origin&#x201D; resources of customers &#x2013; minimizing network traffic to them by blocking malicious requests and caching static resources in Cloudflare&#x2019;s CDN:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!esOT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F132ad7a8-2c1d-4be1-8174-295941979ceb_1420x1312.png" class="kg-image" alt loading="lazy" width="1420" height="1312"><figcaption><i><em class="italic" style="white-space: pre-wrap;">How Cloudflare&#x2019;s proxy works. 
More details on </em></i><a href="https://blog.cloudflare.com/20-percent-internet-upgrade/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Cloudflare&#x2019;s engineering blog</em></i></a></figcaption></figure><p>Here&#x2019;s how the incident unfolded:</p><p><strong>A database permissions change in </strong><a href="https://en.wikipedia.org/wiki/ClickHouse?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>ClickHouse</strong></a><strong> kicked things off. </strong>Before the permissions changed, all queries to fetch feature metadata (to be used by the Bot Management module) would have only been run on distributed tables in Clickhouse, in a database called &#x201C;default&#x201D; which contains 60 features.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!NEwO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f62c0a-5772-45a3-9be1-24e7c15c4e7b_1264x264.png" class="kg-image" alt loading="lazy" width="1264" height="264"><figcaption><span style="white-space: pre-wrap;">Before the permissions change: about 60 features were returned, that were fed to the Bot Module</span></figcaption></figure><p>Until now, these queries were running using a shared system account. Cloudflare&#x2019;s engineering team wanted to improve system security and reliability, and move from this shared system account to individual user accounts. 
User accounts already had access to another database called &#x201C;r0&#x201D;, so the team made the database permission change for access to r0 to be <em>implicit</em> instead of explicit.</p><p>As a side effect of this, the same query collecting the features to be passed to Bot Management started to fetch from the r0 database, and return many more features than expected:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!p5bm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e62f91e-7078-4b9d-8e2f-3b3fb357aef5_1220x252.png" class="kg-image" alt loading="lazy" width="1220" height="252"><figcaption><span style="white-space: pre-wrap;">After the permissions change: the query did not change but returned twice as many results</span></figcaption></figure><p><strong>The Bot Management module does not allow loading of more than 200 features. </strong>This limit was well above the production usage of 60, and was put in place for performance reasons: the Bot Management module pre-allocates memory for up to 200 features, and it will not operate with more than this number.</p><p><strong>A </strong><a href="https://en.wikipedia.org/wiki/Kernel_panic?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>system panic</strong></a><strong> hit machines served with the incorrect feature file. 
</strong>Cloudflare was nice enough to share the exact code that caused this panic, which was this unwrap() function:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!qih4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8462b639-2c4c-4c8d-91b2-a468f97d7ee4_1606x666.png" class="kg-image" alt loading="lazy" width="1456" height="604"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Source: </em></i><a href="https://blog.cloudflare.com/18-november-2025-outage/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Cloudflare</em></i></a></figcaption></figure><p>What likely happened:</p><ul><li>The append_with_names() function likely checked for a limit of 200 features</li><li>If it saw more than 200 features, it likely returned an error</li><li>&#x2026; and when writing the code, it was not expected that append_with_names() would return an error&#x2026;</li><li>&#x2026; and so .unwrap() panicked and crashed the system!</li></ul><p><strong>Edge nodes started to crash, one by one, seemingly randomly. </strong>The feature file was being generated every 5 minutes, and gradually rolled out to Edge nodes. So, initially, it was only a few nodes that crashed, and then over time, more became non-responsive. At one point, both good and bad configuration files were being distributed, making failed nodes that received the good configuration file start working &#x2013; for a while!</p><h3 id="why-so-long-to-find-the-root-cause">Why so long to find the root cause?</h3><p>It took Cloudflare engineers unusually long &#x2013; 2.5 hours! &#x2013; to figure all this out, and that an incorrect configuration file propagating to Edge servers was to blame for their proxy going down. 
Turns out, an unrelated failure made the Cloudflare team suspect that they were under a coordinated botnet attack, as when a few of the Edge nodes started to go offline, the company&#x2019;s status page did, too:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!Xa8F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F565ff3fa-112f-4500-940a-4f3f241991fd_1999x478.png" class="kg-image" alt loading="lazy" width="1456" height="348"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Cloudflare&#x2019;s status page went offline when the outage started. Source: </em></i><a href="https://blog.cloudflare.com/18-november-2025-outage/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Cloudflare</em></i></a></figcaption></figure><p>The team tried to gather details about the attack, but there was no attack, meaning they wasted time looking in the wrong place. In reality, the status page going down was a coincidence and unrelated to the outage. But it&#x2019;s easy to see why their first reaction was to figure out if there was a distributed denial of service (DDoS) attack.</p><p>As mentioned, it eventually took 2.5 hours to pinpoint the incorrect configuration files as the source of the outage, and another hour to stop the propagation of new files, and create a new and correct file, which was deployed 3.5 hours after the start of the incident. 
Cleanup took another 2.5 hours, and at 17:06 UTC, the outage was resolved, ~6 hours after it started.</p><p>Cloudflare shared a detailed review of the incident and learnings, which can be <a href="https://blog.cloudflare.com/18-november-2025-outage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">read here.</a></p><h3 id="how-did-the-postmortem-come-so-fast">How did the postmortem come so fast?</h3><p>One thing that keeps being surprising about Cloudflare is how they have a very detailed postmortem up in less than 24 hours after the incident is resolved. Cofounder and CEO Matthew Prince <a href="https://news.ycombinator.com/user?id=eastdakota&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">explained</a> how this was possible:</p><ul><li>Matthew was part of the outage call.</li><li>After the outage was resolved, he wrote a first version of the incident review, at home. Matthew was in Lisbon, in Cloudflare&#x2019;s European HQ, so this was early evening</li><li>The team circulated a Google Doc with this initial writeup, and questions that needed to be reviewed</li><li>In a few hours, all questions were answered</li><li>Matthew: &#x201C;None of us were happy [about the incident] &#x2014; we were embarrassed by what had happened &#x2014; but we declared it [the postmortem] true and accurate.&#x201D;</li><li>Sent the draft over to the SF team, who did one more sweep, then posted it</li></ul><p>Talk about moving with the speed of a startup, despite being a publicly traded company!</p><h3 id="learnings">Learnings</h3><p>There is much to learn from this incident, such as:</p><p><strong>Be explicit about logging errors when you raise them! </strong>Cloudflare could probably have identified the root cause of this error much faster if the line of code that returned an error also logged the error, and if Cloudflare had alerts set up when certain errors spiked on its nodes. 
It could have surely shaved an hour or two off the time it took to mitigate.</p><p>Of course, logging errors before throwing them is extra work, but when done with monitoring or log analysis, it can help find the source of errors much faster.</p><p><strong>Global database changes are always risky. </strong>You never know what part of the system you might hit.<strong> </strong>The incident started with a seemingly innocuous database permissions change that impacted a wide range of queries. Unfortunately, there is no good way to test the impact of such changes (if you know one, please leave a comment below!)</p><p>Cloudflare was making the right kind of change by removing global systems accounts; it&#x2019;s a good direction to go in for security and reliability. It was extremely hard to predict the change would end up taking down a part of their system &#x2013; and the web.</p><p><strong>Two things going wrong at the same time can really throw an engineering team. </strong>If Cloudflare&#x2019;s status page did not go offline, the engineering team would have surely pinpointed the problem much faster than they did. But in the heat of the moment, it&#x2019;s easy to assume that two small outages are connected, until there&#x2019;s evidence that they&#x2019;re not. Cloudflare is a service that&#x2019;s continuously under attack, so the engineering team can&#x2019;t be blamed for assuming it might be more of the same.</p><p><strong>CDNs are the backbone of the internet, and this outage doesn&#x2019;t change that. </strong>The outage hit lots of large businesses, resulting in lost revenue for many. 
But could affected companies have prepared better for Cloudflare going down?</p><p>The problem is that this is hard: using a CDN means taking on a <em>hard</em> dependency in order to reduce traffic on your own servers (the origin servers), while serving internet users faster and more cheaply:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!54wJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dca2f86-18b2-4ba8-8fd2-bc7236b330db_1194x280.png" class="kg-image" alt loading="lazy" width="1194" height="280"><figcaption><span style="white-space: pre-wrap;">A CDN is a common way to reduce traffic to servers and serve webpages and APIs faster to users</span></figcaption></figure><p>When using a CDN, you propagate addresses that point to that CDN server&#x2019;s IP or domain. When the CDN goes down, you could start to redirect traffic to your own origin servers (and deal with the traffic spike), or utilize a backup CDN, if you prepared for this eventuality.</p><figure class="kg-card kg-image-card"><img src="https://substackcdn.com/image/fetch/$s_!fj68!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ef266a-4a28-429b-9d01-52a34e03eae0_1248x774.png" class="kg-image" alt loading="lazy" width="1248" height="774"></figure><p>Both these are expensive to pull off:</p><ul><li>Redirecting to the origin servers likely means needing to suddenly scale up backend infrastructure</li><li>Having a backup CDN means there must be a contract and payment for a CDN partner which will most likely sit idle. 
As and when it is needed, you must switch over and warm up their cache: it&#x2019;s a lot of effort and money to do this!</li></ul><p>A case study in the trickiness of dealing with a CDN going offline is the story of Downdetector, including inside details on why Downdetector went down during Cloudflare&#x2019;s latest outage, and what they learned from it.</p><hr><p><em>This was one out of the five topics covered in this week&#x2019;s The Pulse. </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>The full edition</em></a><em> additionally covers:</em></p><ol><li><strong>Downdetector &amp; the real cost of no upstream dependencies.</strong> During the Cloudflare outage, Downdetector was also unavailable. I got details from the team about why they have a hard dependency on Cloudflare, and why that won&#x2019;t change anytime soon.</li><li><strong>Antigravity: Google&#x2019;s new AI IDE &#x2013; that its devs cannot use. </strong>Google wants to become a serious player in AI coding tools, but Antigravity contains remnants of Windsurf. Interestingly, devs at Google aren&#x2019;t allowed to use Antigravity for work</li><li><strong>Industry pulse.</strong> Gemini 3 launch, Anthropic valued at $350B, Jeff Bezos funds an AI company, and unusually slow headcount growth at startups persists.</li><li><strong>Five AI fakers caught in 1 month by crypto startup. 
</strong>Candidates who fake their backgrounds and change their looks in remote interviews continue to plague companies hiring full-remote &#x2013; especially crypto startups.</li></ol><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>Read the full The Pulse</strong></a></p>]]></content:encoded></item><item><title><![CDATA[Four years on writing a tech book: pitching to a publisher]]></title><description><![CDATA[<p>In 2019, I decided to write a book about software engineering. As an experienced software engineer and manager, I had the topic clear in my head, and assumed the whole project would take between six and 12 months in writing and publishing it.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image.png" class="kg-image" alt loading="lazy" width="1456" height="866" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/image.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/image.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The first proof copy of</span><a href="https://www.engguidebook.com/?ref=blog.pragmaticengineer.com" rel="noreferrer"><span style="white-space: pre-wrap;"> The Software</span></a></figcaption></figure>]]></description><link>https://blog.pragmaticengineer.com/four-years-on-writing-a-tech-book-pitching-to-a-publisher/</link><guid isPermaLink="false">69130a5abb6a4e00013466cc</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Tue, 11 Nov 2025 10:45:43 GMT</pubDate><content:encoded><![CDATA[<p>In 2019, I decided to write a book about software 
engineering. As an experienced software engineer and manager, I had the topic clear in my head, and assumed the whole project would take between six and 12 months in writing and publishing it.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image.png" class="kg-image" alt loading="lazy" width="1456" height="866" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/image.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/image.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The first proof copy of</span><a href="https://www.engguidebook.com/?ref=blog.pragmaticengineer.com" rel="noreferrer"><span style="white-space: pre-wrap;"> The Software Engineer&#x2019;s Guidebook</span></a><span style="white-space: pre-wrap;"> &#x2013; hence the &#x201C;not for resale&#x201D; markup</span></figcaption></figure><p>In the end, this process took several times longer; 4 years, in fact! Happily, it was worth it: readers&#x2019; feedback about <a href="https://www.engguidebook.com/?ref=blog.pragmaticengineer.com"><strong><u>The Software Engineer&#x2019;s Guidebook</u></strong></a> has been overwhelmingly positive, and on launch, the book became a <a href="https://twitter.com/GergelyOrosz/status/1723205530481729838?ref=blog.pragmaticengineer.com"><u>#1 bestseller</u></a> among all titles in two Amazon markets (the Netherlands and Poland), as well as a top 100-selling book in most Amazon markets. 
In 24 months it sold around 40,000 copies, and was translated into <a href="https://learning.oreilly.com/library/view/guidebook-fur-software/9783960092513/?ref=blog.pragmaticengineer.com"><u>German</u></a>,<a href="https://www.hanbit.co.kr/store/books/look.php?p_code=B2570473158&amp;ref=blog.pragmaticengineer.com"> <u>Korean</u></a>, <a href="https://x.com/GergelyOrosz/status/1936044091009036690?ref=blog.pragmaticengineer.com"><u>Mongolian</u></a> and<a href="https://x.com/GergelyOrosz/status/1973632590541365384?ref=blog.pragmaticengineer.com"> <u>Traditional Chinese</u></a> &#x2013; with the Japanese and Simplified Chinese versions releasing later this month.</p><p>A lot of people ask why I chose to self-publish, and it would be nice to say this was always the goal, but it wasn&#x2019;t! Originally, I wanted to work with a top tech publisher, who would get the book to market fast, and give it a higher profile. This didn&#x2019;t happen, but during the process I learned a lot about how publishing works, how to pitch a book, and how to choose which publishing route might be the right one.&#xA0;</p><p>This article shares my learnings from writing and publishing a book which has done pretty well with readers, and it includes my experience of working with an established publishing house:</p><ol><li>Tech book publishing landscape</li><li>Financials of publishing</li><li>Publishing process and the publisher&#x2019;s role</li><li>My book pitch</li><li>Working with a publisher</li><li>Breaking up with a publisher</li></ol><h2 id="1-tech-book-publishing-landscape">1. Tech book publishing landscape</h2><p>Today, there are reputable book publishers whose titles are good and authoritative, and there are other publishers to whom this doesn&#x2019;t apply. Each publisher also has a subject area: some are mainstream and publish titles about every software engineering area from languages to engineering management. 
Meanwhile, others stick to a topic of expertise they focus on.</p><p>Here&#x2019;s my mental model of the book publishing industry in 2025:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/data-src-image-0145e3b2-dbc1-475d-93bb-160c7e3a3fbe.png" class="kg-image" alt loading="lazy" width="1600" height="1356" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/data-src-image-0145e3b2-dbc1-475d-93bb-160c7e3a3fbe.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/data-src-image-0145e3b2-dbc1-475d-93bb-160c7e3a3fbe.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/data-src-image-0145e3b2-dbc1-475d-93bb-160c7e3a3fbe.png 1600w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Biggest players in the tech book publishing industry, a subjective mental model of course!</em></i></figcaption></figure><h4 id="highly-reputable-mainstream-publishers">Highly reputable mainstream publishers</h4><p>In tech book publishing, three publishing houses really stand out, in my opinion, and form a &#x2018;big three&#x2019; among all players in this sector:&#xA0;</p><ul><li><a href="https://oreilly.com/?ref=blog.pragmaticengineer.com"><strong><u>O&#x2019;Reilly</u></strong></a>: if I had to pick a #1 tech book publisher, it would be O&#x2019;Reilly. 
They publish some of the most referenced books &#x2013; like Designing Data Intensive Applications by Martin Kleppmann, <a href="https://newsletter.pragmaticengineer.com/p/dead-code-getting-untangled-and-coupling?ref=blog.pragmaticengineer.com"><u>Tidy First</u></a> by Kent Beck, <a href="https://newsletter.pragmaticengineer.com/p/the-staff-engineers-path?ref=blog.pragmaticengineer.com"><u>The Staff Engineer&#x2019;s Path</u></a> by Tanya Reilly, and more. The book covers are distinctive, using images of animals.</li><li><a href="https://www.manning.com/?ref=blog.pragmaticengineer.com"><strong><u>Manning</u></strong></a>: a broad range of titles on both specific and general tech topics, which feature historical figures on the covers.</li><li><a href="https://pragprog.com/?ref=blog.pragmaticengineer.com"><strong><u>The Pragmatic Bookshelf</u></strong></a>: also referred to as the &#x201C;Prags.&#x201D; Founded by Andy Hunt and Dave Thomas, the authors of what might be the best-selling tech book ever: The Pragmatic Programmer. Since its founding, The Prags has refused digital rights management (DRM) on their ebooks.</li></ul><h4 id="high-reputable-%E2%80%9Cmainstream%E2%80%9D-publishers-that-are-tough-to-pitch-to">Highly reputable &#x201C;mainstream&#x201D; publishers that are tough to pitch to</h4><p>The publishers in this section have strong reputations, like those above. However, they are harder to pitch to, usually because they publish fewer tech books. I couldn&#x2019;t find an author pitch template, or clear pitching instructions, which contributes to a sense of &#x201C;don&#x2019;t find us, we&#x2019;ll find you&#x201D; among the following publishing houses:&#xA0;</p><ul><li><a href="https://en.wikipedia.org/wiki/Addison-Wesley?ref=blog.pragmaticengineer.com"><strong><u>Addison-Wesley:</u></strong></a> one of the best-known brands in tech. 
It has been an imprint (a trade name under which a publisher releases books) of Pearson since 1988, and is the publisher of many &#x201C;classic&#x201D; book titles like Clean Code by Robert C. Martin, The Pragmatic Programmer by Andy Hunt and Dave Thomas, and some recent ones like Modern Software Engineering by Dave Farley. I couldn&#x2019;t find any way to pitch to this publisher, and new books they publish seem to be by established authors.</li><li><a href="https://www.pearson.com/?ref=blog.pragmaticengineer.com"><strong><u>Pearson</u></strong></a>: This business owns the Addison-Wesley imprint. Recently, it started to publish tech books as &#x201C;Pearson&#x201D; instead, as author Martin Fowler <a href="https://twitter.com/martinfowler/status/1766836423808766003?ref=blog.pragmaticengineer.com"><u>shared</u></a>.</li><li><a href="https://www.wiley.com/en-us?ref=blog.pragmaticengineer.com"><strong><u>Wiley</u></strong></a>: formerly a well-known tech book publisher behind the &#x201C;X for Dummies&#x201D; series. It publishes lots of <a href="https://www.wiley.com/en-nl/etextbooks-and-courseware/computer-science-and-technology?ref=blog.pragmaticengineer.com"><u>computer science textbooks</u></a>, but I can&#x2019;t find recently-published, well-known <em>tech books </em>for software engineers.</li><li><a href="https://www.springer.com/gp?ref=blog.pragmaticengineer.com"><strong><u>Springer</u></strong></a>: another massive publisher for whom tech books are a small part of the business. I couldn&#x2019;t find how to pitch tech books to them.</li><li><a href="https://booksite.mkp.com/?ref=blog.pragmaticengineer.com"><strong><u>Morgan Kaufmann</u></strong></a>: a well-known tech book publisher founded in 1984, and acquired in 2001 by Elsevier. As I understand it, these days it prints far fewer technology books, and focuses on academic topics. 
No clear way to pitch to them.</li></ul><h4 id="highly-reputable-%E2%80%9Cniche%E2%80%9D-publishers">Highly reputable &#x201C;niche&#x201D; publishers</h4><p>The following publishers are standout in quality, covering fewer topics than those above.</p><ul><li><a href="https://nostarch.com/?ref=blog.pragmaticengineer.com"><strong><u>No Starch Press</u></strong></a>: &#x201C;The finest in geek entertainment&#x201D; is the tagline, featuring fun visuals, and high-quality content on specific technologies like machine learning, Python, JavaScript, etc.</li><li><a href="https://itrevolution.com/?ref=blog.pragmaticengineer.com"><strong><u>IT Revolution</u></strong></a>: titles for technology leaders: DevOps, technology delivery, workplace culture, and similar. Publisher of The Phoenix Project, Team Topologies, and Accelerate.</li><li><a href="https://www.artima.com/books?ref=blog.pragmaticengineer.com"><strong><u>Artima</u></strong></a>: focuses on Scala.</li><li><a href="https://www.routledge.com/go/crc-press?ref=blog.pragmaticengineer.com"><strong><u>CRC Press</u></strong></a>: publishes on technology, engineering, math, and medicine.</li><li><a href="https://press.stripe.com/?ref=blog.pragmaticengineer.com"><strong><u>Stripe Press</u></strong></a>: &#x201C;works about technological, economic, and scientific advancement.&#x201D;</li><li><a href="https://mitpress.mit.edu/?ref=blog.pragmaticengineer.com"><strong><u>MIT Press</u></strong></a>: &#x201C;a distinctive collection of influential books curated for scholars and libraries worldwide.&#x201D;</li></ul><h4 id="other-mainstream-book-publishers">Other mainstream book publishers</h4><p><a href="https://www.apress.com/?ref=blog.pragmaticengineer.com"><strong><u>Apress</u></strong></a> is a reputable publisher with a lower profile, which publishes on a wide range of topics, from specific technologies and frameworks, to more generic topics on computing. 
Because they publish many books on many topics, they are usually open to pitches.</p><p><a href="https://www.packtpub.com/?ref=blog.pragmaticengineer.com"><strong><u>Packt</u></strong></a>. A tech book publisher with a focus on quantity over quality, it feels to me. There is limited support and feedback for authors, and titles could often use more editing. But also, Packt is likely to say &#x201C;yes&#x201D; to a serious proposal.</p><h2 id="2-financials-of-publishing">2. Financials of publishing</h2><p>Financial matters really come into play when your proposal is accepted by a publisher and you receive a contract offer.</p><p><strong>Advance: $2,000 &#x2013; $5,000. </strong>An advance payment to the writer is a tried and tested way to make them deliver a completed manuscript. It&#x2019;s often paid in chunks: 50% when a milestone is hit, and 50% when a full draft appears.</p><p>The &#x201C;big three&#x201D; publishers typically offer $5,000, usually as a flat, non-negotiable rate; at least, it&#x2019;s what I was offered. Smaller publishers offer closer to $2,000 for more niche books. The advance is non-refundable; even if your book sells zero copies, you keep it. The publisher is making an investment in you, and taking a risk.</p><p><em>As an aside, if you are thinking of writing a book: I offer guest authors in The Pragmatic Engineer Newsletter a payment of $4,000 per article &#x2013; and you can later publish your guest article in a book. Several authors working on their books have written guest articles, such as Lou Franco on </em><a href="https://newsletter.pragmaticengineer.com/p/paying-down-tech-debt?ref=blog.pragmaticengineer.com"><em><u>Paying down tech debt</u></em></a><em> or Apurva Chitnis on </em><a href="https://newsletter.pragmaticengineer.com/p/thriving-as-a-founding-engineer?ref=blog.pragmaticengineer.com"><em><u>Thriving as a founding engineer</u></em></a><em>. 
Writing a guest post can help refine ideas, broaden your reach, and prove helpful when publishing the article.</em></p><h4 id="paperback-royalty-7-15"><strong>Paperback royalty:</strong> 7-15%&#xA0;</h4><p>Royalties are earned on book sales, and taken from the net price of the book. Net price is what a publisher gets after the retailer (e.g. Amazon, or a bookshop) takes their cut. Let&#x2019;s see how it works for a $40 book:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/Screenshot-2025-11-11-at-11.11.21.png" class="kg-image" alt loading="lazy" width="1396" height="1080" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/Screenshot-2025-11-11-at-11.11.21.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/Screenshot-2025-11-11-at-11.11.21.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/Screenshot-2025-11-11-at-11.11.21.png 1396w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">The royalty from a $40 book that has a 10% royalty can be anywhere from $4 to around $1.80, depending on the channel it was sold on. It all depends on how much revenue the publisher received after the sale.</em></i></figcaption></figure><p>It matters financially where your title is purchased; be it an online shop, physical book store, or purchased directly from the publisher. Many tech books are sold on Amazon and online stores. Amazon&#x2019;s 40% cut seems high, but it&#x2019;s actually the lowest among book retailers. Up to 60% is a common cut for a physical bookshop.</p><p>Most publishers offer 10-12.5% royalties, is my understanding, and Packt around 15-20%. 
Keep in mind that brand reputation plays a role; for example, Packt&#x2019;s reputation is less elevated than Manning&#x2019;s, which can make a difference to sales.</p><h4 id="ebook-royalties-10-25">Ebook royalties: 10-25%</h4><p>For ebooks, several publishers pay 25% royalties, but not all. But even with a higher royalty rate, an author might end up making less per sale. For example, on the Kindle platform, the cut for Amazon is high at 65%. Let&#x2019;s look at a $30 ebook with a 20% royalty rate:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/Screenshot-2025-11-11-at-11.12.30.png" class="kg-image" alt loading="lazy" width="1412" height="914" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/Screenshot-2025-11-11-at-11.12.30.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/Screenshot-2025-11-11-at-11.12.30.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/Screenshot-2025-11-11-at-11.12.30.png 1412w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Ebooks are cheaper, but authors can earn more with this royalty structure. Selling the Kindle version is the least profitable because Amazon takes 65% of any sale above $10</em></i></figcaption></figure><p>Ebooks are almost always priced lower than physical books. With a higher royalty rate, they can still earn the author more per copy than the paperback version on some channels, but when sold on Kindle, they generate much less revenue for the author. <em>I was offered 10% royalties on ebook sales, which is at the low end.</em></p><h4 id="%E2%80%9Cearning-out%E2%80%9D">&#x201C;Earning out&#x201D;&#xA0;</h4><p>When an author needs to pay back an advance before being paid anything, this is called &#x201C;earning out&#x201D;. 
If you get a $5,000 advance for a title costing $40 per hard copy and $25 for the ebook version, then, using the 10% paperback and 20% ebook royalty rates from the examples above, earning out takes:</p><ul><li>~2,080 paperback sales on Amazon</li><li>Or ~2,850 Kindle book sales</li><li>Or ~1,250 paperback sales on the publisher website</li></ul><p>In other words, the author needs to sell somewhere between roughly 1,250 and 2,850 copies, depending on the mix of platforms, to &#x201C;earn out.&#x201D; The good news is that a publisher sends quarterly or annual royalty payments if a book keeps generating revenue, which would effectively be passive income.</p><h4 id="the-prags%E2%80%99-unique-approach">The Prags&#x2019; unique approach</h4><p>One publisher that calculates rates differently is The Pragmatic Bookshelf. Instead of offering a low percentage of <em>revenue</em>, they offer a 50% split on <em>profit</em>.</p><p>50% on profit sounds much higher than 10% on revenue, right? However, the devil is in the details, because paying on profit means that the upfront publisher costs &#x2013; editors, cover design, printing, distribution, marketing &#x2013; are all deducted before any profit split.</p><p>Authors who have used this approach tell me the numbers end up pretty similar to the revenue model.</p><h4 id="real-world-case-studies-with-actual-earnings">Real-world case studies with actual earnings</h4><p><a href="https://martin.kleppmann.com/2020/09/29/is-book-writing-worth-it.html?ref=blog.pragmaticengineer.com"><u>Designing Data Intensive Applications</u></a> author, Martin Kleppmann, shared the cumulative royalties he made in 6 years. 
The breakdown is interesting; ebook and Safari Online sales generated more revenue for the writer than the print version.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/data-src-image-d53acdad-d884-432a-b9d0-75cb7ea68141.png" class="kg-image" alt loading="lazy" width="1400" height="800" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/data-src-image-d53acdad-d884-432a-b9d0-75cb7ea68141.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/data-src-image-d53acdad-d884-432a-b9d0-75cb7ea68141.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/data-src-image-d53acdad-d884-432a-b9d0-75cb7ea68141.png 1400w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Cumulative royalties for Designing Data Intensive Applications, published by O&#x2019;Reilly. Image source: </em></i><a href="https://martin.kleppmann.com/2020/09/29/is-book-writing-worth-it.html?ref=blog.pragmaticengineer.com"><u><i><em class="italic underline" style="white-space: pre-wrap;">Martin Kleppmann&#x2019;s site</em></i></u></a></figcaption></figure><p><a href="https://rothgar.medium.com/the-economics-of-writing-a-technical-book-689d0c12fe39?ref=blog.pragmaticengineer.com"><u>Cloud Native Infrastructure earnings</u></a>: author Justin Garrison published with O&#x2019;Reilly, and was offered 10% for print and 25% for ebooks (split in half, as he worked with a coauthor). His book sold 1,337 copies in 4 months, and made about $22,000 for the two authors (and around $11,000 for Justin.) Justin concluded:</p><blockquote>&#x201C;Going into this project I had a rough estimate in my head to make about $2000&#x2013;3000 so this is much better than I expected. 
Set your expectations accordingly.&#x201D;</blockquote><p><strong>Don&#x2019;t forget that publishers are also in this to make a positive return.</strong> This means that it is unlikely for a highly reputable publisher to invest into a book that they do not believe would sell at least a few thousand copies. I don&#x2019;t have the data here: but if I was a publisher, I would reject any book that didn&#x2019;t look like it could hit 1,000 copies sold in the first year of publishing.</p><h2 id="3-the-publishing-process-and-publisher-roles">3. The publishing process, and publisher roles</h2><p>Why does a publisher take so much of the revenue? Part of this is because they do a lot of the work around publishing, and need to hire (and pay!) people for those roles. Here is my understanding of how the publishing process works, based on four months of pitching to publishers; two months of working with one of them; and researching how the rest of the process works:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/data-src-image-c9be774c-598b-4bac-81a7-3562aefcf63a.png" class="kg-image" alt loading="lazy" width="1320" height="1568" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/data-src-image-c9be774c-598b-4bac-81a7-3562aefcf63a.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/data-src-image-c9be774c-598b-4bac-81a7-3562aefcf63a.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/data-src-image-c9be774c-598b-4bac-81a7-3562aefcf63a.png 1320w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">My understanding of the publishing process, when working with a publisher. 
You probably get to work with quite a few specialized folks!</em></i></figcaption></figure><p>Here are the people I worked with, and my experience with them:</p><p><strong>The acquisitions editor. </strong>If you write a technical blog, you might hear from someone called an acquisitions editor, who will ask if you would consider publishing a book. Also, when you submit a pitch to a publisher, chances are that you will first communicate with an acquisitions editor.</p><p>A publisher&#x2019;s goal is to publish books that will be profitable for them. They find authors who could write these books in two ways:</p><ul><li>Inbound pitches coming from authors &#x2013; reviewed by editors or acquisitions editors</li><li>Outbound outreach by acquisitions editors</li></ul><p>These people need to have a good understanding of what kinds of books sell well at the publisher (and why); what their current catalogue is; what the gaps are; and what competitor publishers are commissioning.</p><p>When I pitched my book to 3 respected publishers, in two cases I talked with (and worked with) the acquisitions editor to improve my pitch. The acquisitions editors were my &#x201C;champions&#x201D; at the publisher. Their goal was to get a pitch that the company <em>would</em> say yes to.</p><p><strong>The development editor </strong>works on the <em>structure</em> of the book. They ask the author to come up with a detailed table of contents &#x2013; in my case, they asked me to estimate even the length of the chapters. They also help develop &#x2013; and maintain &#x2013; the narrative of the book.</p><p>Had I not worked with a publisher, I would have had no appreciation of this &#x201C;high-level editing&#x201D; &#x2013; which, it turns out, is key for writing a well-structured tech book!</p><p><strong>The project manager</strong> checks in with timelines, organizes reviews &#x2013; like editorial reviews &#x2013; and helps keep you accountable. 
One of the best things about working with a publisher is that you are on a tight deadline &#x2013; without which it would take you several times longer to publish the book!</p><p><strong>The publisher owns a lot of rights for your book! </strong>One thing that I realized only after signing with a publisher is that while publishers help a lot with writing the book &#x2013; and taking a higher cut is sensible because of this &#x2013; they also hold on to a lot of rights that impact your book! These are all things that you give up, compared to self-publishing:</p><ul><li><strong>Global publishing rights</strong>. Although you are the author of the book &#x2013; and usually hold the copyright to it &#x2013; the publisher owns worldwide publishing rights. This means that they are the only ones who can publish the book, or longer excerpts of it. In practice, this means you need to get permission if you&#x2019;d like to publish some parts of your book on e.g. your blog, or social media. <em>They&#x2019;ll usually grant this as it&#x2019;s good marketing &#x2013; but you still need to ask, even as the author.</em></li><li><strong>Foreign rights. </strong>The publisher owns the publishing rights, and is usually the one who sells foreign rights. In theory, this might sound like you are losing out. In practice, publishers are much better positioned to sell and administer these rights. Most publishers offer a 50% cut on these rights &#x2013; it&#x2019;s what my publisher offered. <em>Also, the majority of tech books are not translated to other languages &#x2013; a book that &#x201C;only&#x201D; sells 2,000 copies in English is unlikely to sell a significant number in a non-English market!</em></li><li><strong>The cover. 
</strong>The publisher decides what cover they will design, though they tend to check with the author for feedback.</li><li><strong>The title.</strong> One of the surprises for me was how the publisher <em>ultimately</em> decides on the title and subtitle.</li></ul><p>In short: this book is owned by the publisher. You are the author, but they are the only ones who can distribute it. In practice, many authors would prefer to have it this way &#x2013; because all the work related to distributing the book is taken on by the publisher. However, it&#x2019;s good to know that you need to give up all the above when working with a publisher.</p><h2 id="4-my-book-pitch">4. My book pitch</h2><p>My secret hope, back in 2019, was to get a contract with one of the &#x201C;Big 3&#x201D; tech book publishers: O&#x2019;Reilly, Manning or The Prags. I pitched my book to all three, and got a &#x201C;no&#x201D; from two, but a &#x201C;yes&#x201D; (and a contract) from one. Here&#x2019;s how I went about my pitch.</p><h4 id="write-a-%E2%80%9Cone-pager%E2%80%9D-about-your-book">Write a &#x201C;one-pager&#x201D; about your book</h4><p>What will this book be about? Who is it for? What will readers take away when reading it? Answer these in a short pitch, <em>before</em> even seeking out publishers. Here&#x2019;s what I put together as my &#x201C;one-pager:&#x201D;</p><h4 id="do-some-market-research">Do some market research</h4><p>What are similar books in the market that would be competing with this book, directly or indirectly? How is this book different from them?</p><p>What is the demographic of people who would be interested in buying this book? Can you estimate how large this crowd is? Realistically, what percentage of this group could be interested in buying the book &#x2013; assuming they know about it? 
<em>Don&#x2019;t forget that publishers will invest in books that can generate decent sales: it&#x2019;s good to do a little research to help confirm your title could be one of these!</em></p><h4 id="shortlist-publishers-you-would-be-interested-working-with">Shortlist publishers you would be interested in working with</h4><p>There are quite a few publishers out there: what are your top preferences? And what are ones you&#x2019;re willing to consider, even if your &#x201C;top&#x201D; choices turn you away?</p><p>Self-publishing is always an option (I&#x2019;ll cover more on how I went about this in later parts). However, going with a good publisher can significantly speed up your book production, while also improving the quality.</p><h4 id="write-a-draft-table-of-contents-and-a-draft-chapter">Write a draft table of contents and a draft chapter</h4><p>Some publishers will want to look at what a draft chapter will look like &#x2013; but not all of them. Still, I found it helpful to do some writing before submitting to a publisher. If for no other reason, this was to confirm that I&#x2019;d enjoy longform writing!</p><p>I spent about a week putting together a table of contents, and around four months writing drafts of chapters. These chapters turned out to be helpful later on.</p><h4 id="submit-a-tailored-pitch-your-the-publishers">Submit a tailored pitch to the publisher(s)</h4><p>Once you have identified your top publisher choices, submit a pitch. Most book publishers have a pitch document they want you to follow. 
Here are common ones:</p><p><a href="https://www.oreilly.com/work-with-us.html?ref=blog.pragmaticengineer.com"><u>O&#x2019;Reilly&#x2019;s pitch template:</u></a></p><ul><li>Description</li><li>About the topic</li><li>Audience</li><li>Keywords</li><li>Competing titles</li><li>Related O&#x2019;Reilly titles</li><li>Book outline</li><li>Writing schedule</li></ul><p><a href="https://www.manning.com/write-for-us?ref=blog.pragmaticengineer.com"><u>Manning&#x2019;s pitch template:</u></a></p><ul><li>About the author</li><li>About the book topic</li><li>The book plan</li><li>Q&amp;A</li><li>Reader overview</li><li>Book competition</li><li>Book length and illustrations</li><li>Writing schedule</li><li>Table of contents</li></ul><p><a href="https://pragprog.com/publish-with-us/resources/PragProg_Proposal_Template.txt?ref=blog.pragmaticengineer.com"><u>The Pragmatic Bookshelf template:</u></a></p><ul><li>Overview</li><li>Outline</li><li>Bio</li><li>Competing books</li><li>PragProg books</li><li>Market size</li><li>Promotional ideas</li><li>Writing samples</li></ul><p>Most of these templates ask for similar content, so if you completed one pitch: the others are much easier. Here are some tips I&#x2019;d have for building a pitch.</p><p><strong>Put yourself in the shoes of the publisher. </strong>This book is a <em>huge</em> deal to you: but it&#x2019;s just one of the dozens that the publisher will publish <em>just</em> this year. You want to write an <em>amazing</em> book: but the publisher wants to publish one that <em>will sell</em>.</p><p>And these are major differences! The publisher will care very much about competition for the book, and how their existing titles relate to them. 
Like a VC firm, a publisher will not want to fund two investments competing in the exact same market: so if the publisher recently published a book that is a deep dive on Go, they will almost certainly pass on the next one, no matter how good your pitch is.</p><p><strong>Pitching to several publishers in parallel is totally fine and you should do it! </strong>This is one thing I wish I&#x2019;d done differently.<strong> </strong>In my mind, I was 100% certain that my first publisher-of-choice would jump on the opportunity to publish this book. I thus felt that it would be &#x201C;unfair&#x201D; if I pitched to other publishers, without hearing back.</p><p>In hindsight, as a first-time author, this strategy was a waste of time on my end. Most publishers are unlikely to take a risk on a first-time author with no books published in the past &#x2013; like I was in 2019. And so the likely outcome is rejection in most cases.</p><p>In my case, I spent about two and a half months waiting on the response from this first publisher. My acquisitions editor was championing the book &#x2013; making the case for the publisher to offer a contract &#x2013; but in the end, the publisher chose another book with a similar topic that was in their pipeline. This made perfect business sense for them &#x2013; but for me, it meant I spent months waiting, instead of pitching the book to other publishers!</p><p><strong>My book pitch ended up being a helpful resource on my self-publishing journey. </strong>Even though I did not release with a publisher: pitching to publishers helped the book become an eventual success. It was for these reasons:</p><ul><li><strong>Defining the structure.</strong> I had my table of contents well thought-out by the time I submitted the pitch. 
This structure changed later, but it was a solid start.</li><li><strong>Positioning the book.</strong> I had a good idea of the &#x201C;competitive&#x201D; landscape, and what books my title would &#x201C;go up against.&#x201D; It also helped me focus on how my book is different to what is already out there.</li><li><strong>Forcing me to think about marketing. </strong>The Pragmatic Bookshelf asked for a section on promotional ideas. This forced me to think about where (and how) I would promote the book &#x2013; even before getting into the thick of writing. When going with a publisher, it&#x2019;s safe to assume that the publisher&#x2019;s brand will do some marketing. However, authors will still do the lion&#x2019;s share of marketing &#x2013; and it&#x2019;s good to think about this ahead of time.</li></ul><h2 id="5-working-with-a-publisher">5. Working with a publisher</h2><p>I got lucky with one of the three publishers, in the end. This publisher was looking for a book just like mine, right at that time! What happened was one of their best sellers had to be pulled from publication, for reasons outside the control of the publisher. Apparently, when my pitch arrived, they had just started a search for a book that could plug the hole &#x2013; and they saw my book being a perfect fit for a &#x201C;software career advice&#x201D; book.</p><p>At the time, this felt like great luck. In hindsight, my relationship with the publisher might have soured exactly because they were looking for me to write <em>a specific kind of book</em> that would be similar enough to this old book &#x2013; but I had no intention of doing so. <em>More on how things went sour in the section after this one.</em></p><p>From signing the contract, I worked with a publisher for about a month &#x2013; so I&#x2019;m not exactly the most experienced in this front. 
However, a couple of things stood out as strong positives &#x2013; and things that I &#x201C;lost&#x201D; when deciding to self-publish, in the end.</p><p><strong>Strong pressure to write &#x2013; thanks to the contract. </strong>My contract had pretty strict deadlines included. We signed it on 11 January 2020, and these deadlines were part of the contract:</p><p>&#x201C;The Author shall prepare and deliver to the Publisher a machine-readable electronic copy of the manuscript for the Work, including all its illustrations, code listings, and exercises, as mutually agreed upon by the Publisher and the Author as follows:</p><p>- Not later than March 15, 2020, a partial manuscript for the Work totaling not less than one third of the planned finished Work.</p><p>- Not later than June 1, 2020, a partial manuscript for the Work totaling not less than two thirds of the planned finished Work.</p><p>- Not later than August 15, 2020, a draft of the complete manuscript for the Work suitable for review.</p><p>- Not later than September 1, 2020, the final, revised and complete manuscript for the Work acceptable to the Publisher for publication.&#x201D;</p><p>Talk about pressure! Also, my first payout was tied to reaching the first milestone &#x2013; which was delivering at least a third of the finished work. My publisher also set up regular check-ins to help me stay accountable. And this kind of pressure was good &#x2013; because without it, I would have pushed back writing, or got stuck on relatively trivial parts!&#xA0;</p><h2 id="6-breaking-up-with-the-publisher">6. Breaking up with the publisher</h2><p>While I greatly appreciated that a publisher took a chance on me, lots of things felt wrong from the start. 
A month into working together, I felt that things were getting worse, and not better.</p><p>The small things that I dismissed, in the beginning:</p><ul><li><strong>A (very) opinionated structure.</strong> This publisher had strongly opinionated templates I was told to use for all chapters. They required each chapter to start by stating what the reader will learn, and then to summarize this at the end of the chapter. It wasn&#x2019;t how I imagined my book to be &#x2013; but it didn&#x2019;t seem I had a choice. I figured, I&#x2019;ll give it a go. The publisher knows better after all, as they&#x2019;ve done this hundreds of times. <em>Right</em>?</li><li><strong>Needing to ask for permission to share drafts on social media.</strong> I originally planned to share screenshots of some of the parts I was writing, to get feedback as I went &#x2013; and to also increase visibility of the book. I thought that this was a no-brainer. Not only does this kind of &#x201C;early sharing&#x201D; make the book better: it also makes more people excited about the book, leading to more eventual customers. To my surprise, my contact at the publisher said I would need to ask for permission to do this. Permission? For something that will market the book? Yes: because the publisher owns all publishing rights, including for the draft!</li><li><strong>I won&#x2019;t decide on what the title will be.</strong> I had strong opinions about what I&#x2019;d like the book&#x2019;s title to be. My publishing contact also had ideas on what they thought would be good to add to it &#x2013; like introducing the &#x201C;mentoring&#x201D; term either to the title or the subtitle: which was an idea I disliked. As I talked with them, it became clear that the publisher will set the final title: not me. Hmm &#x2013; odd, no? 
It&#x2019;s another reminder that, although it&#x2019;s my book: it&#x2019;s <em>really</em> the publisher&#x2019;s book, and they have the final say on all important decisions.</li><li><strong>Nudges to &#x201C;dumb down&#x201D; the book. </strong>My editor was giving more suggestions on how to edit the content to make it more &#x201C;beginner-friendly&#x201D; and suggested I introduce e.g. &#x201C;Alice and Bob&#x201D; examples to make it easier to digest the contents. <em>One of the recently best-selling books of the publisher heavily used Alice and Bob, and it seems the publisher thought it helped their sales.</em></li></ul><p><strong>The first major editorial review was where I decided we should part ways with the publisher. </strong>About a month-and-a-half in, the publisher pulled together several experienced editors, and offered suggestions on how I could improve the book. The suggestions were these:</p><ul><li><strong>Focus on reader engagement. </strong>Tell stories and develop them with emotion, mystery, aha moments, and unexpected conclusions. Tell the stories from the &quot;we&quot; or &quot;they&quot; perspective -- make stories team-oriented.</li><li><strong>Exercises.</strong> Develop exercises for use within the chapters (not just end) or a story about what happened when one person did the exercise.</li><li><strong>Mini-projects.</strong> Guide readers to discover and come to conclusions on their own (see Donald Saari story in What the Best College Teachers Do). Mini project topics: testing, architectures.</li><li><strong>Word of the day feature.</strong> Example: Dependency injection (what is it)? Scatter these across the book.</li><li><strong>Quotes.</strong> Include quotes from luminaries such as [Well-known-person 1] and [Well-known-person 2] that relate to advice given. Ask other [Publisher] authors to relate experience about how they followed similar advice and were successful.</li><li><strong>Tech map. 
</strong>Create a diagram of the current technology landscape. Example big-picture topics: architecture demystified, distributed systems demystified.</li></ul><p>While I appreciated the suggestions: I <em>hated</em> all of them. I saw what implementing them would do: they would turn this book &#x2013; whose &#x201C;forced&#x201D; style I already had reservations about &#x2013; into something I would <em>not</em> want to read. Much less write!</p><p>I envisioned writing a more matter-of-fact book that doesn&#x2019;t have exercises, &#x201C;mini projects&#x201D; or &#x201C;word of the day&#x201D; gimmicks.</p><p><strong>I sat down to reflect on why I chose to work with a publisher, to start with.</strong> As an author, I&#x2019;m giving up a lot of things: editorial control, the bulk of revenue, all publishing rights&#x2026; and for what? For the publisher to make the process easier, and for the end result to be a better book than if I was working alone.</p><p>But I felt that this book would be far <em>worse</em> if I continued with my publisher: and the only way to get it back to what I envisioned was if I spent a lot of time and energy pushing back on them.</p><p>It would cost me less energy to self-publish. So I decided to terminate my agreement because it didn&#x2019;t feel like my publisher was helping me write the book that I wanted to write.</p><p><strong>My publisher was understanding and professional in terminating the contract.</strong> I explained to them that all the feedback suggested they wanted to see a very different book to what I wanted to write. And that, frankly, I am not the author to write <em>that</em> kind of book.</p><p>Truth be told, I was embarrassed that I had wasted their resources &#x2013; working with their development editor and the editing team &#x2013; for these two months. At the same time, I was vocal in voicing to my editor that I was hesitant about this mandated style. 
I also made the decision that there is no point in continuing at the first <em>formal</em> feedback session. I&#x2019;m not sure I could have come to this conclusion any earlier, as I was still learning how this book publisher worked, up until that point.</p><p>To show how professional this team was, this is the termination letter they sent as a signed PDF:</p><p>&#x201C;This letter is in reference to our Publishing Agreement with you for [what would become The Software Engineer&#x2019;s Guidebook] dated January 11, 2020. By mutual agreement, we are terminating the publishing contract.</p><p>Since no advance was paid to you under the terms of this contract, all rights in the content you originally submitted will hereby return to you and we will consider this matter concluded.</p><p>The decision to cancel a project is never an easy one to make. We thank you for all the efforts on this project that you made and wish you the best in your future endeavors.&#x201D;</p><p><strong>At this point, I had learned enough about publishers and myself to decide: I&#x2019;m doing it by myself. </strong>Having my book accepted by a major publisher gave external validation that there&#x2019;s a strong business case for The Software Engineer&#x2019;s Guidebook. And working with an opinionated publisher &#x2013; and continuously pushing back on styling suggestions &#x2013; made me realize that I already have my own opinionated style that I <em>like</em> using.</p><p>I did lose a very important thing by deciding to self-publish: the accountability of meeting a publishing deadline. Working with the publisher, this book would have been out in fall 2020 or spring 2021. Self-publishing, I launched it in November 2023.</p><p>One of the reasons for publishing my book two years later than it would have taken with a publisher was because I now <em>knew</em> I could no longer rely on a well-known publisher to lend my book their brand. 
For my book to have even a slim chance of being successful: I would have to compensate for the lack of being associated with a publisher, and fill the gap in marketing and awareness, leading up to the book launch.</p><p>Not having a publisher was a reason I started writing <a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><u>The Pragmatic Engineer Newsletter</u></a> in August 2021 (a year-and-a-half after breaking up with this publisher) &#x2013; and the sudden success of this newsletter gave me less time to wrap up the book. At the same time, by the time the book was ready, there were plenty of people who looked forward to reading it: and many of them were already readers of the newsletter!</p><p>I&#x2019;ll cover more about how I went about the actual self-publishing process in a follow-up article, how the book ended up selling, and other learnings. Subscribe <a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><u>to The Pragmatic Engineer</u></a> to get notified when it is out.</p><p></p>]]></content:encoded></item><item><title><![CDATA[The Pulse: Amazon layoffs – AI or economy to blame?]]></title><description><![CDATA[Amazon is doing more mass layoffs, claiming it wants to be more nimble. But are job losses really about US economic fears, and how Amazon’s retail business will be affected?]]></description><link>https://blog.pragmaticengineer.com/the-pulse-amazon-layoffs/</link><guid isPermaLink="false">690ccf1cece43400015a8f22</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 06 Nov 2025 16:40:34 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. 
Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-151?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the below article seven days ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><p><em>Many subscribers expense this newsletter to their learning and development budget. If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/" rel="noopener noreferrer nofollow"><em> an email you could send to your manager</em></a><em>.</em></p><hr><p>Online retail giant Amazon unexpectedly announced 14,000 job cuts earlier last week. The massive round of layoffs at the company follows other mass redundancies in recent years:</p><ul><li><strong>January 2023</strong>: 18,000 people <a href="https://newsletter.pragmaticengineer.com/i/70995338/amazon-to-lay-off-more-people-and-rescind-more-offers?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">laid off</a>.</li><li><strong>March 2023</strong>: another 9,000 people <a href="https://www.businessinsider.com/amazon-layoffs-second-round-9000-job-jobs-2023-3?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">let go</a></li><li><strong>November 2023</strong>: hundreds of people <a href="https://www.businessinsider.com/amazon-layoffs-memo-hundreds-job-cuts-alexa-agi-team-2023-11?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">let go</a> inside the Alexa team, as Amazon was looking to shift Alexa more toward GenAI</li><li><strong>April 2024</strong>: hundreds of people <a 
href="https://www.businessinsider.com/amazon-job-cuts-aws-roles-cloud-computing-division-aws-2024-4?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">let go</a> from AWS</li></ul><p>Software engineers, unfortunately, seem hit hard by the latest layoffs: of 2,300 employees laid off in Washington State, 25% are software engineers, GeekWire <a href="https://www.geekwire.com/2025/amazon-layoffs-hit-software-engineers-hardest-in-washington/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">reports</a>. <em>We can only speculate about the ratio across the rest of the company, but if cuts at HQ are heavy on engineering, then things don&#x2019;t look promising for other locations, sadly.</em></p><p><a href="https://www.aboutamazon.com/news/company-news/amazon-workforce-reduction?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">The memo</a> from Beth Galetti, Senior Vice President of People Experience and Technology, to workers didn&#x2019;t explain much:</p><blockquote>&#x201C;Some may ask why we&#x2019;re reducing roles when the company is performing well. Across our businesses, we&#x2019;re delivering great customer experiences every day, innovating at a rapid rate, and producing strong business results. What we need to remember is that the world is changing quickly. This generation of AI is the most transformative technology we&#x2019;ve seen since the Internet, and it&#x2019;s enabling companies to innovate much faster than ever before (in existing market segments and altogether new ones). We&#x2019;re convinced that we need to be organized more leanly, with fewer layers and more ownership, to move as quickly as possible for our customers and business.&#x201D;</blockquote><p>The statement is utterly confusing, as encapsulated by its message that &#x201C;business is great, but we need to do layoffs&#x201D;. Job cuts usually mean a business is in trouble, which obviously isn&#x2019;t the case for Amazon. 
So, why are these layoffs <em>really</em> happening?</p><h3 id="layoffs-to-boost-efficiency">Layoffs to boost efficiency?</h3><p>The company&#x2019;s memo states:</p><blockquote>&#x201C;We&#x2019;re convinced that we need to be organized more leanly, with fewer layers and more ownership, to move as quickly as possible for our customers and business.&#x201D;</blockquote><p>If this line of reasoning sounds familiar, it&#x2019;s because most of the layoffs in 2023 were justified the same way. The tech industry overhired during the pandemic in 2020-2021, making orgs more bloated and decision-making slower. In February 2023, I reported on <a href="https://newsletter.pragmaticengineer.com/p/the-scoop-38?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">the trend of fewer middle managers</a>, with Meta the first Big Tech giant to reduce its management layers. In 2023, most of Big Tech followed this approach with layoffs or reorgs. Managers acquired more reports, and tech companies cut down the number of layers between the CEO and individual contributors.</p><p>Given Amazon did other massive layoffs in 2023, it&#x2019;s unlikely they missed the industrywide trend for fewer managers. While the current layoffs seem to be targeting managers quite a bit &#x2013; in the Washington State layoffs, 20% of those let go <a href="https://www.geekwire.com/2025/amazon-layoffs-hit-software-engineers-hardest-in-washington/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">are managers</a> &#x2013; there are still more ICs laid off than managers, overall. So, this official explanation doesn&#x2019;t pass my personal &#x201C;smell test&#x201D;.</p><h3 id="layoffs-to-buy-more-gpus">Layoffs to buy more GPUs?</h3><p>The day after its jobs announcement, Amazon had more big news, this time about AI: it unveiled Project Rainier, the largest AI computing platform AWS has ever built. 
It already has 500,000 Trainium2 chips (built by Amazon). This capacity is already 70% larger than any AI computing platform in AWS&#x2019;s history, and Anthropic is using all of it (!!) to train its next models. Below is an image of one of the several Project Rainier data centers packed with Amazon GPUs:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/8e4b376d-4eb8-4c52-bcb4-aa935456e229_1082x1072.webp" class="kg-image" alt loading="lazy" width="1082" height="1072" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/8e4b376d-4eb8-4c52-bcb4-aa935456e229_1082x1072.webp 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/8e4b376d-4eb8-4c52-bcb4-aa935456e229_1082x1072.webp 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/8e4b376d-4eb8-4c52-bcb4-aa935456e229_1082x1072.webp 1082w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">The next generation of Claude models is trained in these data centers. Source: </em></i><a href="https://x.com/ajassy/status/1983616724642730217?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Amazon</em></i></a></figcaption></figure><p>Building data centers is incredibly capital-intensive: Amazon has <a href="https://www.cnbc.com/2025/10/29/amazon-opens-11-billion-ai-data-center-project-rainier-in-indiana.html?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">spent</a> $11B on Project Rainier alone. Even though it is very profitable, Amazon might want to invest <em>more</em> cash than it currently has into building data centers. 
So, one reason for job cuts could be to reallocate financial resources from paying salaries and compensation towards building more data centers.</p><p>Before doing the math, a couple of concepts are important to understand:</p><ul><li><strong>Free cash flow (FCF)</strong>: profit after payment for things like capital expenditure (CapEx), such as financing data centers. If a company wants to operate with as little debt as possible, FCF is usually very important. If Amazon wants to avoid loans and not touch its reserves, then data center investment would come from FCF, reducing FCF further.</li><li><strong>Cash reserves: </strong>a company&#x2019;s <em>liquid</em> reserve investments, usually an accumulation of investments in financial instruments like bonds and securities, or cash deposits.</li></ul><p>Let&#x2019;s run Amazon&#x2019;s numbers:</p><ul><li><strong>Cash reserves: $93B. </strong>This is how much Amazon has in reserve.</li><li><strong>FCF: $32B. </strong>This is the rough free cash flow Amazon has currently, as per its latest quarterly report. This is after deducting <em>current</em> data center investments.</li><li><strong>Savings from layoffs: $2-4B. </strong>This is my estimate of the rough total compensation of 14,000 employees.</li></ul><p>So, the savings from these layoffs wouldn&#x2019;t even pay for half of Project Rainier ($11B in total), and Amazon could easily build 3x Project Rainiers in the next year, without needing to dip into its savings! 
Of course, Amazon has its famous frugality principle, but this massive layoff of 14,000 people won&#x2019;t make a big difference to how much it can invest in data centers; it can already spend much more, if it wants!</p><h3 id="leanness-and-ai-fail-job-cuts-%E2%80%9Csmell-test%E2%80%9D">Leanness and AI fail job cuts &#x201C;smell test&#x201D;</h3><p>It&#x2019;s not only me who doesn&#x2019;t buy the explanation that these layoffs are to streamline the company, or to redirect resources to AI. 
<a href="https://www.linkedin.com/in/arneknudson/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Arne Knudson</a> worked at Amazon for nearly two decades, most recently as a software development manager (SDM), before leaving the company earlier this year. He <a href="https://www.linkedin.com/posts/arneknudson_in-my-18-years-at-amazon-i-went-through-activity-7388737736590909440-wJ1M?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAAAIk0KwBsmE3oBadWSg2ettxmEyKbqZKG34" rel="noopener noreferrer nofollow">shared</a> his analysis, with some insider detail:</p><blockquote>&#x201C;In my 18 years at Amazon, I went through a few layoffs and hiring freezes.<br><br>This is the first time I&#x2019;ve seen multiple years of significant layoffs essentially back-to-back. Even in the depths after the .com bubble, it wasn&#x2019;t this bad. They&#x2019;ve been laying people off now for almost 3 straight years.<br><br><strong>The explanation that this is downsizing after hiring too many at the height of the pandemic doesn&#x2019;t pass the smell test, at least to me. </strong>That was 3 years ago; they&#x2019;re not that dumb to keep those people around for 3 extra years. Those folks were laid off back in &#x2018;22.<br><br><strong>I&#x2019;m also not convinced that this is optimization due to AI.</strong> My degree&#x2019;s in AI, and I worked on AI stuff at Amazon; I don&#x2019;t think there&#x2019;s enough automation yet, and it&#x2019;s not accurate enough yet, to replace 30,000 people. The cost of inaccuracies seems too high. But I could be wrong; maybe they&#x2019;ve gotten their false negative &amp; false positive rates low enough to avoid too many region-wide AWS outages. (Or not.)<br><br>One of the articles I read said this was going to be in HR, and I can tell you as a former manager, my experience working with HR had been steadily worsening over the past 5-7 years. 
They outsourced so much of the work, overworked the people they had, and had such high turnover that I never knew who I was supposed to work with. When I needed to put someone on a performance plan or help a new hire receive some kind of accommodation, it seemed like it was a different person each time. If they really are laying off tens of thousands more HR folks, this is only going to get worse.<br><br><strong>And, I suspect, it means they don&#x2019;t plan on hiring MORE people in any of the business units for a year or more. </strong>So, by the smell-o-meter, this seems more significant than streamlining the workforce, improved AI, and &#x201C;nah, we don&#x2019;t need as many HR folks.&#x201D;</blockquote><h3 id="us-economy-to-blame-for-amazon-layoffs">US economy to blame for Amazon layoffs?</h3><p>It&#x2019;s safe to assume AWS as a business unit is doing just fine, as suggested by Project Rainier&#x2019;s existence and the agenda for building data centers. But how is the e-commerce side of the business performing, and what&#x2019;s its outlook?</p><p>If one business should have its finger on the pulse of the US economy, it&#x2019;s Amazon with its size and self-professed, relentless customer focus, providing a window into people&#x2019;s spending habits across the country. Flashing lights on the dashboard of the national economy may signal tough times ahead in e-commerce, which could be a reason to start cutting costs early.</p><p><strong>There are concerning signs from other sectors about the US economy. </strong>Below is a statement from the CEO of the restaurant chain Chipotle.</p><blockquote>&#x201C;Earlier this year, as consumer sentiment declined sharply, we saw a broad-based pullback in frequency across all income cohorts.<br><br>Since then, the gap has widened, with low to middle-income guests further reducing frequency. We believe that this guest with household income below $100,000, represents about 40% of our total sales. 
And, based on our data, they are <strong>dining out less often due to concerns about the economy, and inflation.</strong><br><br>A particularly challenged cohort is the 25- to 35-year-old age group. We believe that this trend is not unique to Chipotle and is occurring across all restaurants as well as many discretionary categories. This group is facing several headwinds, including unemployment, increased student loan repayment and slower real wage growth. We tend to skew younger and slightly over-indexed to this group relative to the broader restaurant industry&#x201D;.</blockquote><p>Chipotle is saying that everyone is eating out less, particularly 25-35 year olds, because of inflation. If people spend less on Chipotle because of rising prices, then they may also spend less in other areas of their lives for the same reason, including on Amazon.</p><p>In the e-commerce supply chain, there&#x2019;s evidence of this trend, which would mean delivery services like UPS have fewer parcels to deliver. 
Speaking of UPS, two days ago, it <a href="https://nypost.com/2025/10/28/business/ups-axes-48000-workers-in-sweeping-cost-cut-push-sparking-stock-surge/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">announced</a> massive layoffs:</p><blockquote>&#x201C;United Parcel Service (UPS) has slashed 48,000 jobs this year &#x2014; one of the largest single-year reductions by a US company since the pandemic &#x2014; as the package giant scrambled to contain costs and revive its lagging stock price.<br><br>The Atlanta-based delivery behemoth disclosed the reductions Tuesday while reporting third-quarter earnings that beat Wall Street expectations.<br><br>UPS said 34,000 of the cuts hit drivers and warehouse operations, while 14,000 targeted management (...)<br><br>UPS shares jumped nearly 9% in Tuesday afternoon trading, even as the company reported weaker revenue and profits&#x201D;.</blockquote><p>UPS&#x2019;s revenue is down on last year, which suggests that there are, indeed, fewer deliveries (or lower value ones). As with the latest job cuts at Amazon, these drastic layoffs could be explained by a lot of things, most easily by UPS expecting reduced trade in the future.</p><p><strong>If US consumer spending is trending down, then the e-commerce sector will be among the first to feel this. </strong>It could explain why Amazon is making these layoffs now. It can also explain why Google, Meta, and Microsoft might not be seeing their businesses impacted: they&#x2019;re not involved in retail like Amazon is, and the AI sector <em>is</em> very much booming.</p><p>Among all of Big Tech, Amazon is best positioned to detect changes in US consumer spending. Google&#x2019;s and Meta&#x2019;s revenue is more dependent on advertising, and Microsoft&#x2019;s more on enterprise spend. 
Like Amazon, Apple is well placed to feel market changes with its range of smartphones and watches, and other consumer tech.</p><p>I believe Amazon is highly commercially rational, so it&#x2019;s worth understanding the <em>actual</em> reason for its second major mass layoffs in just two years, following deep cuts in 2023. I&#x2019;d put my money on this reason being the economy, and how Amazon probably expects customers to cut back their spending everywhere, including on Amazon.</p><hr><p>This was one out of the four topics covered in last week&#x2019;s The Pulse. <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-151?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Read the full article.</a></p><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-150-cursor-and-github-double?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>This week&#x2019;s The Pulse</strong></a> additionally covers:</p><ol><li><strong>Cursor and GitHub double down on agents. </strong>Each company is focusing on agents: Cursor with its multi-agent mode, and GitHub with its &#x201C;Agent HQ.&#x201D; Cursor is increasingly a direct rival to GitHub.</li><li><strong>Industry pulse. </strong>Meta rolls out AI-assisted interviews, Cursor and Windsurf believed to be using Chinese open source AI models, South Korean government pays price of ignoring backup &#x201C;101,&#x201D; startups growing much faster in the US than in Europe, companies using AI tools buy more JIRA seats, a neat uptime service called Updog, and more.</li><li><strong>OpenAI inflating the bubble? </strong>OpenAI signs another massive deal with AWS based on predicted growth, and seeks taxpayer protection to borrow more.</li><li><strong>Large tech companies struggle to build their AI integrations</strong>. Apple admits failure to modernize Siri by paying Google $1B per year for its LLM. 
Perplexity to pay Snap $400M to be its AI search interface.</li><li><strong>How much do Directors of Engineering earn at startups?</strong> Data from Carta says it&#x2019;s more than any other Director: $215-230K at companies valued at $25-250M.</li></ol><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-150-cursor-and-github-double?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Read the full issue here</a></p>]]></content:encoded></item><item><title><![CDATA[Comparing interviews at 8 large tech companies]]></title><description><![CDATA[Puneet Patwari applied to 8 major tech companies, and received 6 offers. He compares his interview experiences at Meta, Amazon, Uber, and 5 other workplaces]]></description><link>https://blog.pragmaticengineer.com/comparing-interviews-at-8-large-tech-companies/</link><guid isPermaLink="false">6903a79017b0a200016fa3a2</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 30 Oct 2025 18:00:42 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one topic from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>The Pulse #149</em></a><em>. Full subscribers received the below article two weeks ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><hr><p><a href="https://www.linkedin.com/in/puneet-patwari/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Puneet Patwari</a> recently accepted an offer to join Atlassian as a Principal Software Engineer. 
In three months, he did more than 60 interviews at 11 companies, he told me &#x2013; while dropping out of 3 more interview processes after accepting the Atlassian offer, including that of Meta. Following that endeavour, he has <a href="https://www.linkedin.com/posts/puneet-patwari_i-interviewed-at-google-uber-walmart-amazon-activity-7379375609354883072-Rkq1/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">compared</a> the interview processes of the largest companies:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!fd6d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F411215af-a63b-411f-9192-d6a7ef71481e_1390x1236.png" class="kg-image" alt loading="lazy" width="1390" height="1236"><figcaption><i><em class="italic" style="white-space: pre-wrap;">What each interview process was like. Source: </em></i><a href="https://www.linkedin.com/posts/puneet-patwari_i-interviewed-at-google-uber-walmart-amazon-activity-7379375609354883072-Rkq1/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Puneet Patwari</em></i></a></figcaption></figure><p>A few more observations that Puneet shared with me:</p><blockquote><strong>Amazon</strong>: the Amazon Hiring Manager round was one of the most unique I ever experienced. We got so engrossed in the discussion that it took 160 minutes instead of the scheduled 60 minutes! We had to take a break in between the interview process.<br><br><strong>Atlassian</strong>: The leadership craft (LC) &amp; values were two interview rounds which were very crucial in determining that I&#x2019;ll be levelled at the Principal level. Of course, the Systems Design interview was also key here. 
Atlassian puts a lot of emphasis on LC for Principal engineers.<br><br><strong>Salesforce</strong>: the system design round was based on the <em>actual</em> job requirement. It was a migration problem where the interviewer wanted to check if I can own a project end-to-end with customers at the centre of it.<br><br><strong>Confluent</strong>: when I say it was the most mentally demanding interview, what I mean is how every skill was tested with two interviews! So 2x data structures and algorithms (DSA), 2x System Design 2x behavioural interview rounds.<br><br><strong>I cannot stress enough how important behavioural interviews are at the Staff+ levels. </strong>Doing well on these interviews were decisive in getting Staff and Principal-level offers. Of course, you needed to do well on coding and systems design: but my sense was that the behavioural parts were make or break for levelling and getting an offer.</blockquote><p>A few things stand out to me from Puneet&#x2019;s account of his interviews at leading tech companies:</p><ul><li><strong>Algorithmic coding interviews are everywhere! </strong>For senior+ positions, you need to get really good at these, including challenging topics like dynamic programming. We cover how to perform well in these in the article, <a href="https://newsletter.pragmaticengineer.com/p/how-to-get-unstuck-during-coding-interviews?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">How experienced engineers get unstuck in coding interviews</a></li><li><strong>Interviews are tough and time-consuming. </strong>Even after Puneet had offers, no company shortened their process. Puneet had to decline 3 more interviews &#x2013; including one at Meta &#x2013; because by the time the interviews would have come around, he had already accepted an offer at Atlassian.</li><li><strong>In a tough job market, &#x201C;top&#x201D; candidates are still in demand. 
</strong>We&#x2019;ve covered <a href="https://newsletter.pragmaticengineer.com/p/state-of-the-tech-market-in-2025-hiring-managers?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">how challenging the current tech labor market is for jobseekers</a>, but Puneet interviewed at 11 companies and got 6 offers. His applications had to have a lot going for them in order to pass the resume screenings: 10+ years of experience, and working as a Senior Software Engineer at Microsoft. He also showed up <em>really</em> well prepared.</li><li><strong>Bad luck can strike at any time</strong>. Puneet&#x2019;s interview experience at Uber seems to have been a bit unlucky: the interviewer presented as rigid and not open to dialogue. Perhaps they were having a tough day, or wanted to get the interview over with. Or it could be what Steve Yegge describes as the <a href="https://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">interviewer anti-loop</a></li></ul><p>Congrats to Puneet for accepting the Atlassian position, and thanks for sharing all these learnings!</p><hr><p>This was one out of five topics covered in <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">The Pulse #149</a>. The full edition additionally covers:</p><ul><li><strong>New trend: programming by kicking off parallel AI agents. 
</strong>More devs are experimenting with kicking off coding agents in parallel</li><li><strong>ACP protocol.</strong> A new protocol built by the Zed team, which tries to make it easier to build AI tooling for IDEs than the MCP protocol allows</li><li><strong>AI security tooling works surprisingly well?</strong> AI-powered security tools seem good at identifying security flaws in mature open source projects</li><li><strong>Is AI the only engine of US economic growth?</strong> Forty percent of US GDP this year is based on AI-related spend, while 60% of venture capital goes into AI. Hopefully, it won&#x2019;t end up as a bubble which bursts like in 2001</li></ul><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Read the full issue here</a>, and check out <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-151?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">today&#x2019;s The Pulse here</a>.</p>]]></content:encoded></item><item><title><![CDATA[New trend: programming by kicking off parallel AI agents]]></title><description><![CDATA[More devs are experimenting with kicking off coding agents in parallel]]></description><link>https://blog.pragmaticengineer.com/new-trend-programming-by-kicking-off-parallel-ai-agents/</link><guid isPermaLink="false">6903a70817b0a200016fa357</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 30 Oct 2025 17:58:38 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one topic from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>The Pulse #149</em></a><em>. 
Full subscribers received the below article two weeks ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><hr><p>With agentic command line interfaces like Claude Code, OpenAI Codex, Cursor, and many others going mainstream, I&#x2019;m seeing a trend of more software engineers experimenting with kicking off work with several agents simultaneously on separate tasks:</p><p>I talked with Anthropic engineer Sid Bidasaria about <a href="https://newsletter.pragmaticengineer.com/p/how-claude-code-is-built?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">how Claude Code is built</a>, and at the end of our conversation, he mentioned that he&#x2019;d had a few agents running throughout and that it made him more productive with work. Similarly, software engineer Simon Willison, whom I consider an AI engineering expert, has posted about &#x201C;<a href="https://simonwillison.net/2025/Oct/5/parallel-coding-agents/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">embracing the parallel coding agent lifestyle</a>.&#x201D; He writes:</p><blockquote>&#x201C;For a while now, I&#x2019;ve been hearing from engineers who run multiple coding agents at once&#x2014;firing up several Claude Code or OpenAI Codex instances at the same time, sometimes in the same repo, sometimes against multiple checkouts or git worktrees.<br><br>I was pretty skeptical about this at first. AI-generated code needs to be reviewed, which means the natural bottleneck on all of this is how fast I can review the results. 
It&#x2019;s tough keeping up with just a single LLM given how fast they can churn things out, where&#x2019;s the benefit from running more than one at a time if it just leaves me further behind?<br><br>Despite my misgivings, over the past few weeks I&#x2019;ve noticed myself quietly starting to embrace the parallel coding agent lifestyle.<br><br>I can only focus on reviewing and landing one significant change at a time, but I&#x2019;m finding an increasing number of tasks that can still be fired off in parallel without adding too much cognitive overhead to my primary work.&#x201D;</blockquote><p>Simon <a href="https://simonwillison.net/2025/Oct/5/parallel-coding-agents/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">shares advice about what works for him</a>, with research, maintenance tasks, and directed work all mentioned as use cases.</p><p><strong>It&#x2019;s interesting to consider whether parallel work with agents has the potential to overturn decades of software engineering practices. </strong>Let&#x2019;s assume software engineers who kick off multiple agents at once do become more productive than &#x201C;single-threaded&#x201D; peers who work on one problem at a time. If so, then this practice has a chance to spread, should enough software engineers seek to be more productive &#x2013; or want to avoid being left behind by some colleagues doing more than before.</p><p>But engineering in the pre-AI era was all about being in the flow for many productive engineers. 
A flow state goes something like this:</p><ul><li>Understand the moving parts</li><li>Build a solution, validate it, iterate on it</li><li>When satisfied with how it works, submit a pull request for code review &#x2014; or, if no review is needed, just merge and ship it</li></ul><p>Interrupting this process disrupts the flow state, and it takes time to get back into it: it&#x2019;s why software engineers tend to prioritize focus time, to make progress with coding work.</p><p>Of course, this isn&#x2019;t universal among all highly productive engineers; when I was an engineering manager, the most productive engineers on my team did a lot of context switching and were adept at juggling several things at once. Here&#x2019;s an average-looking day for a senior engineer acting as a tech lead:</p><ul><li><strong>Code reviews. </strong>Arrive at office, go through open code reviews from the previous night</li><li><strong>Coding.</strong> Get some of their own coding work done</li><li><strong>Standup.</strong> The usual</li><li><strong>More coding.</strong> Get the work done. <em>At least, that&#x2019;s the idea. In reality:</em></li><li><strong>Interruptions: </strong>code reviews, requests for help, taps on shoulder. The most productive engineer on a team regularly gets messages requesting code reviews to unblock teammates, or to help someone else who&#x2019;s stuck, or the manager (me &#x2013; sorry!) tapping them on the shoulder for help with something.</li></ul><p><strong>I wonder if senior+ engineers will be &#x201C;naturals&#x201D; at working with parallel AI agents, </strong>based on their existing habits and what they do currently:</p><ul><li>Keep parallel workflows in their heads; e.g., what team members are doing at any one time.</li><li>Code reviews across several workstreams: they&#x2019;re the <em>go-to</em> code reviewer, and usually review all code changes across 2-5 workstreams. 
They may not do the work, but know when it&#x2019;s correct.</li><li>Can handle interruptions: they&#x2019;ve learned how to make progress when their focus is continually being broken.</li><li>Good at directing colleagues: because they&#x2019;re regularly interrupted, they&#x2019;ve also learned how to delegate and explain urgent work to team members.</li><li>Writing skill: these engineers write a lot of code reviews, draw up documents like RFCs that outline work, create tickets to break down projects, and critique colleagues&#x2019; efforts; all this involves communicating effectively in writing.</li></ul><p>With AI agents, the qualities that make a good tech lead are within reach for engineers who want to be more productive. So far, the only people I&#x2019;ve heard are using parallel agents successfully are senior+ engineers.</p><p>Then again, this workflow hasn&#x2019;t stuck with everyone: I asked Flask creator <a href="https://newsletter.pragmaticengineer.com/p/python-go-rust-typescript-and-ai?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Armin Ronacher</a> about his experience with parallel agents. He told me:</p><blockquote>&#x201C;I sometimes kick off parallel agents, but not as much as I used to do.<br><br>The thing is: it&#x2019;s only so much my mind can review!&#x201D;</blockquote><p><strong>But we&#x2019;re in new territory now that any dev can kick off parallel coding with coding agents. </strong>Will it make engineers more productive, or will it just make people <em>feel</em> like they&#x2019;re more productive? Perhaps engineers who do one thing at a time and keep focus will be shown to produce more reliable software, over time. Or maybe it&#x2019;ll turn out that working with parallel agents leads to more issues slipping through and more iterations, which destroys any gains.</p><p>We will find out. 
Personally, I can only see more devs experimenting with parallel agents.</p><p><strong>My sense is that software engineering basics matter more when working with AI agents. </strong>I&#x2019;ve started to use AI agents for my own side projects, with success so far. I do a few things:</p><ul><li>Testing: all side projects have unit tests because I learned to not trust my own work without validation</li><li>Small, descriptive tasks: I give the agent narrowly scoped tasks, which I explain and illustrate with examples</li><li>Refactoring: every third or fourth task is for the agent to refactor some code it wrote (e.g., extract into a method, move to a new class)</li><li>Review: I track what the agent does</li><li>Do small things personally: I keep my IDE open and make any change of just a few lines by hand, so I stay aware of the codebase</li></ul><p>I keep hearing the same from other engineers: &#x201C;mandating&#x201D; engineering practices like having the agent pass all tests before continuing leads to better results. This is unsurprising and it&#x2019;s why these practices are getting popular. AI agents are non-deterministic and to some extent unreliable; these practices make them a lot more reliable and usable.</p><hr><p>This was one out of the five topics covered in <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">The Pulse #149</a>. 
The full edition additionally covers:</p><ul><li><strong>ACP protocol.</strong> A new protocol built by the Zed team, which tries to make it easier to build AI tooling for IDEs than the MCP protocol allows</li><li><strong>AI security tooling works surprisingly well?</strong> AI-powered security tools seem good at identifying security flaws in mature open source projects</li><li><strong>Is AI the only engine of US economic growth?</strong> Forty percent of US GDP this year is based on AI-related spend, while 60% of venture capital goes into AI. Hopefully, it won&#x2019;t end up as a bubble which bursts like in 2001</li><li><strong>Comparing interviews at 8 large tech companies.</strong> <a href="https://www.linkedin.com/in/puneet-patwari/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Puneet Patwari</a> applied to 8 major tech companies, and received 6 offers. He compares his interview experiences at Meta, Amazon, Uber, and 5 other workplaces</li></ul><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Read the full issue here</a>, and check out <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-151?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">today&#x2019;s The Pulse here</a>.</p>]]></content:encoded></item><item><title><![CDATA[What caused the large AWS outage?]]></title><description><![CDATA[On Monday, a major AWS outage hit thousands of sites & apps, and even a Premier League soccer game. 
An overview of what caused this high-profile, global outage]]></description><link>https://blog.pragmaticengineer.com/aws-outage-us-east-1/</link><guid isPermaLink="false">68fa5697b1a28700018a70af</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 23 Oct 2025 16:26:27 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from today&#x2019;s deepdive into </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-aws-takes-down-a-good-part?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>the recent AWS outage</em></a><em>. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><p>Monday was an interesting day: Signal stopped working, Slack and Zoom had issues, and most Amazon services were also down, together with thousands of websites and apps, across the globe. The cause was a 14-hour-long AWS outage in the us-east-1 region.</p><p>Today, we look into what caused this outage.</p><p>To its credit, AWS posted continuous updates throughout the outage. Three days after the incident, they <a href="https://aws.amazon.com/message/101925/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">released</a> a detailed postmortem &#x2013; much faster than <a href="https://newsletter.pragmaticengineer.com/p/three-cloud-providers-three-outages?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">the 4 months it took </a>in 2023 after a similarly large event.</p><p><strong>The latest outage was caused by DynamoDB&#x2019;s DNS failure. 
</strong>DynamoDB is a serverless NoSQL database built for durability and high availability, which promises <strong>99.99%</strong> uptime as its service level agreement (SLA), when set to multi-availability zone (AZ) replication. Basically, when operated in a single region, DynamoDB promises &#x2013; and delivers! &#x2013; very high uptime with low latency. Even better, while the default consistency model for DynamoDB is <a href="https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadConsistency.html?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">eventual consistency</a> (reads might not yet reflect the most recent write), reads can also be set to use <a href="https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadConsistency.html?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">strong consistency</a> (guaranteed to return the most recent data).</p><p>All these traits make DynamoDB an attractive choice for data storage for pretty much any application, and many of AWS&#x2019;s own services also depend heavily on DynamoDB. Plus, DynamoDB has a track record of delivering on its SLA promises, so the question is often not <em>why</em> to use DynamoDB, but rather, <em>why not to</em> use this highly reliable data storage. Potential reasons for not using it include complex querying, complex data models, or storing large amounts of data when storage costs are not worth it compared to other bulk storage solutions.</p><p>In this outage, DynamoDB went down, and the <strong>dynamodb.us-east-1.amazonaws.com</strong> address returned an empty DNS record. To every service &#x2013; external to AWS or AWS internal &#x2013; it seemed like DynamoDB in this AWS region had disappeared off the face of the earth! 
To understand what happened, we need to look into DynamoDB DNS management.</p><h3 id="how-dynamodb-dns-management-happens">How DynamoDB DNS management happens</h3><p>Here&#x2019;s an overview:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/10/Screenshot-2025-10-23-at-14.30.07.png" class="kg-image" alt loading="lazy" width="1802" height="1408" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/10/Screenshot-2025-10-23-at-14.30.07.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/10/Screenshot-2025-10-23-at-14.30.07.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1600/2025/10/Screenshot-2025-10-23-at-14.30.07.png 1600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/10/Screenshot-2025-10-23-at-14.30.07.png 1802w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">DNS management in DynamoDB. Can you see where a race condition could occur?</span></figcaption></figure><p>How it works:</p><ul><li><strong>DNS Planner:</strong> this service monitors load balancer (LB) health. As you can imagine, DynamoDB runs at a massive scale, and LBs can easily get overloaded or under-utilized. When overloading happens, new LBs need to be added, and when there&#x2019;s underutilization, they need to be removed. The DNS Planner creates DNS plans. Each DNS plan is a set of LBs, with weights that determine how much traffic each LB receives.</li><li><strong>DNS Enactor: </strong>the service responsible for updating the routes in Route 53, Amazon&#x2019;s DNS service. For resiliency, there is one DNS Enactor running in each availability zone (AZ). 
In us-east-1, there are 3 AZs, and 3x DNS Enactor instances.</li><li><strong>Race conditions are expected:</strong> with 3x parallel DNS Enactors working simultaneously, race conditions are expected. The system deals with this by assuming eventual consistency: even if a DNS Enactor updates Route 53 with an &#x201C;old&#x201D; plan, DNS plans are consistent with one another. Plus, updating happens quickly, and DNS Enactors only use the latest plans from the DNS Planner.</li></ul><h3 id="dynamodb-down-for-3-hours">DynamoDB down for 3 hours</h3><p>Several independent events combined to knock DynamoDB&#x2019;s DNS offline:</p><ol><li><strong>High delays on DNS Enactor #1. </strong>Updating DNS took unusually long for one DNS Enactor for some reason. Usually, these updates are rapid, but they weren&#x2019;t on 20 October.</li><li><strong>DNS Planner turns up the pace in churning out DNS plans. </strong>Just as DNS updates turned slow, the DNS Planner started to produce new plans at a much higher pace than before.</li><li><strong>DNS Enactor #2 rapidly processes DNS plans. </strong>While DNS Enactor #1 was applying DNS plans at a snail&#x2019;s pace, DNS Enactor #2 was storming through them. As soon as it finished writing these plans to Route 53, it went back to DNS Planner and deleted the old plans.</li></ol><p>These three things pushed the system into an inconsistent state and emptied out DynamoDB DNS:</p><ol><li><strong>DNS Enactor #1 unknowingly uses an old DNS plan. </strong>When DNS Enactor #2 finished applying the newest DNS plan, it went back to DNS Planner and deleted all older plans. Doing so <em>should</em> have meant that other DNS Enactors did not use old plans; but remember, DNS Enactor #1 was slow and still processing through an old plan! As a result, the check DNS Enactor #1 had made earlier, confirming that its plan was current, was stale by the time it applied that plan.</li><li><strong>DNS Enactor #2 detects the old plan being used and clears DNS records. 
</strong>DNS Enactors have another cleanup check: if they detect an old plan being used, they delete the plan itself. Deleting a plan means removing all IP addresses for the regional endpoints in Route 53. So DNS Enactor #2 emptied the DNS record for dynamodb.us-east-1.amazonaws.com!</li></ol><p>DynamoDB going down also took down all AWS services dependent on the us-east-1 DynamoDB service. From the AWS postmortem:</p><blockquote>&#x201C;All systems needing to connect to the DynamoDB service in the N. Virginia (us-east-1) Region via the public endpoint immediately began experiencing DNS failures and failed to connect to DynamoDB. This included customer traffic as well as traffic from internal AWS services that rely on DynamoDB. Customers with DynamoDB global tables were able to successfully connect to and issue requests against their replica tables in other Regions, but experienced prolonged replication lag to and from the replica tables in the N. Virginia (us-east-1) Region.&#x201D;</blockquote><p>The DynamoDB outage lasted around 3 hours; I can only imagine AWS engineers scratching their heads and wondering how the DNS records were emptied. Eventually, engineers intervened manually and brought DynamoDB back. It&#x2019;s worth remembering that bringing up DynamoDB might have included avoiding the <a href="https://newsletter.pragmaticengineer.com/i/168964142/mitigating-the-outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">thundering herd issue</a> that is typical of restarting large services.</p><p><strong>To be honest, I&#x2019;m sensing key details were omitted from the postmortem. </strong>Unanswered questions that are key to understanding what really happened:</p><ul><li>Why did DNS Enactor #1 slow down in updating DNS, compared to DNS Enactor #2?</li><li>Why did DNS Enactor #2 delete all DNS records as part of cleanup? 
This really made no sense, and it feels like there&#x2019;s an underlying reason.</li><li>The race condition of one DNS Enactor being well ahead of others seems to be easy enough to forecast. Was this the first time it happened? If not, what happened after previous, similar incidents?</li><li>Most pressingly, how will the team fix this vulnerability, which could be triggered again at any time?</li></ul><h3 id="amazon-ec2-down-for-12-more-hours">Amazon EC2 down for 12 more hours</h3><p>With DynamoDB restored, the pain was still not over for AWS. In fact, Amazon EC2&#x2019;s problems just got worse. To understand what happened, we need to understand how EC2 works:</p><ul><li><strong>DropletWorkflow Manager (DWFM) </strong>is the subsystem that manages physical servers for EC2. Think of it as the &#x201C;Kubernetes for EC2.&#x201D; The physical servers that host EC2 instances are called &#x201C;droplets.&#x201D;</li><li><strong>Lease</strong>: DropletWorkflow Manager tracks the lease for each droplet (server) so it knows if and when a server is occupied, or can be allocated to an EC2 customer. The DWFM does a status check of the server every few minutes to determine its state.</li></ul><p>State check results are stored in DynamoDB, so the DynamoDB outage caused problems:</p><p><strong>1. Leases started to time out. </strong>With state check results not returning due to the DynamoDB outage, the DropletWorkflow Manager started to mark droplets as not available.</p><p><strong>2. Insufficient capacity errors on EC2: </strong>with most leases timed out, DWFM started to return &#x201C;insufficient capacity error&#x201D; messages to EC2 customers. It <em>thought</em> servers were not available, after all.</p><p><strong>3. DynamoDB&#x2019;s return didn&#x2019;t help: </strong>when DynamoDB came back online, it should have been possible to update the status of droplets. But that didn&#x2019;t happen. 
From the postmortem:</p><blockquote>&#x201C;Due to the large number of droplets, efforts to establish new droplet leases took long enough that the work could not be completed before they timed out. Additional work was queued to reattempt establishing the droplet lease. At this point, DWFM had entered a state of <strong>congestive collapse</strong> and was unable to make forward progress in recovering droplet leases.&#x201D;</blockquote><p>It took engineers 3 more hours to come up with mitigations to get EC2 instance allocation working again.</p><p><strong>Network propagation errors took another 5 hours to fix.</strong> Even when EC2 was looking healthy on the inside, instances could not communicate with the outside world, and congestion built up inside a system called Network Manager. Also from the postmortem:</p><blockquote>&#x201C;Network Manager started to experience increased latencies in network propagation times as it worked to process the backlog of network state changes. While new EC2 instances could be launched successfully, they would not have the necessary network connectivity due to the delays in network state propagation. Engineers worked to reduce the load on Network Manager to address network configuration propagation times and took action to accelerate recovery. By 10:36 AM PT [11 hours after the start of the outage], network configuration propagation times had returned to normal levels, and new EC2 instance launches were once again operating normally.&#x201D;</blockquote><p><strong>Final cleanup took another 3 hours. </strong>After all 3 systems were fixed &#x2013; DynamoDB, EC2&#x2019;s DropletWorkflow Manager and Network Manager &#x2013; there was a bit of cleanup left to do:</p><blockquote>&#x201C;The final step towards EC2 recovery was to fully remove the request throttles that had been put in place to reduce the load on the various EC2 subsystems. 
As API calls and new EC2 instance launch requests stabilized, at 11:23 AM PDT [12 hours after the outage started] our engineers began relaxing request throttles as they worked towards full recovery. At 1:50 PM [14 hours after the outage started], all EC2 APIs and new EC2 instance launches were operating normally.&#x201D;</blockquote><p>Phew &#x2013; that was a lot of work! Props to the AWS team for working through it all in what must have been a stressful night&#x2019;s work. You can <a href="https://aws.amazon.com/message/101925/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">read the full postmortem here</a>, which details the impact on other services like the Network Load Balancer (NLB), Lambda functions, Amazon Elastic Container Service (<strong>ECS</strong>), Elastic Kubernetes Service (<strong>EKS</strong>), <strong>Fargate</strong>, Amazon Connect,<strong> AWS Security Token Service</strong>, and AWS Management Console.</p><hr><p>This was one out of four topics from today&#x2019;s The Pulse, analyzing this large AWS outage. The <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-aws-takes-down-a-good-part?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">full issue additionally covers:</a></p><ol><li><strong>Worldwide impact. </strong>From Ring cameras, Robinhood, Snapchat, and Duolingo, all the way to Substack &#x2013; sites and services went down in their thousands.</li><li><strong>Unexpected AWS dependencies. </strong>Status pages using Atlassian&#x2019;s Statuspage product could not be updated, Eight Sleep mattresses were effectively bricked for users, Postman was unusable, UK taxpayers couldn&#x2019;t access the HMRC portal, and a Premier League game was interrupted.</li><li><strong>Why such dependency on us-east-1?</strong> It feels like half of the internet is on us-east-1 for its low pricing and high capacity. 
Meanwhile, some AWS services are themselves dependent on this region.</li></ol><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-aws-takes-down-a-good-part?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>Read the full article here.</strong></a></p>]]></content:encoded></item></channel></rss>