<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[The Pragmatic Engineer]]></title><description><![CDATA[Observations across the software engineering industry.]]></description><link>https://blog.pragmaticengineer.com/</link><image><url>https://blog.pragmaticengineer.com/favicon.png</url><title>The Pragmatic Engineer</title><link>https://blog.pragmaticengineer.com/</link></image><generator>Ghost 6.36</generator><lastBuildDate>Thu, 07 May 2026 14:20:05 GMT</lastBuildDate><atom:link href="https://blog.pragmaticengineer.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[The Pulse: token spend breaks budgets – what next?]]></title><description><![CDATA[In the past 2-3 months, spending on AI agents has exploded at many tech companies. Details from 15 of them, including the different ways they are coping with this realization.]]></description><link>https://blog.pragmaticengineer.com/the-pulse-token-spend-breaks-budgets-what-next/</link><guid isPermaLink="false">69f36c81ac26b70001aa306d</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 30 Apr 2026 14:52:36 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of three topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-ai-token-spending-out-of?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the article below seven days ago. 
If you&#x2019;ve been forwarded this email, you can</em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em> subscribe here</em></a><em>.</em></p><p>Last week, we covered <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-tokenmaxxing-as-a-weird?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">the slightly perverse trend of &#x201C;tokenmaxxing&#x201D;</a> across the industry, where devs run agents with the sole aim of boosting their personal &#x201C;token stats&#x201D; in an effort to rank higher on internal token leaderboards, and to avoid being seen as Luddites who don&#x2019;t use AI tools enough compared to peers.</p><p>This week, I spoke with a software engineer at a large company and another at a seed-stage place. Both shared almost identical stories: at their latest all-hands, company leadership expressed concerns about the fast-rising costs of tokens. At both places, token spend has increased by ~10x in the last six months &#x2013; with no signs of slowing down.</p><p>I wanted to find out more about this trend, so I talked to devs at 15 businesses. Below is what I learned about what&#x2019;s happening in workplaces of all sizes. Names are anonymized.</p><h2 id="large-companies">Large companies</h2><h4 id="setting-the-default-model-to-a-cheaper-one-10000-person-saas-company-offices-on-all-continents">Setting the default model to a cheaper one: 10,000+ person SaaS company, offices on all continents</h4><p>Inside a large SaaS company, most devs use an internal background coding tool. This tool defaults to Claude Sonnet, which is the cheaper Claude model. Model selection is not persisted, so devs who prefer working with Opus, for instance, must reselect it on every subsequent startup.</p><p>This tool supports all major frontier models such as Sonnet, Opus, GPT, and Gemini. 
Devs at the company whom I talked to are very heavy users of the tool and have not encountered usage limitations.</p><h4 id="fintech-company-us-series-d-8000-people-staff-engineer">Fintech company, US, Series D, ~8,000 people. Staff engineer:</h4><blockquote>&#x201C;The cost in token spend is off the charts &#x2013; and leadership has shared this trend with us. They have not said anything beyond showing growth in spend, and mentioning that this won&#x2019;t be sustainable. So, nothing specific yet, but my sense is that something will have to change. Limits or prioritizing cheaper models, cutting back on hiring? Who knows.&#x201D;</blockquote><h4 id="infra-company-us-publicly-traded-5000-people-engineering-director">Infra company, US, publicly traded, ~5,000 people. Engineering Director:</h4><blockquote><strong>&#x201C;We&#x2019;re monitoring but not restricting.</strong> We are spot checking the heaviest users, but we are seeing the business cases working out.<br><br>We are offering some guidance on model selection - e.g., turn off the new high-effort setting in Claude. Some users are trying open source models &#x2013; but open source model usage is a bottom-up initiative, not a top-down one.&#x201D;</blockquote><h4 id="information-technology-us-10000-people-director-of-engineering">Information technology, US, 10,000+ people. Director of Engineering:</h4><blockquote>&#x201C;We have already had to raise our API budget limits multiple times in April. We recently switched to a much higher-effort level for Claude, which significantly increased the cost per PR.<br><br><strong>One reason for the cost spike is using state-of-the-art models for demanding tasks.</strong> We are using that high-effort setting even for fairly trivial tasks that could have been handled by much cheaper models, or even by lower-effort Claude loops. 
Despite a few of us pointing this out, leadership has basically said budget is not the concern right now.<br><br>I sense that the budget increase has not been forecasted, and we&#x2019;re in for a reckoning.<strong> </strong>I suspect the attitude changes once finance and other cost-conscious parts of the org realize we are spending hundreds of dollars per day, per highly-engaged developer. For now, fear of missing out and not wanting to fall behind seems to be outweighing cost discipline.&#x201D;</blockquote><h4 id="games-studio-useurope-5000-people-senior-developer">Games studio, US+Europe, ~5,000 people. Senior developer:</h4><blockquote>&#x201C;What budget increase? It&#x2019;s very hard to get a budget for AI here! Claude Code is still not rolled out because $200/month/dev is seen as too high a cost. I talk with people at startups where $1,000/month in spending is totally normal, and it&#x2019;s night and day here.&#x201D;</blockquote><h4 id="fintech-company-useurope-late-stage-5000-people-staff-engineer">Fintech company, US+Europe, late stage, ~5,000 people. Staff engineer:</h4><blockquote><strong>&#x201C;Some developers are now spending $500 a day (!!) on Claude Code.</strong> Practically speaking, this means that employee costs have doubled. Productivity has increased, in my view, but now the bottleneck is code reviews. AI can spit out code quite quickly, but we still have human reviews in place. Leadership encourages using AI for code review, but my team will not blindly trust AI.<br><br>The push from AI is coming from the top. This year&#x2019;s performance review had a section on AI, rating devs by how well they used AI, so this is another reason everyone just uses it as much as they can.&#x201D;</blockquote><h2 id="mid-sized-companies">Mid-sized companies</h2><h4 id="saas-industry-us-2000-people-dev-productivity-lead">SaaS industry, US, ~2,000 people. 
Dev Productivity Lead:</h4><blockquote>&#x201C;Model routing helped keep our costs growing less dramatically. For example, changing the default model reduced cost by 30%. This is our strategy with AI spend, summarized:<ul><li><strong>Short term: spend, spend, spend!</strong> Experiment and use whatever models make sense.</li><li><strong>Measure the impact</strong>. Measure key outcomes and report on spend, monthly.</li><li><strong>When spend vs results diverge: adjust. </strong>When our spend increases dramatically, but outcomes don&#x2019;t follow: see what we can do to adjust the delta. More spend should mean better outcomes. If not, we are doing something wrong.&#x201D;</li></ul></blockquote><h4 id="finance-industry-us-2000-people-vp-of-ai">Finance industry, US, ~2,000 people. VP of AI:</h4><blockquote>&#x201C;We have Cursor and Claude Desktop, both of which have around 800-1,200 total users. Token usage is growing somewhat unexpectedly. Estimates are being adjusted on the fly; the initial plan to have strict limits (say, $100 per user) is breaking when reality hits, and people exhaust them in 3-5 working days.<br><br>Using expensive models is a problem. In regards to Cursor, many devs are defaulting to the most expensive models without realizing that going with Opus gives single percentage gains in intelligence compared to Sonnet, for example, while exhausting their budgets almost immediately.<br><br><strong>We are working on blocking/managing out the most expensive models [with Cursor]</strong>, as going into thousands of dollars per user, per month is not sustainable on our scale. Cursor is a good partner and we&#x2019;re working with them to switch to a &#x201C;pooled spend&#x201D; model where heavy users can tap into a pool of extra spend.<br><br>Claude is a similar story. 
We were at $100 of Claude Desktop limit for everyone, but as we are moving forward, I can see that we would need to go much higher, especially for business-critical use cases.&#x201D;</blockquote><h4 id="infra-company-us-late-stage-700-people-founder">Infra company, US, late-stage, ~700 people. Founder:</h4><blockquote>&#x201C;We haven&#x2019;t had much of an issue. Most folks police themselves for runaway costs; for example, we had someone hit like $10K in a week because they messed up caching, but it was caught and they corrected their harness.<br><br><strong>For the most part, we don&#x2019;t see our high-end folks spending more than ~$1K/week.</strong> Now, to be clear, this is not a small amount! BUT it&#x2019;s already a small subset of the population.<br><br>We&#x2019;re just factoring it into engineering costs at this point: if it&#x2019;s, say, $2K/month per employee, that&#x2019;s $24K per year.<br><br>Who cares, then, when engineers already cost $200-400K/year in cash comp? Okay, so what if it&#x2019;s $5K/month. That&#x2019;s $60K/year.<br><br><strong>Our bet is that token costs will stabilize and we&#x2019;ll eventually end up with local-ish models.</strong><br><br>Now, it could be five years before they stabilize, but overall, spend today isn&#x2019;t that insane to me.<br><br>There&#x2019;s a lot of people who are just dumb about it, but most legit execs push back on this. Take the <a href="https://newsletter.pragmaticengineer.com/i/183931240/ralph-mania?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Ralph loops</a> or other insanity where someone spends $1K/day, $5K/week or stuff like this. 
That&#x2019;s all just people being fools thinking they&#x2019;re doing &#x201C;R&amp;D,&#x201D; or somehow that they&#x2019;re smarter than everyone else, but they&#x2019;re just producing junk that never ships or is not useful.<br><br><strong>We saw a bit of &#x201C;stupid overspend&#x201D; in the first couple months, but that&#x2019;s all gone now. </strong>Costs could go up even more if we would &#x201C;crack the whip&#x201D; in wanting to see even more output, but we&#x2019;re not doing that.&#x201D;</blockquote><h4 id="healthcare-industry-us-500-people-senior-engineering-manager">Healthcare industry, US, ~500 people. Senior engineering manager:</h4><blockquote>&#x201C;<strong>We are not holding back on spend, and have a monthly spend leaderboard.</strong> And we WANT devs to spend more on tokens! For example, one of my engineers spent $1,400 on a long Claude Code session in a single day.<br><br><strong>We are seeing massive leverage, and we do more with the same number of people. </strong>This is why we are okay with our spending spiking. Our traffic is growing more than 10x, year-on-year, and we have managed to keep things running with the same team, and these AI tools.<br><br>Engineering is now blocked on Product and Design &#x2013; which never happened before! This is how fast execution has become. We now have Staff+ engineers writing Product PRDs so we can move faster.<br><br>I&#x2019;ve been in tech for close to 15 years and I never saw dramatic change like this. I just came back after a 3-month break, and every single thing is different in my day! I feel these AI agents are the biggest change in the industry since high-level languages became widespread.&#x201D;</blockquote><h4 id="e-commerce-company-us-europe-2000-devs-head-of-engineering">E-commerce company, US &amp; Europe, ~2,000 devs. Head of Engineering:</h4><blockquote>&#x201C;The increase in spend is INSANE. It&#x2019;s about usage going up, with no signs of stopping. 
Usage is off the charts.<br><br>We currently do not have limits in place, and are not pausing now. Our CEO is AI-pilled and won&#x2019;t let us slow down.<br><br><strong>We do buy tokens at a discount. </strong>They start from 5% and go up with usage with the vendors we use (the usual suspects.)<br><br>We don&#x2019;t let devs use anything lower than Opus 4.7 for coding. Cheaper models might work better, but a slight error pushed to prod would result in hours of toil.&#x201D;</blockquote><h2 id="small-companies">Small companies</h2><h4 id="series-a-us-50-people-principal-engineer">Series A, US, ~50 people. Principal Engineer:</h4><blockquote>&#x201C;About 15 devs are heavy users of AI and costs are rising very fast. Almost everyone uses Claude and Claude Code. We are considering four potential options:<ol><li><strong>Increase AI budget, and start measuring more</strong>. Continue doing what we are, but allow devs to use more tokens instead of hiring limits. The precise ROI is hard to quantify, but we&#x2019;ll start to measure and track both AI adoption and impact.</li><li><strong>Optimize token consumption. </strong>Use cheaper models for simpler tasks, review token usage, and see where we can cut usage. Downside: this approach could become one with diminishing returns, fast.</li><li><strong>Integrate more AI providers in the company.</strong> Find wrappers to abstract LLMs. The problem is: how do you replace Claude Code, for instance?</li><li><strong>Pivot to local models:</strong> such as Kimi, Qwen, and so on. The problem is it&#x2019;s a big investment in high-end hardware or cloud GPUs. Upside: it offers better long-term cost control, once done.</li></ol>We are likely to go with option #1: increase spend BUT maintain momentum and put the right measurements in place. We can do #2, #3 and #4 later. But if we kill AI usage momentum inside the company, the outcome will probably be worse.&#x201D;</blockquote><h4 id="ai-infra-us-seed-stage-15-people-founder">AI infra, US, seed stage, ~15 people. 
Founder:</h4><blockquote>&#x201C;<strong>We saw a 15x increase in 6 months:</strong><ul><li>Six months ago, our spend per developer was ~$200/month</li><li>Today, it&#x2019;s around $3,000/developer/month, for our seven devs</li></ul>We&#x2019;re not slowing usage, especially as we are building an AI infra product. The increase was much faster than expected, though.&#x201D;</blockquote><h4 id="small-bootstrapped-company-europe-founding-engineer">Small, bootstrapped company, Europe. Founding engineer:</h4><blockquote>&#x201C;Our current strategy in dealing with the increase in costs is to switch to a cheaper model; unfortunately, from Opus to Sonnet in our case. That said, Sonnet is quite decent.&#x201D;</blockquote><h3 id="how-businesses-manage-token-spend">How businesses manage token spend</h3><p>Regardless of company size, there seem to be two strategies for how companies deal with increased spending. A summary:</p><p><strong>Strategy #1: &#x201C;let it rip and start measuring.&#x201D; </strong>Around half of respondents say AI spend is rising dramatically, and they have decided to do nothing to limit it. They <em>want</em> devs to use AI as much as it makes sense to, and for it to help with the work as much as possible.</p><p>However, because the cost is rising dramatically, these companies are now starting to measure usage and attempting to measure the impact of their AI tools.</p><p>There are a few companies where the impact seems to be very positive, already. 
Smaller startups whose business is exploding in numbers of customers, load, and revenue, see that they don&#x2019;t need to hire more staff because existing engineers can keep supporting the growth with AI tools.</p><p><strong>Strategy #2: curb spending.</strong> Commonly mentioned cost-saving approaches:</p><ul><li>Use cheaper models for simpler tasks</li><li>Set default models to less capable ones</li><li>Set a spending cap and make it hard for engineers to exceed it, or require consent for doing so</li></ul><p>Most companies using strategy #1 have briefly considered going with this approach, but threw it away, because they see this approach as optimizing on the wrong thing: cutting costs before the productivity impact of using state-of-the-art tools is even known!</p><p><strong>Discounts exist when the spend is in the millions of dollars. </strong>I asked several people if they are getting discounts from vendors when buying tokens at scale. There were no exact numbers, but this is what I gathered in aggregate about possible custom agreements:</p><ul><li><strong>Cursor: open to discounts above a few million dollars in spend. </strong>Companies have negotiated discounts with Cursor after crossing $1M of spending. Some companies negotiated tiered discounts from this level, starting at 5% and going higher as their spend goes up.</li><li><strong>Anthropic: no discounts. </strong>I talked with companies spending $5M+ per year on Claude which have received no discounts. If Anthropic offers discounts, it will likely be at a much higher tier.</li><li><strong>All discounts are custom, so try to negotiate &#x2013; it&#x2019;s free! </strong>Pricing discounts are on a per-customer basis, and highly custom. 
The easiest way to see if a discount is available is to ask the vendors!</li></ul><p><em>&#x2014;-</em></p><p><em>Read the full issue of </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-ai-token-spending-out-of?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em>, or check out </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-github-breaks?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>this week&#x2019;s The Pulse</em></a><em>. This week&#x2019;s issue covers:</em></p><ol><li><strong>Load from AI breaks GitHub &#x2013; but why not other vendors? </strong>GitHub&#x2019;s reliability is less than one nine, and getting worse. Prolific open source contributor, Mitchell Hashimoto, is quitting GitHub because he thinks it&#x2019;s not suited for professional work. GitHub&#x2019;s leadership blames the 3.5x increase in service load as the cause of degradation &#x2013; or it might be self-inflicted.</li><li><strong>Anthropic&#x2019;s speedrun to destroy trust.</strong> Anthropic could do no wrong until recently, but in the past month, that&#x2019;s all changed. Silently nerfing Claude Code, banning companies from Claude, and baffling price rises all add to a sense that Anthropic is in its &#x201C;extraction&#x201D; era of generating more revenue for the same or worse service.</li><li><strong>Industry pulse. 
</strong>Dramatic price increases at GitHub Copilot, explosive growth at Codex, Google scrambling to build a good coding model, Cursor might be bought by SpaceX, AI agent deletes car business, and more.</li><li><strong>Mitchell Hashimoto &amp; the &#x201C;building block economy.&#x201D; </strong>Ghostty&#x2019;s creator finds that open source &#x201C;building blocks&#x201D; are the best way to win massive adoption by software components &#x2013; but it&#x2019;s got harder to build a business on top of open building blocks.</li></ol>]]></content:encoded></item><item><title><![CDATA[The Pulse: ‘Tokenmaxxing’ as a weird new trend]]></title><description><![CDATA[At Meta, Microsoft, Salesforce and other large companies, devs are purposefully burning tokens (and money!) to inflate their AI usage and hit AI usage metrics which they treat as targets.]]></description><link>https://blog.pragmaticengineer.com/the-pulse-tokenmaxxing-as-a-weird-new-trend/</link><guid isPermaLink="false">69ea4ef45d681300012e37e5</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 23 Apr 2026 16:55:40 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-tokenmaxxing-as-a-weird?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the article below seven days ago. 
If you&#x2019;ve been forwarded this email, you can</em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em> subscribe here</em></a><em>.</em></p><p>Inside Meta, an engineer created a &#x201C;token leaderboard&#x201D; that ranks employees by token usage. Last week, The Information <a href="https://www.theinformation.com/articles/meta-employees-vie-ai-token-legend-status?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">reported</a>:</p><blockquote>&#x201C;Employees at Meta Platforms who want to show off their AI superuser chops are competing on an internal leaderboard for status as a &#x201C;Session Immortal&#x201D;&#x2014; or, even better, &#x201C;Token Legend.&#x201D;<br><br>The rankings, set up by a Meta employee on its intranet using company data, measure how many tokens &#x2014; the units of data processed by AI models &#x2014; employees are burning through. Dubbed &#x201C;Claudeonomics&#x201D; after the flagship product of AI startup Anthropic, the leaderboard aggregates AI usage from more than 85,000 Meta employees, listing the top 250 power users.<br><br>The practice is emblematic of Silicon Valley&#x2019;s newest form of conspicuous consumption, known as &#x201C;tokenmaxxing,&#x201D; which has turned token usage into a benchmark for productivity and a competitive measure of who is most AI native. Workers are maximizing their prompts, coding sessions and the number of agents working in parallel to climb internal rankings at Meta and other companies and demonstrate their value as AI automates functions such as coding.&#x201D;</blockquote><p>I spoke with a few engineers at Meta about what&#x2019;s happening, and this is what they said:</p><ul><li><strong>Massive waste. </strong>Plenty of devs are running an OpenClaw-like internal agent that burns massive amounts of tokens for little to no outcome.</li><li><strong>Outages caused by AI overuse. 
</strong>A dev mentioned that some SEVs were caused by what looked like careless AI code generation; almost like a dev behind the SEV was more concerned with churning out massive amounts of code with AI than with product quality.</li><li><strong>Gamified leaderboard. </strong>Those at the top of the leaderboard produce throwaway, wasteful work. This is painfully clear to anyone who checks Trajectories (AI prompts), which can be viewed.</li></ul><p>As per The Information, Meta employees used a total of 60.2 trillion AI tokens (!!) in 30 days. If this was charged at Anthropic&#x2019;s API prices, it would cost $900M. Of course, Meta is likely purchasing tokens at a discount, but that could still come in at $100M+ &#x2013; in large part from senseless &#x201C;tokenmaxxing&#x201D;.</p><p><strong>After backlash on social media, Meta abolished the internal leaderboard last week. </strong>One day after The Information revealed details about the incredible tokenmaxxing numbers, I confirmed that Meta has taken down its leaderboard; perhaps they realized that the incentive created enormous and unnecessary waste. If so, it&#x2019;s a bit surprising that it took media coverage for the social media giant to reach that conclusion.</p><p><strong>One engineer at Meta told me they think Meta had a different goal with the token leaderboard. </strong>A long-tenured engineer suspects increasing AI usage actually was the real goal. They said:</p><blockquote>&#x201C;Putting a leaderboard in place was always going to incentivize much more AI usage. And more AI usage means producing a lot more real-world traces. 
These traces can then be used to train Meta&#x2019;s next-generation coding model better.<br><br>I believe this was the goal, even if no one said it out loud.<br><br>It&#x2019;s an expensive way to generate data for training, but if any company has the means to do so, it&#x2019;s Meta.&#x201D;</blockquote><h3 id="microsoft-full-force-tokenmaxxing"><strong>Microsoft: full-force tokenmaxxing</strong></h3><p>Similarly, Microsoft has had an internal token leaderboard like Meta&#x2019;s since January, and it started pretty well, as I reported back at the time: there&#x2019;s an internal token dashboard that displays the individuals who use the most tokens in order to promote the use of tokens and experimentation with LLMs. At the Windows maker, this leaderboard is interesting:</p><ul><li>Very senior engineers &#x2013; distinguished-level folks &#x2013; are in the top 5 across the whole company, despite the fact that this group generally wrote little code in the past.</li><li>VP-level folks make the top 10 and top 20, despite often being in meetings for most of the day and rarely writing code.</li></ul><p>However, what starts as a metric for performance reviews or promotions can quickly become a target for devs. I talked with a software engineer at the Windows maker who admitted they&#x2019;re full-on &#x201C;tokenmaxxing&#x201D; &#x2013; not to get on the leaderboard, but rather because they don&#x2019;t want to be seen as using too few tokens:</p><blockquote>&#x201C;We have internal dashboards and metrics tracking AI usage, token usage, percentage of code written by AI vs hand-written code.<br><br>I am conscious of not wanting to be seen as &#x201C;uses too little AI,&#x201D; and I&#x2019;m not ashamed to say I need to do tokenmaxxing to do this. Things I do to inflate my token usage metrics:<ul><li>Ask AI questions about the code already in the documentation. The AI pulls up the documentation, processes it, and gives me results 10x slower, but while burning lots of tokens. 
I could use &#x201C;readthedocs&#x201D; [an internal product], but then my token numbers would be lower</li><li>Ask the AI to prototype a feature that I have no intention of working on. Prompt it a few more times, then throw the whole thing away</li><li>Default to always using the agent, even when I know I could do the work by hand much faster. Then watch it fail&#x201D;</li></ul></blockquote><p>This engineer is relatively new at the company, so is concerned about job security, and is burning far more tokens than necessary to avoid being tagged as insufficiently &#x201C;AI-native.&#x201D;</p><h3 id="salesforce-burning-tokens-to-hit-%E2%80%9Cminimum%E2%80%9D-%E2%80%9Cideal%E2%80%9D-targets"><strong>Salesforce: burning tokens to hit &#x201C;minimum&#x201D; &amp; &#x201C;ideal&#x201D; targets</strong></h3><p>Elsewhere, Salesforce has created &#x201C;tokenmaxxing&#x201D; incentives, as well. Talking with an engineer there, I learned that the company built two tools that effectively incentivize excessive spending on tokens:</p><ol><li><strong>&#x201C;Minimum&#x201D; incentives with a tracking tool.</strong> There&#x2019;s a Mac widget that shows your own spend, updated every 15 minutes. It also displays minimum expected spend. Last week, the target was $100 on Claude Code, and $70 on Cursor.</li><li><strong>Showing everyone&#x2019;s spend. </strong>A web-based tool to see the token spend of any colleague. It&#x2019;s used to check where teammates&#x2019; usage is at.</li><li><strong>&#x201C;Maximum&#x201D; spend limits that can be exceeded. </strong>Up to a week ago, there was also a <em>maximum</em> monthly limit of $250 for Claude Code and $170 for Cursor. <em>However, this can be exceeded with the simple press of a button if the limit is reached. 
I&#x2019;ve learned that last week, some engineering organisations at Salesforce had their &#x201C;maximum&#x201D; limit removed in order to &#x201C;remove any friction from the development process.&#x201D;</em></li></ol><p>The message Salesforce sends to staff is clear: &#x201C;use a minimum of $170/month tokens or be flagged.&#x201D; Who wants to get flagged for using too few tokens? The outcome is somewhat wasteful token spend:</p><ul><li><strong>Burning tokens for nothing. </strong>Devs ask Claude or Cursor: &#x201C;build me X,&#x201D; where X is a project or product with nothing to do with their work, and not something they&#x2019;d ever ship. It&#x2019;s just a way to burn tokens</li><li><strong>Calibrating token spend to be above average. </strong>Plenty of devs browse peers&#x2019; token spend to figure out the slightly-above average point, then use the tokens needed to hit that mark</li></ul><h3 id="shopify-an-example-on-how-to-avoid-tokenmaxxing"><strong>Shopify: an example on how to avoid tokenmaxxing</strong></h3><p>The first-ever token leaderboard that I&#x2019;m aware of was built by Shopify in 2025. And it worked well! Last June, the Head of Engineering at Shopify, Farhan Thawar, told me <a href="https://newsletter.pragmaticengineer.com/p/how-ai-is-changing-software-engineering?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">on The Pragmatic Engineer Podcast</a>:</p><blockquote>&#x201C;We have a leaderboard where we actively celebrate the people who use the most tokens because we want to make sure they are [celebrated] if they&#x2019;re doing great work with AI.<br><br>[And for the top people on the leaderboard,] I want to see why they spent say $1,000 a month in credits for Cursor. Maybe that&#x2019;s because they&#x2019;re building something great and they have an agent workforce underneath them!&#x201D;</blockquote><p>I asked Farhan for details on how it&#x2019;s gone since. 
Here&#x2019;s what he told me:</p><blockquote>&#x201C;We have since renamed the token leaderboard to usage dashboard: for obvious reasons, as we don&#x2019;t want to encourage &#x201C;competing&#x201D; to make it to the top of this board. We have token spend on our internal wiki profile as well as on the usage dashboard.<br><br><strong>We also have circuit breakers to catch &#x201C;runaway agents.&#x201D;</strong> So if personal spend spikes within a day, we can cut off access immediately, and you can renew if the usage spike was deliberate, or if it was a runaway agent. The circuit breaker worked well for us: we&#x2019;ve not only caught runaway agents, but found bugs in our infra this way!&#x201D;</blockquote><p>Shopify&#x2019;s approach seems to have worked for a few reasons:</p><ul><li><strong>The usage dashboard served as a &#x201C;push&#x201D; for devs to use AI tools, early-on. </strong>Last year, devs were mostly experimenting with AI tools because they were not as performant as today. The usage dashboard encouraged developers to try new tools, and highlighted power users.</li><li><strong>Circuit breakers helped.</strong> Cutting off spend when usage spikes helped catch &#x201C;runaway agents.&#x201D;</li><li><strong>High usage is looked at.</strong> Farhan checks-in with top-spending individuals to understand the use cases. 
Any tokenmaxxing would likely have been spotted at this stage, which would have been a bit embarrassing for the user!</li></ul><p>One more interesting learning Farhan shared with me: it&#x2019;s more interesting to not look at &#x201C;who spent the most in <em>overall</em> token cost?&#x201D; but instead, &#x201C;whose <em>tokens</em> cost the most?&#x201D; Devs who generate tokens that come out as expensive have turned out to do in-depth work that was interesting to learn about!</p><h3 id="tokenmaxxing-great-for-ai-vendors-bad-for-everyone-else"><strong>Tokenmaxxing: great for AI vendors, bad for everyone else</strong></h3><p>I see very few rational reasons why incentivizing tokenmaxxing makes sense for any company. It results in increasing AI spend &#x2013; by a lot! &#x2013; in return for little to no value. Heck, in some cases it actually incentivises slower work &#x2013; as shown by devs using the AI to answer questions when documentation is readily available &#x2013; and encouraging &#x2018;busywork&#x2019; where devs prompt projects that they don&#x2019;t even want to ship. Tokenmaxxing seems to push devs to focus on stuff that makes no difference to a business.</p><p>It feels to me that a good part of the industry is using token count numbers similarly to how the lines-of-code-produced metric was used years ago. There was a time when the number of lines written daily or monthly was an important metric in programmer productivity, until it became clear that it&#x2019;s a terrible thing to focus on. A lines-of-code metric can easily be gamed by writing boilerplate or throwaway code. Also, the best developers are not necessarily those who write the most code; they&#x2019;re the ones who solve hard problems for the business quickly and reliably with &#x2013; or without &#x2013; code!</p><p>Similarly, the number of tokens a dev generates can easily be gamed, and if this metric is measured then devs will indeed game it. 
But doing so generates a massive accompanying AI bill!</p><p><em>&#x2014;-</em></p><p><em>Read the full issue of </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-tokenmaxxing-as-a-weird?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em>, or check out </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-ai-token-spending-out-of?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>this week&#x2019;s The Pulse</em></a><em>. This week&#x2019;s issue covers:</em></p><ol><li><strong>New trend: token spend breaks budgets &#x2013; what next? </strong>In the past 2-3 months, spending on AI agents has exploded at many tech companies, and the ramifications of this are starting to dawn on engineering leaders. We&#x2019;ve sourced details from 15 companies, including the different ways they are coping with this realization.</li><li><strong>New trend: more AI vendors can&#x2019;t keep up with demand. </strong>Related to massively increased spending, GitHub Copilot and Anthropic are starting to limit less-profitable individual users, so they can serve business users whose spend has easily 10x&#x2019;d in the last few months. The exception is OpenAI and Codex.</li><li><strong>Morale at Meta hits all-time low? </strong>Business is booming but devs at Meta are furious and worried due to looming layoffs, and an invasive tracking program rolled out to all US employees.</li></ol>]]></content:encoded></item><item><title><![CDATA[The Pulse: is GitHub still best for AI-native development?]]></title><description><![CDATA[Availability has dropped to one nine (~90% – !!), partly due to not being able to handle increased traffic from AI coding agents. 
There’s also no CEO and an apparent lack of direction.]]></description><link>https://blog.pragmaticengineer.com/the-pulse-is-github-still-best-for-ai-native-development/</link><guid isPermaLink="false">69cfc72da95ab10001e47e0c</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Fri, 03 Apr 2026 15:03:38 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-is-github-still-best-for?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the article below eight days ago. If you&#x2019;ve been forwarded this email, you can</em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em> subscribe here</em></a><em>.</em></p><p>We&#x2019;re used to highly reliable systems which target four-nines of availability (99.99%, meaning about 52 minutes of downtime per year), and for it to be embarrassing to barely hit three nines (around 9 hours of downtime per year.) And yet, in the past month, GitHub&#x2019;s reliability is down to one nine!</p><p>Here&#x2019;s data from the third-party, &#x201C;<a href="https://mrshu.github.io/github-statuses/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">missing GitHub status page</a>&#x201D;, which was built after GitHub stopped updating its own status page due to terrible availability. 
Recently, things have looked poor:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/04/image.png" class="kg-image" alt loading="lazy" width="1456" height="399" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/04/image.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/04/image.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/04/image.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">GitHub down at one nine. Source: </em></i><a href="https://mrshu.github.io/github-statuses/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">The Missing GitHub Status Page</em></i></a></figcaption></figure><p>This means that for every 30 days, GitHub had issues on 3 days, or issues/degradations for 2.5 hours daily (around 10% of the time.)</p><p><strong>GitHub seems unable to keep up with the massive increase in infra load from agents. </strong>One software engineer built a clever website called &#x201C;<a href="https://www.claudescode.dev/?window=90d&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Claude&#x2019;s Code</a>&#x201D; that tracks Claude Code bot contributions across GitHub. 
Growth in the past three months has been enormous:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/04/image-1.png" class="kg-image" alt loading="lazy" width="1456" height="909" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/04/image-1.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/04/image-1.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/04/image-1.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Load from Claude Code has 6x&#x2019;d in 3 months. Source: </em></i><a href="https://www.claudescode.dev/?window=90d&amp;ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Claude&#x2019;s Code</em></i></a></figcaption></figure><h3 id="stream-of-github-outages-from-infra-overload">Stream of GitHub outages from infra overload</h3><p>GitHub&#x2019;s CTO, Vladimir Fedorov, addressed availability issues <a href="https://github.blog/news-insights/company-news/addressing-githubs-recent-availability-issues-2/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">in a blog post</a> and covered three major incidents:</p><ul><li><a href="https://www.githubstatus.com/incidents/xwn6hjps36ty?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">2 February</a>: security policies unintentionally blocked access to virtual machine metadata</li><li><a href="https://www.githubstatus.com/incidents/lcw3tg2f6zsd?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">9 February</a>: a database cluster got overloaded</li><li><a href="https://www.githubstatus.com/incidents/g5gnt5l5hf56?ref=blog.pragmaticengineer.com" rel="noopener 
noreferrer nofollow">5 March</a>: writes failed on a Redis cluster</li></ul><p>Software engineer Lorin Hochstein did <a href="https://surfingcomplexity.blog/2026/03/12/quick-thoughts-on-github-ctos-post-on-availability/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">a helpful analysis</a> of these outages and the CTO&#x2019;s response, and has interesting observations:</p><ul><li><strong>Saturation</strong>: the database cluster incident (9 Feb) was a case of the database getting saturated, due to higher-than-expected usage. Databases are harder to scale up than stateless services. GitHub also underestimated how much additional traffic there would be.</li><li><strong>Failover + telemetry gap</strong>: the 2 Feb incident was a combination of an infra issue in one region failing over to a healthy region, and making things worse with a telemetry gap (incorrect security policies were applied in the new regions which blocked access to VM metadata)</li><li><strong>Failover + configuration issue</strong>: the 5 March incident was uncannily similar: after a failover, a configuration issue blocked writes on a Redis cluster</li></ul><p>It is certainly nice to get details from GitHub on these outages. It feels to me that infra strains are causing more infra issues &#x2192; they trigger constraints faster &#x2192; failovers are not as smooth as they should be. Could it be because GitHub keeps changing their existing systems?</p><h3 id="startup-shows-github-how-it%E2%80%99s-done">Startup shows GitHub how it&#x2019;s done</h3><p>While GitHub struggles to keep up with the increase in load from AI agents generating more code and pull requests, a new startup called Pierre Computer claims to have built an &#x201C;AI-native&#x201D; solution for AI agents pushing code, which scales far beyond what GitHub can do. 
Pierre was founded by <a href="https://www.linkedin.com/in/jacob-thornton-13a6a5162/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Jacob Thornton</a>: formerly an engineer at Coinbase, Medium, and Twitter, and also the creator of the once-very popular <a href="https://getbootstrap.com/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Bootstrap</a> CSS library.</p><p>Here&#x2019;s what Pierre supports, which GitHub does not:</p><blockquote>&#x201C;In October [2025], Github shared they were averaging ~230 new repos per minute.<br><br>Last week we [at Pierre Computer] hit a sustained peak of &gt; 15,000 repos per minute for 3 hours.<br><br>And in the last 30 days customers have created &gt; 9M repos&#x201D;</blockquote><p>These are incredible numbers &#x2013; if also self-reported &#x2013; and something that GitHub clearly cannot get close to, at least not today! There are few details about customers, while the product &#x2013; called <a href="https://code.storage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Code.storage</a> &#x2013; seems to be in closed beta.</p><p>Still, this is the type of &#x201C;git for AI agents&#x201D; that GitHub has failed to build, and the type of infrastructure it needs badly.</p><h3 id="has-github-lost-focus-and-purpose">Has GitHub lost focus and purpose?</h3><p>GitHub&#x2019;s reliability issues are acute enough that, if it keeps up, teams will start giving alternatives like small startups such as Pierre a try, or perhaps even consider self-hosting Git. But how did the largest Git host in the world neglect its customers, and fail to prepare its infra for an increase in code commits and pull requests?</p><p>Mitchell Hashimoto, founder of Ghostty, and a heavy user of GitHub himself, had advice on what he would do if he was in charge of GitHub, after growing frustrated with the state of its core offering. 
He <a href="https://x.com/mitchellh/status/2036866220449030168?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">writes</a> (emphasis mine)</p><blockquote>&#x201C;Here&#x2019;s what I&#x2019;d do if I was in charge of GitHub, in order:<br><br><strong>1. Establish a North Star plan around being critical infrastructure for agentic code</strong> lifecycles and determine a set of ways to measure that.<br><br><strong>2. Fire everyone who works on or advocates for Copilot and shut it down.</strong> It&#x2019;s not about the people, I&#x2019;m sure there&#x2019;s many talented people; you&#x2019;re just working at the wrong company.<br><br><strong>3. Buy Pierre and launch agentic repo hosting as the first agentic product.</strong> Repos would be separate from the legacy web product to start, since they&#x2019;re likely burdened with legacy cross product interactions.<br><br><strong>4. Re-evaluate all product lines and initiatives against the new North Star. </strong>I suspect 50% get cut (to make room for different ones).<br><br>The big idea is all agentic interactions should critically rely on GitHub APIs. Code review should be agentic but the labs should be building that into GH (not bolted in through GHA like today, real first class platform primitives). GH should absolutely launch an agent chat primitive, agent mailboxes are obviously good. GH should be a platform and not an agent itself.<br><br>This is going to be very obviously lacking since I only have external ideas to work off of and have no idea how GitHub internals are working, what their KPIs are or what North Star they define, etc.<br><br>But, with imperfect information, this is what I&#x2019;d do.&#x201D;</blockquote><p>My sense is that GitHub has three concurrent problems:</p><ul><li><strong>GitHub and Copilot are entangled with Microsoft&#x2019;s internal politics. 
</strong>GitHub&#x2019;s Copilot in 2021 was the first massively successful &#x201C;AI product.&#x201D; Microsoft took the &#x201C;Copilot&#x201D; brand and used it across all of their product lines, creating low-quality AI integrations. Simultaneously, internal Microsoft orgs like Azure and Microsoft AI were trying to get their hands on GitHub, which is one of the most positive developer brands at Microsoft.</li><li><strong>GitHub has no leader, seemingly by design. </strong>GitHub&#x2019;s last CEO was Thomas Dohmke, who stepped down voluntarily, and Microsoft never backfilled the CEO role, instead carrying out a reorg to make GitHub part of Microsoft&#x2019;s AI group and stripping its independence. It seems the &#x201C;Microsoft AI&#x201D; side won that battle.</li><li><strong>GitHub has no focus, and is stuck chasing Copilot as a revenue source. </strong>GitHub has no CEO and is caught up in internal politics, so what can GitHub teams do? The safest bet is to increase revenue and the best way to do that is by investing more into GitHub Copilot, and ignoring long-term issues like reliability.</li></ul><p>I agree with Mitchell: GitHub has no &#x201C;North Star&#x201D; and we see a large org being dysfunctional. That lack of vision &#x2013; and CEO &#x2013; is hitting hard:</p><ul><li>GitHub Copilot went from being the most-used AI agent in 2021 to being overtaken by Claude Code, and is soon to be overtaken by Cursor.</li><li>As a platform, GitHub has no vision for how to evolve to support AI agents. Sure, GitHub has an MCP server, but it has no &#x201C;AI-native git platform&#x201D; that can handle the massive load AI agents generate.</li><li>GitHub keeps shipping small features and improvements without direction. For example, in October 2025, they <a href="https://x.com/jaredpalmer/status/1980619222918262842?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">started to work on</a> stacked diffs. 
However, when it ships, the stacked diffs workflow might be mostly obsolete &#x2013; at least with AI agents!</li></ul><p>It&#x2019;s easy to win a market when you do one thing better than anyone else in the world. Right now, GitHub is doing too many things and doing a subpar job with Copilot, its platform, and AI infra.</p><hr><p>Read the full issue of <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-is-github-still-best-for?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">last week&#x2019;s The Pulse</a>, or check out <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-industry-leaders-return?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">this week&#x2019;s The Pulse</a>.</p><p>Catch up with recent The Pragmatic Engineer issues:</p><ul><li><a href="https://newsletter.pragmaticengineer.com/p/scaling-uber-with-thuan-pham-ubers?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>Scaling Uber with Thuan Pham</strong></a> (Uber&#x2019;s first CTO &#x2014; podcast). We went into topics like scaling Uber from constant outages to global infrastructure, the shift to microservices and platform teams, and how AI is reshaping engineering.</li><li><a href="https://newsletter.pragmaticengineer.com/p/building-whatsapp-with-jean-lee?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>Building WhatsApp with Jean Lee</strong></a> (podcast): Jean Lee, engineer #19 at WhatsApp, on scaling the app with a tiny team, the Facebook acquisition, and what it reveals about the future of engineering.</li><li><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-what-will-the-staff-engineer?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>What will the Staff Engineer role look like in 2027 and beyond</strong></a><strong>?</strong> What happens to the Staff engineer role when agents write more code? 
Actually, they could be more in demand than ever!</li></ul>]]></content:encoded></item><item><title><![CDATA[Is the FDE role becoming less desirable?]]></title><description><![CDATA[Job postings for Forward Deployed Engineers (FDEs) have surged, but many professionals don’t want the role because it’s more like solutions engineering than software development.]]></description><link>https://blog.pragmaticengineer.com/is-the-fde-role-becoming-less-desirable/</link><guid isPermaLink="false">69c5918c3f13830001776a97</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Fri, 27 Mar 2026 10:29:33 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-is-the-fde-role-becoming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the article below seven days ago. If you&#x2019;ve been forwarded this email, you can</em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em> subscribe here</em></a><em>.</em></p><p>An interesting trend highlighted <a href="https://www.wsj.com/cio-journal/the-hottest-job-in-tech-isnt-very-glamorous-dc29ab3e?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">by The Wall Street Journal</a>: companies want to hire for FDE roles, but devs are just not that interested:</p><blockquote>&#x201C;Job postings on Indeed grew more than 10-fold in 2025 compared with 2024. The number of public company transcripts mentioning the role jumped to 50 from eight over the same period, according to data from AlphaSense.<br><br>The only problem? 
Few engineers want the job, which has historically been seen as demanding, undesirable, and less prestigious than product-focused engineering roles.<br><br>&#x201C;Everyone wants them and there&#x2019;s only maybe 10% of the market that wants that role,&#x201D; said Patrick Kellenberger, president and chief operating officer at Betts Recruiting.&#x201D;</blockquote><p>Last summer, we covered <a href="https://newsletter.pragmaticengineer.com/p/forward-deployed-engineers?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">the rise of the FDE role</a>, and looked into what it&#x2019;s like. Back then, this is how I visualized what was then a very hot role:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-3.png" class="kg-image" alt loading="lazy" width="1280" height="798" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/03/image-3.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/03/image-3.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-3.png 1280w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">My 2025 visualization of the FDE role</em></i></figcaption></figure><p>At the companies where I interviewed FDE folks &#x2013; OpenAI and Ramp &#x2013; the role seemed to live up to this visualization. However, I&#x2019;ve since talked with two engineers who took FDE roles and were disappointed. 
This is how they saw it, in practice:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-4.png" class="kg-image" alt loading="lazy" width="1400" height="1094" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/03/image-4.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/03/image-4.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-4.png 1400w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Reality of the FDE role: less software engineering, and even less platform engineering</em></i></figcaption></figure><p>The role seems akin to a &#x201C;sales engineer&#x201D; where FDEs help close the deals, or a solutions engineer (or even consultant), where FDEs deploy to a customer to build them a solution. They don&#x2019;t contribute back into the platform, and don&#x2019;t do much that&#x2019;s considered &#x201C;software engineering&#x201D; beyond integrating software which the product team built.</p><p>Some engineers figure out the nature of the role during the interview process and pass on it. Meanwhile, some others take the job and later quit. Here&#x2019;s what I heard from a dev who accepted an FDE role at a company, but didn&#x2019;t find what they expected:</p><blockquote>&#x201C;This FDE job was a typical IT services mindset. The company wanted to use me more on the engagement lead side, and nothing on software development. It&#x2019;s not what I signed up for, and I didn&#x2019;t like the vibe and culture. 
I quit 4 weeks later.&#x201D;</blockquote><p>In today&#x2019;s job market, if there&#x2019;s high demand for a role which pays decently but attracts little interest from engineers, there&#x2019;s always a reason!</p><hr><p>Read the full issue of <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-is-the-fde-role-becoming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">last week&#x2019;s The Pulse</a>, or check out <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-is-github-still-best-for?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">this week&#x2019;s The Pulse</a>.</p><p>Catch up with recent The Pragmatic Engineer issues:</p><ul><li><a href="https://newsletter.pragmaticengineer.com/p/building-whatsapp-with-jean-lee?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>Building WhatsApp with Jean Lee</strong></a> (podcast): Jean Lee, engineer #19 at WhatsApp, on scaling the app with a tiny team, the Facebook acquisition, and what it reveals about the future of engineering.</li><li><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-what-will-the-staff-engineer?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>The Pulse: What will the Staff Engineer role look like in 2027 and beyond?</strong></a><strong> </strong>What happens to the Staff engineer role when agents write more code? 
Actually, they could be more in demand than ever!</li><li><a href="https://newsletter.pragmaticengineer.com/p/from-ides-to-ai-agents-with-steve?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>From IDEs to AI Agents with Steve Yegge (podcast):</strong></a> Steve Yegge on how AI is reshaping software engineering, the rise of &#x201C;vibe coding,&#x201D; and why developers must adapt to a rapidly changing craft.</li></ul>]]></content:encoded></item><item><title><![CDATA[The Pulse: Cloudflare rewrites Next.js as AI rewrites commercial open source]]></title><description><![CDATA[An engineer at Cloudflare rewrote most of Vercel’s Next.js in one week with AI agents. It looks like a sign of how AI will disrupt existing moats and business models. Analysis]]></description><link>https://blog.pragmaticengineer.com/the-pulse-cloudflare-rewrites-next-js-as-ai-rewrites-commercial-open-source/</link><guid isPermaLink="false">69a9c3bb4c4eb80001b25ced</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 05 Mar 2026 18:03:16 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>Pragmatic Engineer</em></a><em>. This issue is the </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-164-nextjs?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>entire The Pulse issue</em></a><em> from the past week, which paying subscribers received seven days ago. 
This piece generated </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-164-nextjs/comments?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>quite a few comments across subscribers</em></a><em>, and so I&apos;m sharing it more broadly, especially as it raises questions on what is defensible and what is not with open source.</em></p><p><em>If you&#x2019;ve been forwarded this email, you can</em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em> <u>subscribe here</u></em></a><em> to get issues like this in your inbox.</em></p><p>Today&#x2019;s issue of The Pulse focuses on a single event because it&#x2019;s a significant one with major potential ripple effects. On Tuesday, Cloudflare shocked the dev world by announcing that they have rewritten&#xA0;<a href="http://next.js/?ref=blog.pragmaticengineer.com">Next.js</a>&#xA0;in just one week, with a single developer who used only $1,100 in tokens:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image.png" class="kg-image" alt loading="lazy" width="1186" height="1342" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/03/image.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/03/image.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image.png 1186w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Cloudflare CTO Dane Knecht&#xA0;</em></i><a href="https://x.com/dok2001/status/2026386974580330830?s=20&amp;ref=blog.pragmaticengineer.com" rel><i><em class="italic" style="white-space: pre-wrap;">on X</em></i></a></figcaption></figure><p>There are several layers to dig into here:</p><ol><li><strong>The Next.js ecosystem: a 
recap</strong>. Close to half of React devs use Next.js, and the best place to deploy Next.js is on Vercel &#x2013; partly thanks to its proprietary build output.</li><li><strong>What Cloudflare did with Next.js</strong>. Replacing the build engine in Next.js with the more standard Vite one, allowing Next.js apps to be easily deployed on Cloudflare.</li><li><strong>AI brings the impossible within reach</strong>. What would take years in engineering terms was executed in one week with some tokens.</li><li><strong>&#x201C;AI slop&#x201D; still an issue.</strong>&#xA0;Contrary to Cloudflare&#x2019;s claims, vinext is not production-ready, and will need plenty of cleanup and auditing to make it on par with Next.js.</li></ol><h2 id="1-the-nextjs-ecosystem-a-recap"><br>1. The Next.js ecosystem: a recap</h2><p>First, some background.&#xA0;<a href="https://nextjs.org/?ref=blog.pragmaticengineer.com">Next.js</a>&#xA0;is the most popular fullstack React framework and around half of all React devs use it, as per recent research such as the 2025 Stack Overflow developer survey. Next.js is an open source project, built and mostly maintained by Vercel, which is the preferred deployment target for Next.js applications for many reasons. One of them is that Next.js is ideal to deploy to Vercel because Next.js applications are built with Vercel&#x2019;s Turbopack build tool. The output of a build is a proprietary format. As Netlify engineer Eduardo Bou&#xE7;as&#xA0;<a href="https://eduardoboucas.com/posts/2025-03-25-you-should-know-this-before-choosing-nextjs/?ref=blog.pragmaticengineer.com">writes</a>:</p><blockquote>&#x201C;The output of a Next.js build has a proprietary and undocumented format that is used in Vercel deployments to provision the infrastructure needed to power the application.<br><br>This means that any hosting providers other than Vercel must build on top of undocumented APIs that can introduce unannounced breaking changes in minor or patch releases. 
(And they have)&#x201D;.</blockquote><p>Next.js is an interestingly built project, where everything is open source, and the best place to deploy a Next.js application is on Vercel, as it&#x2019;s optimized to run undocumented build artifacts the most efficiently. This is a smart strategy from Vercel which competitors will dislike, as any hosting provider would prefer Next.js to produce a standard build format. To do this, the build engine, Turbopack, would need to be replaced with something more standard.</p><p><strong>Let&#x2019;s talk about build tools for web development.&#xA0;</strong>According to the&#xA0;<a href="https://2025.stateofjs.com/en-US/libraries/?ref=blog.pragmaticengineer.com">State of JS 2025 survey</a>, the most popular in the web ecosystem are:</p><ol><li><a href="https://vite.dev/?ref=blog.pragmaticengineer.com"><strong>Vite</strong></a>: the most popular choice for new projects due to its speed and developer experience. Uses projects like&#xA0;<a href="https://esbuild.github.io/?ref=blog.pragmaticengineer.com">esbuild</a>&#xA0;and&#xA0;<a href="https://rollupjs.org/?ref=blog.pragmaticengineer.com">Rollup</a>&#xA0;under the hood</li><li><a href="https://webpack.js.org/?ref=blog.pragmaticengineer.com"><strong>Webpack</strong></a>: a legacy tool that&#x2019;s not very performant, but still widely deployed in older projects</li><li><a href="https://nextjs.org/docs/app/api-reference/turbopack?ref=blog.pragmaticengineer.com"><strong>Turbopack</strong></a>: Created by Vercel and optimized for larger&#xA0;<a href="http://next.js/?ref=blog.pragmaticengineer.com">Next.js</a>&#xA0;applications. Built in Rust and intended to be more performant</li><li><a href="https://bun.com/?ref=blog.pragmaticengineer.com"><strong>Bun</strong></a>: a relatively new, all-in-one runtime and bundler. 
Anthropic acquired the team&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/180722007/anthropic-acquires-javascript-runtime-bun?ref=blog.pragmaticengineer.com">in December</a>, and some Bun folks are now focused on improving Claude Code&#x2019;s performance.</li></ol><p>So, most of the web ecosystem uses Vite as a build tool; Next.js uses Turbopack, and the majority of React applications with a full-stack React framework use Next.js. Basically, most devs who are not on Next.js are likely to use Vite as their build tool.</p><h2 id="2-what-cloudflare-did-with-nextjs"><br>2. What Cloudflare did with Next.js</h2><p>Here&#x2019;s a naive idea: what if Next.js used Vite to generate build outputs? In that case, build outputs would be standardized and would run equally well on any cloud provider, as there would be nothing proprietary or undocumented to Vercel.</p><p>And this is what Cloudflare did: replace Turbopack with Vite and call the new package &#x2018;vinext&#x2019;:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-1.png" class="kg-image" alt loading="lazy" width="1442" height="1024" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/03/image-1.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/03/image-1.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-1.png 1442w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Cloudflare replaced the Turbopack build dependency with Vite to create vinext</em></i></figcaption></figure><p>Buried midway in the announcement is how this project&#xA0;<a href="https://blog.cloudflare.com/vinext/?ref=blog.pragmaticengineer.com#status-experimental">is experimental</a>&#xA0;and not at all guaranteed 
to work okay: it&#x2019;s a &#x2018;use-at-own-risk&#x2019; project. Still, the mere fact of this development feels like an earthquake in the tech world because of&#xA0;<em>how</em>&#xA0;it was pulled off.</p><h2 id="3-ai-brings-the-impossible-within-reach"><br>3. AI brings the impossible within reach</h2><p>In a blog post announcing the project, Cloudflare claims only one engineer &#x201C;rebuilt&#x201D; the whole thing in a way that&#x2019;s trivial to deploy to Cloudflare&#x2019;s own infrastructure, and only cost $1,100 in tokens. From Cloudflare&#x2019;s&#xA0;<a href="https://blog.cloudflare.com/vinext/?ref=blog.pragmaticengineer.com">statement</a>:</p><blockquote>&#x201C;Last week, one engineer and an AI model rebuilt the most popular front-end framework from scratch. The result, vinext (pronounced &#x201C;vee-next&#x201D;), is a drop-in replacement for Next.js, built on Vite, that deploys to Cloudflare Workers with a single command. In early benchmarks, it builds production apps up to 4x faster and produces client bundles up to 57% smaller. And we already have customers running it in production.<br><br>The whole thing cost about $1,100 in tokens&#x201D;.</blockquote><p>What Cloudflare did:</p><ul><li>Took the Next.js public API</li><li>Reimplemented behaviour using Vite</li><li>Created build output whose behaviour matches the &#x201C;original&#x201D; Next.js implementation</li></ul><p>After 10 years, the core of Next has around 194,000 lines of code (LOC)**. 
Meanwhile,&#xA0;<a href="https://github.com/cloudflare/vinext?ref=blog.pragmaticengineer.com">vinext</a>&#xA0;is about 67,000 lines of code, which suggests a much leaner implementation: for example, vinext does not need to support legacy Next APIs, and vinext currently supports 94% of the Next.js API (and it&#x2019;s safe to assume they left complex edge cases in the remaining 6%).<br><br>** the Next.js repository is closer to 2M lines of code: 1M is bundled dependencies (e.g., React bundles, CSS build, etc.), tests are 308,000 LOC, Turbopack 311,000 LOC.</p><p><strong>Pre-AI, this reimplementation would have taken years of engineering time to complete.&#xA0;</strong>Doing what Cloudflare did was always possible<em>&#xA0;in theory</em>, but never seemed practical. I mean, why have a team of engineers spend potentially years on generating a standardized build output for Next.js apps? Even if they did, the dev community would have doubts about whether Cloudflare would maintain the project.</p><p>This is the thing with forking or rewriting open source projects: a major value proposition for commercial open source is the knowledge that the software will be&#xA0;<em>maintained</em>. Vercel has proved to be a reliable custodian of Next.js over the past 10 years. Without AI, it could be assumed that any new reimplementation would eventually run out of steam.</p><p><strong>Separately but relatedly, Cloudflare has now proved that rewriting&#xA0;<em>existing</em>&#xA0;software has become ~100x cheaper, thanks to AI, and the same economics are likely to apply to maintenance, too.&#xA0;</strong>Considering how trivial it was to rebuild one of the more complex open source projects, this augurs well for it being trivial and much cheaper to maintain in the future. 
Potentially, Cloudflare no longer needs to budget an engineering team only for maintenance, if a single engineer could maintain the project, part-time!</p><p>Cloudflare had a project measured in engineering years, and completed it in&#xA0;<em>one engineering week</em>! It just took a single engineer using&#xA0;<a href="https://opencode.ai/?ref=blog.pragmaticengineer.com">OpenCode</a>&#xA0;(open source coding agent), Opus 4.5, and a bunch of tokens, then: &#x2018;<em>boom&#x2019;</em>,&#xA0;<em>vinext</em>&#xA0;was born.</p><h2 id="4-%E2%80%9Cai-slop%E2%80%9D-still-an-issue">4. &#x201C;AI slop&#x201D; still an issue</h2><p>There are questions about the quality of vinext, though.<strong>&#xA0;</strong>Vercel, naturally, is unhappy and hit out at the obvious weakness that vinext is unfit for production usage because it&#x2019;s insecure. Vercel CEO, Guillermo Rauch, did not miss a beat by tying Cloudflare&#x2019;s effort to the &#x201C;vibe coding&#x201D; stereotype of sloppy work executed with a lack of understanding:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-2.png" class="kg-image" alt loading="lazy" width="1194" height="794" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/03/image-2.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/03/image-2.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/03/image-2.png 1194w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Guillermo Rauch&#xA0;</em></i><a href="https://x.com/rauchg/status/2026864132423823499?s=20&amp;ref=blog.pragmaticengineer.com" rel><i><em class="italic" style="white-space: pre-wrap;">on X</em></i></a></figcaption></figure><p>Guillermo has a point: anyone who 
stopped reading&#xA0;<a href="https://blog.cloudflare.com/vinext/?ref=blog.pragmaticengineer.com">Cloudflare&#x2019;s launch announcement</a>&#xA0;after the first few sentences would assume it&#x2019;s production-ready, with the first paragraph of this announcement closing with:</p><p>&#x201C;And we already have customers running it in production.&#x201D;</p><p>However, Cloudflare doesn&#x2019;t&#xA0;<a href="https://blog.cloudflare.com/vinext/?ref=blog.pragmaticengineer.com#status-experimental">share</a>&#xA0;the rather crucial detail that &#x201C;running in production&#x201D; means that vinext has been deployed onto a beta site, until more than 1,000 words (around 2&#x2013;3 pages) into the announcement:</p><blockquote>&#x201C;We want to be clear: vinext is experimental. It&#x2019;s not even one week old, and it has not yet been battle-tested with any meaningful traffic at scale. (...)<br><br>We&#x2019;ve been working with National Design Studio, a team that&#x2019;s aiming to modernize every government interface,&#xA0;<strong>on one of their beta sites</strong>, CIO.gov.</blockquote><p>Oh. So, &#x201C;customers running it in production&#x201D; at Cloudflare apparently means &#x201C;customer running a beta site in production without meaningful traffic.&#x201D; This is a first from the infrastructure giant, which usually prides itself on accurate statements!</p><p>This detail was also absent when Cloudflare&#x2019;s CEO and CTO&#xA0;<a href="https://x.com/eastdakota/status/2026389179345916255?s=20&amp;ref=blog.pragmaticengineer.com">were boosting</a>&#xA0;vinext like it was a mature, battle-tested product. 
In that context, Vercel&#x2019;s raising of the issue of security vulnerabilities is more than fair game, in my view.</p><p>Still, all that doesn&#x2019;t alter the core learning from this project: that AI has the power to drastically reduce engineering time by up to ~100x and deliver&#xA0;<em>usable-enough</em>&#xA0;output, for relatively negligible financial cost.&#xA0;<em>Just keep in mind that security and reliability issues will probably take plenty of extra time and effort to address.</em></p><h2 id="5-new-attack-vector-on-commercial-open-source">5. New attack vector on commercial open source?</h2><p>If arch-rivalries exist in tech, then Cloudflare and Vercel are a prime example. Both are gunning to become the most popular platform for developers to deploy their code, and the CEOs are regularly seen in public taking shots at the other side. One such spat happened&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/160004343/ceos-scrap?ref=blog.pragmaticengineer.com">in March</a>, as covered at the time:</p><blockquote>&#x201C;Things kicked off on social media, with developers confused about the severity of the incident, and about why Next.js seemed silent, and also why Cloudflare sites were breaking due to its fix for the CVE causing its own issues. It was at that point that Cloudflare&#x2019;s CEO, Matthew Prince, entered the chat to accuse Vercel of&#xA0;<a href="https://x.com/rauchg/status/1903590962498326771?ref=blog.pragmaticengineer.com">not caring about security</a>:<br><br>Given the security incident was ongoing, this felt a bit &#x201C;below the belt&#x201D; by the Cloudflare chief. Criticizing rivals is fair game, but why not wait until the incident is over? 
The punch landed, and Vercel&#x2019;s CEO Guillermo Rauch is not someone to take it lying down, so he&#xA0;<a href="https://x.com/rauchg/status/1903590962498326771?ref=blog.pragmaticengineer.com">hit back</a>.<br><br>Cloudflare&#x2019;s CEO then responded with a cartoon&#xA0;<a href="https://x.com/eastdakota/status/1903690805576909227?ref=blog.pragmaticengineer.com">implying</a>&#xA0;that although Vercel is much larger than its competitor Netlify, Cloudflare is 100x bigger than both, and could stomp them into the ground at will.&#x201D;</blockquote><p>Serving the public interest wasn&#x2019;t why Cloudflare rewrote&#xA0;<a href="http://next.js/?ref=blog.pragmaticengineer.com">Next.js</a>: they did it because they want Next.js sites to be deployed onto Cloudflare, but doing so made little sense until now because Next.js produced bespoke build output optimized for Vercel&#x2019;s infrastructure. With this change, Cloudflare&#xA0;<a href="https://blog.cloudflare.com/vinext/?ref=blog.pragmaticengineer.com">claims</a>&#xA0;it provides&#xA0;<em>superior&#xA0;</em>performance when hosting Next.js apps, according to their own measurements.</p><p><em>I&#x2019;d just add that performance is important for developers, but other things matter, too. Cost, reliability, developer experience, and how much devs like a company, are all factors in choosing between vendors. 
Also, performance measurements from a vendor about its own service must be taken with a large pinch of salt.</em></p><p><strong>Zooming out from this episode, it seems that AI is bringing the value of existing commercial open source moats into question.&#xA0;</strong>Vercel carved out a clever open source strategy that helped turn its open source investment into business revenue:</p><ol><li>Build and maintain Next.js, delivering the best developer experience (DX).</li><li>Optimize Vercel to serve the specific (and undocumented) build output of Next.js.</li><li>Most developers onboarding to Next.js will decide to deploy on Vercel to get the most benefit, in terms of DX and performance.</li><li>&#x2026; repeat for years while the business becomes worth billions! (Vercel was&#xA0;<a href="https://startupwired.com/2025/10/01/vercel-raises-300-million-reaches-9-3-billion-valuation/?ref=blog.pragmaticengineer.com">valued</a>&#xA0;at $9B last October).</li></ol><p>Underpinning this success are some assumptions:</p><ol><li>Next.js will remain the #1 choice for developers to build React applications, thanks to ongoing investment.</li><li>It is expensive to rewrite Next.js to be deployable and performant on another cloud vendor.</li><li>Even if someone did #2, developers would be skeptical and not switch over.</li></ol><p>Vercel can invest in #1 to keep Next as best-in-class, while knowing that the risk of #2 occurring is minor. 
However, Cloudflare has now &#x201C;cloned&#x201D; Next, and can easily keep up with all changes in the future, and port them over to vinext.</p><p><strong>AI makes it trivial to &#x201C;piggyback&#x201D; off any commercial open source project, which is a massive problem for commercial open source startups.&#xA0;</strong>Vercel puts all the effort and investment into building and maintaining&#xA0;<a href="http://next.js/?ref=blog.pragmaticengineer.com">Next.js</a>, while Cloudflare enjoys the benefit of this hard work (the Next.js public API), which is now easily deployable to Cloudflare, and it can now undercut Vercel on price. For all future Next.js changes, Cloudflare will just sync them to vinext, using AI!</p><p>WordPress had&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/149770356/2-open-source-business-model-struggles-wordpress?ref=blog.pragmaticengineer.com">a similar problem</a>, with WP Engine &#x201C;piggybacking&#x201D; off its work and undercutting its pricing in 2024. As I analyzed at the time:</p><blockquote>&#x201C;Free-riding on permissive open source is too tempting to pass on for other vendors. WP Engine uses a common loophole of contributing almost nothing in R&amp;D to WordPress, while selling it as a managed service. This means that they could either easily undercut the pricing of larger players like Automattic which do spend on WordPress&#x2019;s R&amp;D. Alternatively, a company like WP Engine could charge as much, or more, as Automattic, but be able to spend a lot more on marketing, while being similarly profitable. &#x201C;Saving&#x201D; on R&amp;D gives the &#x201C;free-riders&#x201D; plenty of options to grow their businesses: options not necessarily open to Automattic while they invest as much into R&amp;D as they do.<br><br>Commercial open source vendors pressure to end &#x201C;freeriding&#x201D;. 
Automattic is likely facing lower revenue growth, with customers choosing vendors like WP Engine which offer a similar service &#x2014; getting these customers either via a cheaper price or thanks to more marketing spend. This legal fight could be an effort to force WP Engine to stop eating Automattic&#x2019;s lunch, or perhaps get WP Engine to sell to Automattic, which would cement its leading status in managed Wordpress, while also boosting revenue by $400M a year &#x2013; according to its own figures&#x201D;.</blockquote><p>Vercel managed to avoid the &#x201C;free-riding&#x201D; problem with&#xA0;<a href="http://next.js/?ref=blog.pragmaticengineer.com">Next.js</a>, but that&#x2019;s no longer possible now that AI makes it trivial to rewrite.</p><h2 id="6-defense-or-offense"><br>6. Defense or offense?</h2><p>How should commercial open source companies respond to the threat that a competitor can easily rewrite the software behind the managed solutions which they sell as services?</p><p><strong>One obvious response is to make tests private, so that replication is harder for AI.&#xA0;</strong>One thing that made it so easy for Cloudflare to rewrite Next was the project&#x2019;s comprehensive test suite. From&#xA0;<a href="https://blog.cloudflare.com/vinext/?ref=blog.pragmaticengineer.com">their announcement<u>&#xA0;</u></a>(emphasis mine):</p><blockquote>&#x201C;We also want to acknowledge the Next.js team. They&#x2019;ve spent years building a framework that raised the bar for what React development could look like.&#xA0;<strong>The fact that their</strong>&#xA0;API surface is so well-documented and their&#xA0;<strong>test suite so comprehensive</strong>&#xA0;is a big part of what made this project possible.&#x201D;</blockquote><p>Database solution SQLite is famous for its incredible test suite. 
What some people don&#x2019;t know is that while core&#xA0;<a href="https://sqlite.org/?ref=blog.pragmaticengineer.com">SQLite</a>&#xA0;tests are open source, its most comprehensive test suite &#x2013;&#xA0;<a href="https://sqlite.org/testing.html?ref=blog.pragmaticengineer.com">TH3</a>&#xA0;&#x2013; is closed source. SQLite monetizes its advanced infrastructure as a&#xA0;<a href="https://sqlite.org/prosupport.html?ref=blog.pragmaticengineer.com">service</a>&#xA0;for purchase. This is a fair tradeoff: for most contributors, the basic open source tests work well enough. For enterprise users or customers who really care about correctness, it makes sense to purchase advanced testing services from the service&#x2019;s creator.</p><p>Open source canvas project, tldraw,&#xA0;<a href="https://github.com/tldraw/tldraw/issues/8082?ref=blog.pragmaticengineer.com">announced</a>&#xA0;it will relocate its test suite to a closed source repository; a move which makes plenty of sense. Here&#x2019;s commentary from Simon Willison:</p><blockquote>&#x201C;It&#x2019;s become very apparent over the past few months that a comprehensive test suite is enough to build a completely fresh implementation of any open source library from scratch, potentially in a different language.&#x201D;</blockquote><p>In the event, tldraw&#x2019;s announcement turned out&#xA0;<a href="https://github.com/tldraw/tldraw/issues/8082?ref=blog.pragmaticengineer.com#issuecomment-3964650501">to be a joke</a>, but who&#x2019;s laughing now? An open source project with excellent tests is an easy target for an AI agent to execute a full rewrite of it.</p><p><strong>Could new licenses be created for the AI era?&#xA0;</strong>Existing open source licenses were created on the assumption that humans read open source code, and humans modify it. Agents break that assumption.</p><p>Could we see new license types emerge to ban AI agents from modifying projects&#x2019; source code? 
It seems pretty far-fetched and hard to implement, but not beyond the realms of possibility.</p><p>AI agents are still very new, and are only now going mainstream in tech. Once they break into other industries, I wouldn&#x2019;t be surprised if legal frameworks are reworded to also apply to AI agents. If and when this happens, it would open the path for open source licenses to distinguish between agents and humans.</p><p><strong>What is a moat, if code can be trivially ported?&#xA0;</strong>A team operating a popular open source project can no longer assume that the project is expensive to fork or to completely rewrite, meaning it makes sense to focus on other moats, such as:</p><ul><li><strong>Outstanding (paid) support.</strong>&#xA0;AI could make this much easier to provide, and at a higher quality, if done right.</li><li><strong>Smaller open core, larger closed source part.&#xA0;</strong>&#x201C;Open core&#x201D; as a business model has been dominant for commercial open source: keep the core of the software open source, while advanced enterprise features are source available or closed source. I would expect more companies to move their additional services to closed source, not source available.</li><li><strong>In-person connection and community.</strong>&#xA0;Projects with a real-world community will form a sense of connection that goes beyond code. For example, it&#x2019;s hard to imagine vinext meetups popping up &#x2013; whereas there are many Next.js communities.</li><li><strong>Infrastructure and hardware remain a massive moat.&#xA0;</strong>In a world where software is trivial to copy, infrastructure remains a moat. Commercial open source might make most sense for players that own and operate infrastructure layers superior to those of their rivals, and that can offer lower cost, higher reliability, lower latency, higher performance, or a combination of these.</li></ul><h2 id="7-ai-world-reality"><br>7. 
AI-world reality</h2><p><strong>One of the best AI use cases is full-on rewrites of well-tested products.&#xA0;</strong>I estimate that AI sped up the creation of vinext by at least 100x, which is massive. But we don&#x2019;t really see efficiency boosts of anything like that with AI tools, in general. As Laura Tacho&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/189035949/1-data-vs-hype-how-orgs-actually-win-with-ai?ref=blog.pragmaticengineer.com">shared</a>&#xA0;at The Pragmatic Summit in San Francisco, the average self-reported efficiency &#x2018;AI gain&#x2019; seems to be circa 10%.</p><p>I suspect this vast gap in efficiency boosts exists because AI is many times more efficient at &#x201C;no-brainer tasks&#x201D; where correctness can be verified with tests, versus those which are more open ended or involve more creativity.</p><p><strong>In general, tests are incredibly important for efficient AI usage.&#xA0;</strong>On The Pragmatic Engineer Podcast, Peter Steinberger stressed how important it is to &#x201C;close the loop&#x201D; in his developer flow, by instructing the AI to test itself and ensuring it has tests to run that verify correctness.</p><p>Automated tests were always considered a best practice for creating maintainable code. Now, having a codebase with extensive tests is the baseline to make AI agents work productively for refactors, rewrites &#x2013; or even adding new features and verifying that things did not break!</p><p><strong>Vendors will start to deploy &#x201C;migration AI agents&#x201D; to move customers over to their own stacks.&#xA0;</strong>This got lost in Cloudflare&#x2019;s announcement, but it&#x2019;s&#xA0;<a href="https://github.com/cloudflare/vinext?ref=blog.pragmaticengineer.com">important</a>:</p><blockquote>vinext includes an Agent Skill that handles migration for you. It works with Claude Code, OpenCode, Cursor, Codex, and dozens of other AI coding tools. 
Install it, open your Next.js project, and tell the AI to migrate:<br><br><em>&gt; npx skills add cloudflare/vinext</em><br><br>Then open your Next.js project in any supported tool and say:<br><br><em>&gt; migrate this project to vinext</em><br><br>The skill handles compatibility checking, dependency installation, config generation, and dev server startup. It knows what vinext supports and will flag anything that needs manual attention.</blockquote><p>This is very clever from Cloudflare, and a true &#x201C;AI-native&#x201D; move. They have not only used AI to rewrite Next.js, but also built an &#x201C;AI plugin&#x201D; (a skill) to help customers migrate their existing codebases over to vinext &#x2013; and deploy on Cloudflare!</p><p>This move will surely be copied by other vendors, since migrations which are tedious for humans are much less effort with agents.</p><p><strong>AI is making the tech industry more ruthless when it comes to business practices.&#xA0;</strong>Laura Tacho said something interesting at The Pragmatic Summit:</p><blockquote>&#x201C;AI is an accelerator, it&#x2019;s a multiplier, and it is moving organizations in different directions.&#x201D;</blockquote><p>AI seems to be accelerating the ruthlessness of competition for customers and the speed at which this happens. In one week, Cloudflare rebuilt Next.js, and it&#x2019;s attacking Vercel full-on: claiming its &#x201C;vibe coded&#x201D; alternative is more performant and production-ready, and burying deep inside the launch announcement the crucial information that vinext is very much experimental.</p><p>I sense vendors are realizing that there&#x2019;s a limited amount of time in which to use AI to their advantage, and some will decide to use it like Cloudflare has.</p><p><strong>On the other hand, AI could be great news for non-commercial open source.&#xA0;</strong>AI presents a threat to commercial open source because it removes the existing moat of code being hard to fully rewrite. 
However, beyond that, AI could help non-commercial open source to thrive:</p><ul><li>With AI, it&#x2019;s easy to fork an open source project and keep the fork in-sync with the original.</li><li>It&#x2019;s trivial to instruct AI to rewrite an open source project to another language or framework.</li><li>&#x2026;and it&#x2019;s equally trivial for AI to add features to a fork.</li></ul><p>For these reasons, I believe there could be a lot more forks and rewrites to come, and more open source projects and code, in general.</p><h2 id="takeaways"><br>Takeaways</h2><p>Personally, I could not have imagined things changing this quickly in software. Rewriting Next.js in a single week, even to a version that is not quite there &#x2013; but mostly works? This was out of the question as recently as a few months ago.</p><p>Things changed around last December, when Opus 4.5 and GPT-5.2 came out and proved capable&#xA0;<a href="https://newsletter.pragmaticengineer.com/p/when-ai-writes-almost-all-code-what?ref=blog.pragmaticengineer.com">of writing most of the code</a>. What used to be expensive is now cheap &#x2013; like rewriting complete projects &#x2013; and we still need to learn what the &#x201C;new&#x201D; expensive parts of software engineering are.</p><p>All this is new territory for everyone. To succeed in the tech industry, you need to be able to capitalize upon change, as Cloudflare has clearly done in this case by making the most of an opportunity created by new technology. 
It&#x2019;s unclear how popular vinext will become, and how much of a moat Vercel has around the broader Next.js ecosystem, but I suspect that it&#x2019;d take more than a Next rewrite to make Cloudflare into a viable Next.js platform-as-a-service provider.</p>]]></content:encoded></item><item><title><![CDATA[I replaced a $120/year micro-SaaS in 20 minutes with LLM-generated code]]></title><description><![CDATA[ I used to pay $120/year for a SaaS that hasn’t added new features in four years, and didn’t fix its broken billing system for three years. Using an LLM, I managed to rewrite all the functionality I used to pay for in 20 minutes. Is this bad news for “write once, don’t update later” SaaS?]]></description><link>https://blog.pragmaticengineer.com/i-replaced-a-120-year-micro-saas-in-20-minutes-with-llm-generated-code/</link><guid isPermaLink="false">697ba13c7779050001e3775d</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 29 Jan 2026 18:41:45 GMT</pubDate><content:encoded><![CDATA[<p>I have been sceptical of the manifold claims that software-as-a-service (SaaS) will be killed by LLMs. The theory behind this idea is:</p><ol><li>SaaS is a pure software product. People who pay SaaS vendors do so because it&#x2019;s cheaper to buy this software than build it.</li><li>LLMs dramatically reduce the time and cost of building custom software.</li><li>Therefore, most SaaS vendors will go out of business because most companies/teams will prompt an LLM to write the software they need, such as for ticketing, meetings, customer relationship management, etc.</li></ol><p>The reason for my scepticism has been that SaaS such as HR software Workday is&#xA0;<em>more</em>&#xA0;than just software. 
Workday, for example, keeps up with compliance requirements (e.g., for holiday pay in different countries), guarantees correctness (e.g., payslips that comply with local regulations), and over time the software keeps up to date with changes in the external and internal environments.</p><p><strong>However, this week I had first-hand experience of how ridiculously easy it is now to replace SaaS with LLMs.&#xA0;</strong>On my website &#x2013;&#xA0;<a href="http://pragmaticengineer.com/?ref=blog.pragmaticengineer.com">pragmaticengineer.com</a>&#xA0;&#x2013; I have a testimonials section, which displays real LinkedIn and X posts about this publication. It cost $120/year for a small service called&#xA0;<a href="https://shoutout.io/?ref=blog.pragmaticengineer.com">Shoutout.io</a>, and looked like this:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/01/image.png" class="kg-image" alt loading="lazy" width="1390" height="1120" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/01/image.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/01/image.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/01/image.png 1390w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Testimonials, nicely collected and rendered by Shoutout</em></i></figcaption></figure><p>And this is the backend: nothing fancy, just a way to add, edit, reorganize, and delete testimonials.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/01/image-1.png" class="kg-image" alt loading="lazy" width="1456" height="922" 
srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/01/image-1.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/01/image-1.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/01/image-1.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Shoutout&#x2019;s admin interface</span></figcaption></figure><p>I was a customer for four years and logged in perhaps once a year. My latest login was to get an annual invoice for my expenses. Unfortunately, the billing section was broken, so I emailed support and they sent me a broken link instead of the invoice. This was frustrating: why pay for a SaaS with broken billing? I couldn&#x2019;t even tell what they would charge me next year.</p><p><strong>So I asked myself if I could rebuild my own use case with an LLM, and do it rapidly.&#xA0;</strong>My use case was much simpler than the SaaS itself:</p><ul><li>Display existing testimonials in a similar way</li><li>Make it easy to add new ones, e.g., store testimonials in some JSON format</li><li>Make it look good</li></ul><p>To my surprise, this whole effort from start to finish took exactly 20 minutes with Codex. 
The steps I took were straightforward enough:</p><ul><li>Asked Codex to make a plan on how to remove this third-party dependency and host all testimonials in my codebase (a GitHub repo, deployed onto Netlify)</li><li>Tweaked the plan: I pushed for a modular approach where testimonials are in a separate JSON file, and they get generated into HTML with a compile-time build step</li><li>Added this build step both locally and as a build trigger on Netlify</li><li>Tested the solution</li><li>Tweaked the UX and generated a schema</li><li>Deployed it</li></ul><p>The end result is visually the same as before, except I no longer have a third-party dependency rendering all of this!</p><h3 id="what-does-this-mean-for-saas-products-and-software-engineers">What does this mean for SaaS products and software engineers?</h3><p>What it means for software engineers:</p><ul><li><strong>Devs are (probably) a lot more comfortable using the command line for future updates than regular users.&#xA0;</strong>To add a future testimonial, I&#x2019;ll need to turn to my AI agent to insert it in my codebase, and I&#x2019;ll then need to verify that it looks good. This is not a big deal for me, but it might be a dealbreaker for someone not comfortable with verifying the code output of an LLM.</li><li><strong>It&#x2019;s a lot faster for a dev to &#x201C;port&#x201D; a SaaS than for anyone else.&#xA0;</strong>I first told Codex to copy the UI and it got things wrong because it tried to use a flexbox model. I had to tell it that this UI layout was not what I wanted, and then make the decision on which framework to use for the UI layout. A non-developer could probably figure all this out, but it would take longer.</li><li><strong>Honestly, it&#x2019;s fun and interesting to rewrite a third-party feature. I recommend it.&#xA0;</strong>Part of why I took on this project is because I expected it to be an interesting challenge. 
I expected it to take more effort than it did, and I&#x2019;ve learned more about how well these tools work. I also used Codex to get more hands-on experience with it.</li></ul><p>What this could mean for SaaS products:</p><ul><li><strong>Rebuilding a SaaS still feels much harder than rebuilding&#xA0;<em>your specific</em>&#xA0;use case.&#xA0;</strong>I did not &#x201C;rebuild&#x201D; Shoutout in any way. Shoutout has at least 10x as many features as I used, like adding quotes from 10 different platforms, authentication, billing (which didn&#x2019;t work for me), and more.</li><li><strong>A SaaS that doesn&#x2019;t give ongoing value is at risk of being replaced by customers.&#xA0;</strong>Shoutout doesn&#x2019;t provide ongoing value after it displays my testimonials, and this static nature means it&#x2019;s easy to replace. In contrast, it would be harder to rebuild if I paid for the platform to stay compliant, provide analytics or alerting, and do other real-time things that helped my business.</li><li><strong>Buying and selling SaaS businesses could become less profitable.&#xA0;</strong>The original version of Shoutout that I signed up for in 2021 was built in 2020 by an independent developer. In 2022, this developer&#xA0;<a href="https://www.indiehackers.com/post/my-startup-shoutout-has-been-acquired-0350ae659c?ref=blog.pragmaticengineer.com">sold this micro-SaaS</a>&#xA0;to a product studio. Then, in 2025, Shoutout&#xA0;<a href="https://x.com/davidsonkyle/status/1942207611006542317?s=20&amp;ref=blog.pragmaticengineer.com">was sold</a>&#xA0;again to new developers. From my point of view, nothing changed except that the billing system broke. I assume the buyers of this SaaS figured that revenue could keep rising with zero investment. 
But perhaps at some point that ceases to be true when people get fed up with a broken product and quit &#x2013; especially when doing so is cheaper.</li></ul><p><strong>&#x201C;Broken windows&#x201D; not being fixed is less acceptable than it used to be.&#xA0;</strong>My journey away from Shoutout began with its billing system being broken. For example, below is what I saw when I went to my billing section to see the invoices:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/01/image-2.png" class="kg-image" alt loading="lazy" width="1220" height="428" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2026/01/image-2.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2026/01/image-2.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2026/01/image-2.png 1220w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">A trigger to quit: Billing had been broken since 2023 and was never fixed</span></figcaption></figure><p>As well as this, the customer support sent me a broken link in response to my email. That was enough for me to decide to replace this dependency, and I was surprised by how easy this was with an LLM and knowing what I wanted it to build.&#xA0;<em>By the time customer support sent me a working link two hours later, I had finished migrating off the SaaS.</em></p>]]></content:encoded></item><item><title><![CDATA[The grief when AI writes most of the code]]></title><description><![CDATA[When AI writes almost all code, what happens to software engineering? 
There is grief involved for us developers, that's for sure.]]></description><link>https://blog.pragmaticengineer.com/the-grief-when-ai-writes-most-of-the-code/</link><guid isPermaLink="false">695eab59af96490001536b9c</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Wed, 07 Jan 2026 18:53:57 GMT</pubDate><content:encoded><![CDATA[<p>I&#x2019;m coming to terms with the high probability that AI will write most of&#xA0;<em>my</em>&#xA0;code which I ship to prod, going forward. It already does it faster, and with similar results to if I&#x2019;d typed it out. For languages/frameworks I&#x2019;m less familiar with, it does a better job than me.</p><p>It feels like something valuable is being taken away, and suddenly. It took a&#xA0;<em>lot</em>&#xA0;of effort to get good at coding and to learn how to write code that works, to read and understand complex code, and to debug and fix when code doesn&#x2019;t work as it should. I still remember how daunting my first &#x201C;real&#x201D; programming class was at university (learning C), how lost I felt on my first job with a complex codebase, and how it took years of practice, learning from other devs, books, and blogs, to get better at the craft. Once you&#x2019;re pretty good, you have something that&#x2019;s valuable and easy to validate by writing code that works!</p><p>Some of my best memories of building software are about coding. Being &#x201C;locked in&#x201D; and balancing several ideas while typing them out, of being in the zone, then compiling the code, running it and seeing that &#x201C;<em>YES&#x201D;,</em>&#xA0;it worked as expected!</p><p>It&#x2019;s been a love-hate relationship, to be fair, based on the amount of focus needed to write complex code. 
Then there&#x2019;s all the conflicts that time estimates caused: time passes differently when you&#x2019;re locked in and working on a hard problem.</p><p>Now, all that looks like it will be history.</p><p>I wonder if I&#x2019;ll still get the same sense of satisfaction from the fact that writing complicated code is&#xA0;<em>hard</em>? Yes, AI is convenient, but there&#x2019;s also a loss.</p><p>Or perhaps with AI agents, being &#x201C;in the zone&#x201D; will shift to thinking about higher-level problems, while instructing more complex code to be written?</p><hr><p>This was a section from my analysis piece <a href="https://newsletter.pragmaticengineer.com/p/when-ai-writes-almost-all-code-what?ref=blog.pragmaticengineer.com" rel="noreferrer">When AI writes almost all code, what happens to software engineering?</a>. Read the full one <a href="https://newsletter.pragmaticengineer.com/p/when-ai-writes-almost-all-code-what?ref=blog.pragmaticengineer.com" rel="noreferrer">here</a>.</p>]]></content:encoded></item><item><title><![CDATA[The Pulse: Cloudflare’s latest outage proves dangers of global configuration changes (again)]]></title><description><![CDATA[Deja vu: a large Cloudflare outage caused by an instantly rolled-out global config change – two weeks after a similar problem]]></description><link>https://blog.pragmaticengineer.com/the-pulse-cloudflares-latest-outage/</link><guid isPermaLink="false">69443c5d272393000120055e</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 18 Dec 2025 17:44:21 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-156?ref=blog.pragmaticengineer.com" rel="noreferrer"><em><u>last week&#x2019;s The Pulse</u></em></a><em> issue. 
Full subscribers received the below article seven days ago. If you&#x2019;ve been forwarded this email, you can</em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em> <u>subscribe here</u></em></a><em>.</em></p><p>A mere two weeks after <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Cloudflare suffered a major outage</a> and took down half the internet, the same thing has happened again. Last Friday, 5th December, thousands of sites went down or partially down once more, in a global Cloudflare outage lasting 25 minutes.</p><p>As per last time, Cloudflare was speedy to share <a href="https://blog.cloudflare.com/5-december-2025-outage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">a full postmortem</a> on the same day. It estimated that 28% of Cloudflare&#x2019;s HTTP traffic was impacted. The cause of this latest outage was Cloudflare making a seemingly innocent &#x2013; but <em>global</em> &#x2013; configuration change that went on to take out a good portion of Cloudflare, <em>globally</em>, until being reverted. Here&#x2019;s what happened:</p><ul><li>Cloudflare was rolling out a fix for a nasty React security vulnerability</li><li>The fix caused an error in an internal testing tool</li><li>The Cloudflare team disabled the testing tool with a global killswitch</li><li>As this global configuration change was made, the killswitch unexpectedly caused a bug that resulted in HTTP 500 errors across Cloudflare&#x2019;s network</li></ul><p><strong>In this latest outage, Cloudflare was burnt by yet another global configuration change. </strong>The previous outage <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">in November</a> happened thanks to a global database permissions change. 
In the postmortem of that incident, the Cloudflare team closed with this action item:</p><blockquote>&#x201C;Hardening ingestion of Cloudflare-generated configuration files in the same way we would for user-generated input&#x201D;</blockquote><p>This change would stop Cloudflare&#x2019;s configuration files from propagating immediately to the full network, as they still do now. But making <em>all</em> global configuration files have staged rollouts is a large implementation effort that could take months. Evidently, there wasn&#x2019;t time to complete it yet, and it has come back to bite Cloudflare.</p><p>Unfortunately for Cloudflare, customers are likely to find a second outage unacceptable when it has similar causes to one only weeks earlier. If Cloudflare proves unreliable, customers should plan to onboard to <em>backup</em> CDNs at the very least, and a backup CDN vendor will do its best to convince new customers to use it as the primary CDN.</p><p>Cloudflare&#x2019;s value-add rests on rock-solid reliability without customers needing to budget for a backup CDN. Yes, publishing postmortems on the same day as an outage occurs helps restore trust, but that trust will crumble anyway with repeated large outages.</p><p><strong>To be fair, the company is doubling down on implementing staged configuration rollouts. </strong>In its postmortem, Cloudflare is its own biggest critic. CTO Dane Knecht <a href="https://blog.cloudflare.com/5-december-2025-outage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">reflected</a>:</p><blockquote>&#x201C;[Global configuration changes rolling out globally] remains our first priority across the organization. 
In particular, the projects outlined below should help contain the impact of these kinds of changes:<br><br><strong>Enhanced Rollouts &amp; Versioning:</strong> Similar to how we slowly deploy software with strict health validation, data used for rapid threat response and general configuration needs to have the same safety and blast mitigation features. This includes health validation and quick rollback capabilities among other things.<br><br><strong>Streamlined break glass capabilities: </strong>Ensure that critical operations can still be achieved in the face of additional types of failures. This applies to internal services as well as all standard methods of interaction with the Cloudflare control plane used by all Cloudflare customers.<br><br><strong>&#x201C;Fail-Open&#x201D; Error Handling: </strong>As part of the resilience effort, we are replacing the incorrectly applied hard-fail logic across all critical Cloudflare data-plane components. If a configuration file is corrupt or out-of-range (e.g., exceeding feature caps), the system will log the error and default to a known-good state or pass traffic without scoring, rather than dropping requests. Some services will likely give the customer the option to fail open or closed in certain scenarios. This will include drift-prevention capabilities to ensure this is enforced continuously.<br><br>These kinds of incidents, and how closely they are clustered together, are not acceptable for a network like ours&#x201D;.</blockquote><h3 id="global-configuration-errors-often-trigger-large-outages">Global configuration errors often trigger large outages</h3><p>There&#x2019;s a pattern of implicit or explicit global configuration errors causing large outages, and some of the biggest ones in recent years were caused by a single change being rolled out to a whole network of machines:</p><ul><li><strong>DNS and DNS-related systems like BGP:</strong> DNS changes are global by default, so it&#x2019;s no wonder that DNS changes can cause global outages. 
Meta&#x2019;s <a href="https://en.wikipedia.org/wiki/2021_Facebook_outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">7-hour outage in 2021</a> was related to DNS changes (more specifically, Border Gateway Protocol changes.) Meanwhile, the AWS outage in October <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-aws-takes-down-a-good-part?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">started with</a> the internal DNS system.</li><li><strong>OS updates happening at the same time, globally: </strong>Datadog&#x2019;s <a href="https://newsletter.pragmaticengineer.com/p/inside-the-datadog-outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">2023 outage</a> cost the company $5M and was caused by Datadog&#x2019;s Ubuntu machines executing an OS update within the same time window, globally. It caused issues with networking, and it didn&#x2019;t help that Datadog ran its infra on 3 different cloud providers across 3 networks. The same kind of Ubuntu update also <a href="https://newsletter.pragmaticengineer.com/p/why-reliability-is-hard-at-scale?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">caused a global outage</a> for Heroku in 2024.</li></ul><p><strong>Globally replicating configs: </strong><a href="https://newsletter.pragmaticengineer.com/i/168964142/google-cloud-globally-replicating-a-config-triggers-worldwide-outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">in 2024</a>, a configuration policy change was rolled out globally and crashed every Spanner database node straight away. 
As Google concluded in <a href="https://newsletter.pragmaticengineer.com/i/168964142/google-cloud-globally-replicating-a-config-triggers-worldwide-outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">its postmortem</a>: &#x201C;Given the global nature of quota management, this metadata was replicated globally within seconds&#x201D;.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/12/image.png" class="kg-image" alt loading="lazy" width="1456" height="970" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/12/image.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/12/image.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/12/image.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Step 2 &#x2013; replicating a configuration file globally across GCP &#x2013; </em></i><a href="https://newsletter.pragmaticengineer.com/i/168964142/google-cloud-globally-replicating-a-config-triggers-worldwide-outage?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">caused a global outage</em></i></a><i><em class="italic" style="white-space: pre-wrap;"> in 2024</em></i></figcaption></figure><p>Implementing gradual rollouts for <em>all</em> configuration files is a <em>lot</em> of work. It&#x2019;s also invisible labor: when done well, its benefits are undetectable, except as an absence of incidents thanks to better infrastructure!</p><p><strong>The largest systems in the world will likely have to implement safer ways to roll out configs &#x2013; but not everybody needs to. 
</strong>Staged configuration rollout doesn&#x2019;t make much sense for smaller companies and products because this infra work slows down product development.</p><p>It doesn&#x2019;t just slow down building, but every deployment, too: this friction is designed to make everything slower. As such, staged rollouts don&#x2019;t make much sense unless the stability of mature systems is more important than fast iteration.</p><p>Software engineering is a field where tradeoffs are a fact of life, and universal solutions don&#x2019;t exist. The development approach that worked for a system with 1/100th of the load and users a year ago may not make sense today.</p><p><em>This was one out of the four topics covered in this week&#x2019;s The Pulse. </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-156?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>The full edition</em></a><em> additionally covers:</em></p><ol><li><strong>Industry Pulse.&#xA0;</strong>Poor capacity planning at AWS, Meta moves to a &#x201C;closed AI&#x201D; approach, a looming RAM shortage, early-stage startups hiring slower than before, how long it takes to earn $600K at Amazon and Meta, Apple loses execs to Meta, and more</li><li><strong>How the engineering team at Oxide uses LLMs.&#xA0;</strong>They find LLMs great for reading documents and lightweight research, mixed for coding and code review, and a poor choice for writing documents &#x2013; or any kind of writing, really!</li><li><strong>Linux officially supports Rust in the kernel.&#xA0;</strong>Rust is now a first-class language inside the Linux kernel, eight months after a Linux Foundation Fellow&#xA0;<a href="https://newsletter.pragmaticengineer.com/p/how-linux-is-built-with-greg-kroah?ref=blog.pragmaticengineer.com">predicted</a>&#xA0;more support for Rust. 
A summary of the pros and cons of Rust support for Linux</li></ol><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-156?ref=blog.pragmaticengineer.com" rel="noreferrer"><strong>Read the full The Pulse issue</strong></a><strong>.</strong></p>]]></content:encoded></item><item><title><![CDATA[The Pulse: Could a 5-day RTO be around the corner for Big Tech?]]></title><description><![CDATA[From next February, workers at Instagram must be in the office, five days a week. This makes Meta the second tech giant after Amazon to mandate a 5-day RTO. Will more big companies do the same?]]></description><link>https://blog.pragmaticengineer.com/the-pulse-could-a-5-day-rto-be-around-the-corner-for-big-tech/</link><guid isPermaLink="false">693b1247dd0e8a0001c79f46</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Sat, 13 Dec 2025 15:21:25 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-155?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the below article seven days ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><hr><p>A year ago, Amazon became the first tech giant to bring staff back into the office for the full five days per week. 
Back then, I&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/149104874/what-does-amazons-day-rto-mean-for-tech?ref=blog.pragmaticengineer.com">analyzed</a>&#xA0;the reasons for the change, and whether other workplaces would follow suit by dropping the widespread hybrid policy of 2-3 days/week in the office.</p><p>Now, Meta employees in the Instagram division have become the latest subjects of a full return to the office, following an announcement by the social media platform this week.</p><h3 id="instagram%E2%80%99s-5-day-return-to-office">Instagram&#x2019;s 5-day return to office</h3><p>Instagram employees&#xA0;<a href="https://sources.news/p/instagrams-return-to-office-mandate?ref=blog.pragmaticengineer.com">received the unexpected email on Monday</a>, reports fellow Substacker, Alex Heath, who acquired a copy of the message. It was sent internally by Instagram CEO Adam Mosseri, who wrote:</p><blockquote>&#x201C;<strong>1. Back to the office:</strong>&#xA0;I believe that we are more creative and collaborative when we are together in-person. (...)<br><br><strong>2. Fewer meetings:</strong>&#xA0;We all spend too much time in meetings that are not effective, and it&#x2019;s slowing us down. Every six months, we&#x2019;ll cancel all recurring meetings and only re-add the ones that are absolutely necessary (...)<br><br><strong>3. More demos, less decks:</strong>&#xA0;Most product overviews should be prototypes instead of decks.<br><br><strong>4. 
Faster decision-making:</strong>&#xA0;We&#x2019;re going to have a more formalized unblocking process with DRIs, and I&#x2019;ll be at the priorities progress unblocking meeting every week.&#x201D;</blockquote><p>This decision by Meta affects around a quarter of company staff, and it&#x2019;s hard to imagine other divisions not following Instagram&#x2019;s lead; after all, everything in Mosseri&#x2019;s memo likely applies across the business.</p><p>Five years ago, CEO Mark Zuckerberg predicted 50% of Meta staff would work remotely by now, which didn&#x2019;t happen. Indeed, with Instagram&#x2019;s new 5-day RTO, I&#x2019;d be surprised if 5% of Meta folks work remotely in two years&#x2019; time.</p><p><strong>The reason for Insta&#x2019;s RTO seems rooted in the leadership&#x2019;s belief that in-office is more productive,&#xA0;</strong>as indicated by the top bullet point of Mosseri&#x2019;s message. That message in full:</p><p>&#x201C;I believe that we are more creative and collaborative when we are together in-person. I felt this pre-COVID and I feel it any time I go to our New York office where the in-person culture is strong.</p><p>Starting February 2, I&#x2019;m asking everyone in my rollup based in a US office with assigned desks to come back full time (five days a week). The specifics:</p><ul><li>You&#x2019;ll still have the flexibility to work from home when you need to, since I recognize there will be times you won&#x2019;t be able to come into the office. I trust you all to use your best judgment in figuring out how to adapt to this schedule.</li><li>In the NY office, we won&#x2019;t expect you to come back full time until we&#x2019;ve alleviated the space constraints. We&#x2019;ll share more once we have a better sense of timeline.</li><li>In MPK [Menlo Park, the HQ], we&#x2019;ll move from MPK21 to MPK22 on January 26 so everyone has an assigned desk. 
We&#x2019;re also offering the option to transfer from the MPK to SF office for those people whose commute would be the same or better with that change. We&#x2019;ll reach out directly to those people with more info.</li><li>XFN [cross-functional] partners will continue to follow their own org norms.</li><li>There is no change for employees who are currently remote&#x201D;.</li></ul><p>From what I&#x2019;ve seen of Mosseri from afar, he seems like a pretty straight shooter. It&#x2019;s clear that he feels in-office creates more energy, and in Mosseri&#x2019;s defense, I hear similar from many startup founders and leaders who say remote work causes a bunch of headaches: it&#x2019;s harder to spot motivational problems and performance issues, information travels more slowly, and rallying teams is harder.</p><p><strong>There&#x2019;s no doubt that running a full-remote company is a lot of effort.&#xA0;</strong>There&#x2019;s often-overlooked labor involved in hiring, onboarding, performance management, team celebrations, and even company-wide meetings &#x2013; none of it is easy.</p><p>Linear is a full-remote company with nearly 50 people working there, which&#xA0;<a href="https://linear.app/now/designing-remote-work-at-linear?ref=blog.pragmaticengineer.com">recently published details about how it operates</a>. They&#x2019;re introducing the concept of &#x201C;coworking hubs&#x201D;, flying in teams for in-person events, and holding regular off-sites, while being careful to hire people who fit the culture.</p><p><strong>My feeling is that remote work policies at tech companies are going to become questions of their leaders&#x2019; preferences.&#xA0;</strong>Many devs prefer remote work: there&#x2019;s fewer interruptions, more deep focus, and less commuting. 
Most of us would probably be just as productive working remotely &#x2013; and probably more so &#x2013; compared with being interrupted in-office.</p><p>Leaders who prefer full-remote can cite flexibility and easier hiring from a larger pool of candidates as clear benefits. Meanwhile, those most comfortable with in-person will always have enough reasons to justify a 5-day RTO, along the lines of Mosseri&#x2019;s reasoning. Advocates of hybrid setups cite balancing of focus time and efficiency.</p><p>In today&#x2019;s job market, any company that pays closer to the top of the market can probably get away with five-days-a-week RTO. Meta is in this space, and although I&#x2019;m sure plenty of devs will dislike the change, the alternative is to go out on the job market, accept a pay cut to join a new company, and start rebuilding your internal network.</p><p>We&#x2019;re in the&#xA0;<a href="https://newsletter.pragmaticengineer.com/p/state-of-the-tech-market-in-2025?ref=blog.pragmaticengineer.com">midst of a weird job market</a>, which makes switching jobs more difficult than before, when the job market was very hot. In this respect, Instagram has external conditions on its side. For devs at Meta, one upside is that Big Tech experience&#xA0;<a href="https://newsletter.pragmaticengineer.com/p/tech-jobs-market-2025-part-3?ref=blog.pragmaticengineer.com">opens more doors</a>, even in this tough job market.</p><p>One caveat is that a 5-day RTO is unlikely in places where it&#x2019;s hard to hire the right people. So, AI engineers and those working on AI products should be pretty safe, for instance, because those roles are&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/172584839/ai-engineering-trends?ref=blog.pragmaticengineer.com">incredibly in demand</a>, as indicated by the&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/165280420/new-trend-higher-base-salaries-for-ai-engineers?ref=blog.pragmaticengineer.com">trend of higher base salaries for AI engineers</a>. 
Based on that, few companies should want to push those workers to quit to join competitors.</p><p></p><p><em>Many subscribers expense this newsletter to their learning and development budget. If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/" rel="noopener noreferrer nofollow"><em> an email you could send to your manager</em></a><em>.</em></p>]]></content:encoded></item><item><title><![CDATA[Downdetector and the real cost of no upstream dependencies]]></title><description><![CDATA[During the Cloudflare outage, Downdetector was also unavailable. I got details from the team about why they have a hard dependency on Cloudflare, and why that won’t change anytime soon.]]></description><link>https://blog.pragmaticengineer.com/downdetector-and-the-real-cost-of-no-upstream-dependencies/</link><guid isPermaLink="false">6932a20b097ffa00013da35c</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Fri, 05 Dec 2025 09:14:50 GMT</pubDate><content:encoded><![CDATA[<p><em>The below is one out of five topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>The Pulse #154.</em></a><em> Full subscribers received the below article two weeks ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em><u>subscribe here</u></em></a><em>.</em></p><p><em>Many subscribers expense The Pragmatic Engineer Newsletter to their learning and development budget. 
If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/"><em><u> an email you could send to your manager</u></em></a><em>.</em></p><hr><p>One amusing detail of the <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noreferrer">November 2025 Cloudflare outage</a> is that the realtime outage and monitoring service, Downdetector, went down, revealing a key dependency on Cloudflare. At first, this looks odd; after all, Downdetector is about monitoring uptime, so why would it take on a key dependency like Cloudflare if it means this can happen?</p><p><strong>Downdetector was built multi-region and multi-cloud,</strong>&#xA0;which<strong>&#xA0;</strong>I confirmed by talking with Senior Director of Engineering,&#xA0;<a href="https://x.com/damndhruv?ref=blog.pragmaticengineer.com">Dhruv Arora</a>, at Ookla, the company behind Downdetector. Multi-cloud resilience makes little sense for most products, but Downdetector was built to detect cloud provider outages, as well. And for this, they needed to be multi-cloud!</p><p>Still, Downdetector uses Cloudflare for DNS, Content Delivery (CDN), and Bot Protection. So, why would it take on this one key dependency, as opposed to hosting everything on its own servers?</p><p><strong>A CDN has advantages that are hard to ignore,&#xA0;</strong>such as:</p><ul><li>Drastically lower bandwidth costs &#x2013; assets cached on the CDN are much faster</li><li>Faster load times because assets on a CDN are served from Edge nodes nearer users</li><li>Protection from sudden traffic spikes, as would be common for Downdetector, especially during outages! 
Without a CDN, those spikes could overload their services</li><li>DDoS protection against bad actors trying to take the site offline with a distributed denial of service attack</li><li>Reduced infrastructure requirements, as Downdetector can run on fewer servers</li></ul><p>Downdetector&#x2019;s usage patterns reflect that it&#x2019;s a service very heavily used by consumers whom the business doesn&#x2019;t really monetize (Downdetector is free to use). So, Downdetector could get rid of Cloudflare, but costs would surge, the site would become slower to load, and revenue wouldn&#x2019;t change.</p><p>In the end, Downdetector&#x2019;s dependence on Cloudflare looks like a pragmatic choice based on the business model, and on how expensive removing its upstream dependency on Cloudflare could get!</p><p>Dhruv confirmed this and shared more about the design choices at Downdetector:</p><blockquote>&#x201C;<strong>Building redundancy at the DNS &amp; CDN layers would require enormous overhead.</strong>&#xA0;This is especially true as Cloudflare&#x2019;s Bot Protection is world-class, and building similar functionality would be a lot of effort. There are hyperscalers [cloud providers] that have this kind of redundancy built in. We will look into what we can do, but with a team size in the double digits, building up a core piece of infra like this is a pretty tall order: not just for us, but for any mid-sized team.<br><br>We&#x2019;ve learned that there are more things that we can improve, for the future. For example, during the outage, the Cloudflare control plane was down, but their API wasn&#x2019;t. So, us having more Infrastructure as Code could have helped bring back Downdetector sooner.<br><br>On our end, we also noticed that the outage wasn&#x2019;t global, so we were able to shift traffic around and reduce the impact.<br><br>One more interesting detail: Cloudflare&#x2019;s Bot Protection went haywire during the outage, and started to block legitimate traffic. 
So, our team had to turn that off temporarily&#x201D;.</blockquote><p>Thanks very much to Dhruv and the Downdetector team for sharing details.</p>]]></content:encoded></item><item><title><![CDATA[A startup in Mongolia translated my book]]></title><description><![CDATA[A 30-person startup called Nasha Tech translated The Software Engineer's Guidebook for the benefit of their company and the Mongolian tech ecosystem.]]></description><link>https://blog.pragmaticengineer.com/traveling-to-mongolia/</link><guid isPermaLink="false">69206cafc3b7150001d419bf</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Fri, 21 Nov 2025 13:47:17 GMT</pubDate><content:encoded><![CDATA[<p>I published <a href="https://www.engguidebook.com/?ref=blog.pragmaticengineer.com" rel="noreferrer">The Software Engineer&apos;s Guidebook</a> two years ago. <em> I shared more details on how I self-published the book, and the learnings from publishing </em><a href="https://newsletter.pragmaticengineer.com/p/the-software-engineers-guidebook?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>in this post.</em></a></p><p>An unexpected highlight of publishing the book was ending up in Mongolia in June of this year, at a small-but-mighty startup called <a href="https://nashatech.com/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Nasha Tech</a>. This was because the startup translated my book into Mongolian. 
Here&apos;s the completed book:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/Screenshot-2025-11-21-at-15.34.01.png" class="kg-image" alt loading="lazy" width="1078" height="1292" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/Screenshot-2025-11-21-at-15.34.01.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/Screenshot-2025-11-21-at-15.34.01.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/Screenshot-2025-11-21-at-15.34.01.png 1078w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The Software Engineer&apos;s Guidebook, in Mongolian. You can </span><a href="https://internom.mn/%D0%B1%D0%B0%D1%80%D0%B0%D0%B0/9789919053185-%D1%81%D0%BE%D1%84%D1%82%D0%B2%D1%8D%D0%B9%D1%80-%D0%B8%D0%BD%D0%B6%D0%B5%D0%BD%D0%B5%D1%80%D0%B8%D0%B9%D0%BD-%D1%85%D3%A9%D1%82%D3%A9%D1%87-%D0%BD%D0%BE%D0%BC?ref=blog.pragmaticengineer.com" rel="noreferrer"><span style="white-space: pre-wrap;">buy this translation here</span></a></figcaption></figure><p>Here&#x2019;s what happened:</p><p>A little over a year ago, a small startup from Mongolia reached out, asking if they could translate the book. I was skeptical it would happen because the unit economics appeared pretty unfavorable. Mongolia&#x2019;s population is 3.5 million; much smaller than other countries where professional publishers had offered to do a translation (Taiwan: 23M, South Korea: 51M, Germany: 84M, Japan: 122M, China: 1.43B people).</p><p>But I agreed to the initiative, and expected to hear nothing back. To my surprise, nine months later the translation was ready, and the startup printed 500 copies on the first run. 
They invited me to a book signing in the capital city of Ulaanbaatar, and soon I was on my way to meet the team, and to understand why a small tech company translated my book!</p><h3 id="japanese-startup-vibes-in-mongolia">Japanese startup vibes in Mongolia</h3><p>The startup behind the translation is called <a href="https://nashatech.com/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Nasha Tech</a>; a mix of a startup and a digital agency. Founded in 2018, its main business has been agency work, mainly for companies in Japan. They are a group of 30 people, mostly software engineers.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image-1.png" class="kg-image" alt loading="lazy" width="1086" height="1264" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/image-1.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/image-1.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image-1.png 1086w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Nasha Tech&#x2019;s offices in Ulaanbaatar, Mongolia</span></figcaption></figure><p>Their offices resembled a mansion more than a typical workplace, and everyone takes their shoes off when arriving at work and switches to &#x201C;office slippers&#x201D;. I encountered the same vibe later <a href="https://newsletter.pragmaticengineer.com/i/177384640/cursor-push-for-release?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">at Cursor&#x2019;s headquarters in San Francisco</a>, in the US.</p><p>Nasha Tech found a niche of working for Japanese companies thanks to one of its cofounders studying in Japan, and building up connections while there. 
Interestingly, another cofounder later moved to Silicon Valley, and advises the company from afar.</p><p><strong>The business builds the &#x201C;Uber Eats of Mongolia&#x201D;. </strong>Outside of working as an agency, Nasha Tech builds its own products. The most notable is called TokTok, the &#x201C;UberEats of Mongolia&#x201D;, which is the leading food delivery app in the capital city. The only difference between TokTok and other food delivery apps is scale: the local market is smaller than in some other cities. At a few thousand orders per day, it might not be worthwhile for an international player like Uber or Deliveroo to enter the market.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image-2.png" class="kg-image" alt loading="lazy" width="1456" height="646" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/image-2.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/image-2.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image-2.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">The </em></i><a href="https://www.toktok.mn/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">TokTok</em></i></a><i><em class="italic" style="white-space: pre-wrap;"> app: a customer base of 800K, 500 restaurants, and 400 delivery riders</em></i></figcaption></figure><p>The tech stack Nasha Tech typically uses:</p><ul><li>Frontend: React / Next, Vue / Nuxt, TypeScript, Electron, Tailwind, Element UI</li><li>Backend and API: NodeJS (Express, Hono, Deno, NestJS), Python (FastAPI, Flask), Ruby on Rails, PHP (Laravel), GraphQL, Socket, 
Recoil</li><li>Mobile: Flutter, React Native, Fastlane</li><li>Infra: AWS, GCP, Docker, Kubernetes, Terraform</li><li>AI &amp; ML: GCP Vertex, AWS Bedrock, Elasticsearch, LangChain, Langfuse</li></ul><p>AI tools are widely used, and today the team uses Cursor, GitHub Copilot, Claude Code, OpenAI Codex, and Junie by JetBrains.</p><p><strong>I detected very few differences between Nasha Tech and other &#x201C;typical&#x201D; startups I&#x2019;ve visited, in terms of the vibe and tech stack. </strong>Devs working on TokTok were very passionate about how to improve the app and reduce the tech debt accumulated by prioritizing the launch. The main difference for me was the language and target market: the main language in the office is, obviously, Mongolian, and the products they build like TokTok also target the Mongolian market, or the Japanese one when working with clients.</p><p>One thing I learned was that awareness about the latest tools has no borders: back in June, a dev at Nasha Tech was already telling me that Claude Code was their daily driver, even though the tool had been out for barely a month at that point!</p><h3 id="why-translate-the-book-into-mongolian">Why translate the book into Mongolian?</h3><p>Nasha Tech was the only organization that was not a book publisher to express interest in translating the book. But why did they do it?</p><p>I was told the idea came from software engineer <a href="https://x.com/ssuuribaatar?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Suuribaatar Sainjargal</a>, who bought and enjoyed the English-language version. He <a href="https://x.com/GergelyOrosz/status/1937160382600343964?s=20&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">suggested</a> translating the book so that everyone at the company could read it, not only those fluent in English.</p><p>Nasha Tech actually had some in-house experience of translation. 
A year earlier, in 2024, the company translated Matt Mochary&#x2019;s <a href="https://www.amazon.com/Great-CEO-Within-Tactical-Building-ebook/dp/B07ZLGQZYC?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">The Great CEO Within</a> as a way to uplevel their leadership team, and to help the broader Mongolian tech ecosystem.</p><p>Also, the company&#x2019;s General Manager, <a href="https://www.linkedin.com/in/battsengel/?originalSubdomain=mn&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Batutsengel Davaa</a>, happened to have been involved in translating more than 10 books in a previous role. He took the lead in organizing this work, and here&#x2019;s how the timelines played out:</p><ul><li>Professional translator: 3 months</li><li>Technical editor revising the draft translation: 1 month</li><li>Technical editing #2 by a Support Engineer in Japan: 2 months</li><li>Technical revision: 15 engineers at Nasha Tech revised the book, with a &#x201C;divide and conquer&#x201D; approach: 2 months</li><li>Final edit and print: 1 month</li></ul><p>This was a real team effort. Somehow, this startup managed to produce a high-quality translation in around the same time as it took professional book publishers in my part of the world to do the same!</p><p>A secondary goal that Nasha Tech had was to advance the tech ecosystem in Mongolia. There&#x2019;s understandably high demand for books in the mother tongue; I observed a number of book stands selling these books, and book fairs are also popular. 
The translation of my book has been selling well; you can <a href="https://internom.mn/%D0%B1%D0%B0%D1%80%D0%B0%D0%B0/9789919053185-%D1%81%D0%BE%D1%84%D1%82%D0%B2%D1%8D%D0%B9%D1%80-%D0%B8%D0%BD%D0%B6%D0%B5%D0%BD%D0%B5%D1%80%D0%B8%D0%B9%D0%BD-%D1%85%D3%A9%D1%82%D3%A9%D1%87-%D0%BD%D0%BE%D0%BC?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">buy the book</a> for 70,000 MNT (~$19).</p><h3 id="book-signing-and-the-mongolian-startup-scene">Book signing and the Mongolian startup scene</h3><p>The book launch event was at Mongolia&#x2019;s startup hub, called <a href="https://digitalnomad.itpark.mn/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">IT Park</a>, which offers space for startups to operate in. I met a few startups working in the AI and fintech spaces &#x2013; and even one startup producing comics.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image-3.png" class="kg-image" alt loading="lazy" width="1378" height="1184" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/image-3.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/image-3.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image-3.png 1378w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Book launch event, and meeting startups inside Mongolia&#x2019;s IT Park</span></figcaption></figure><p>I had the impression that the government and private sector are investing heavily in startups, and want to help more companies to become breakout success stories:</p><ul><li><a href="https://digitalnomad.itpark.mn/ds_in_mongolia?ref=blog.pragmaticengineer.com#ds" rel="noopener noreferrer nofollow">IT Park report</a>: the country&#x2019;s tech sector is growing 
~20%, year-on-year. The <em>combined</em> valuation of all startups in Mongolia is $130M today. <em>It&#x2019;s worth remembering that location is important for startups: being in hubs like the US, UK, and India confers advantages that can be reflected in valuations.</em></li><li><a href="https://www.jica.go.jp/overseas/mongolia/sjp04ove1698/__icsFiles/afieldfile/2024/08/28/Summary.pdf?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Mongolian Startup Ecosystem Report 2023</a>: the average pre-seed valuation of a startup in Mongolia is $170K, seed valuation at $330K, and Series A valuation at $870K. The numbers reflect market size; for savvy investors, this could also be an opportunity to invest early. I met a Staff Software Engineer at the book signing event who is working in Silicon Valley at Google, and invests in and advises startups in Mongolia.</li><li><a href="https://drive.google.com/file/d/1Ath-eOMd4Kr924cq1AkgLekfeJlXCBfd/view?usp=sharing&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Mongolian startup ecosystem Map</a>: better-known startups in the country.</li></ul><p>Two promising startups from Mongolia: <a href="https://chimege.com/en/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Chimege</a> (an AI+voice startup) and <a href="https://and.global/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">AND Global</a> (fintech). Thanks very much to the <a href="https://nashatech.com/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Nasha Tech team</a> for translating the book &#x2013; keep up the great work!</p>]]></content:encoded></item><item><title><![CDATA[The Pulse: Cloudflare takes down half the internet – but shares a great postmortem]]></title><description><![CDATA[A database permissions change ended up knocking Cloudflare’s proxy offline. Pinpointing the root cause was tricky – but Cloudflare shared a detailed postmortem. 
Also: announcing The Pragmatic Summit]]></description><link>https://blog.pragmaticengineer.com/the-pulse-cloudflare-takes-down-half-the-internet/</link><guid isPermaLink="false">691f7b63e9904f00015006db</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 20 Nov 2025 20:36:19 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of five topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com"><em>this week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the below article seven days ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em>subscribe here</em></a><em>.</em></p><p><em>Many subscribers expense this newsletter to their learning and development budget. If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/"><em> an email you could send to your manager</em></a><em>.</em></p><hr><p>Before we start: I&#x2019;m excited to share something new: <strong>The Pragmatic Summit.</strong></p><p>Four years ago, The Pragmatic Engineer started as a small newsletter: me writing about topics relevant for engineers and engineering leaders at Big Tech and startups. 
Fast forward to today, and the newsletter <a href="https://newsletter.pragmaticengineer.com/p/one-million?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">crossed one million readers</a>, and the publication expanded with <a href="https://newsletter.pragmaticengineer.com/podcast?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">a podcast</a> as well.</p><p>One thing that was always missing: meeting in person. Engineers, leaders, founders&#x2014;people who want to meet others in this community, and learn from each other. Until now that is:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/TPS_Social_RegLive_1200x627_110625.png" class="kg-image" alt loading="lazy" width="1200" height="627" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/TPS_Social_RegLive_1200x627_110625.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/TPS_Social_RegLive_1200x627_110625.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/TPS_Social_RegLive_1200x627_110625.png 1200w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The Pragmatic Summit. 
</span><a href="https://www.pragmaticsummit.com/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><span style="white-space: pre-wrap;">See more details and apply to attend</span></a></figcaption></figure><p>In partnership with <a href="http://statsig.com/pragmatic?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Statsig</a>, I&#x2019;m hosting the first-ever <a href="https://www.pragmaticsummit.com/?utm_source=the-pragmatic-engineer&amp;utm_medium=newsletter&amp;utm_campaign=nov-20-paid-edition" rel="noopener noreferrer nofollow"><strong>Pragmatic Summit</strong></a>. Seats are limited, and tickets are priced at $499, covering the venue, meals, and production. We&#x2019;re not aiming to make any profit from this event.</p><p><a href="https://www.pragmaticsummit.com/?ref=blog.pragmaticengineer.com">Apply to attend the Summit</a></p><p>I hope to see many of you there!</p><hr><h2 id="cloudflare-takes-down-half-the-internet-%E2%80%93-but-shares-a-great-postmortem">Cloudflare takes down half the internet &#x2013; but shares a great postmortem</h2><p>On Tuesday came another reminder about how much of the internet depends on Cloudflare&#x2019;s content delivery network (CDN), when thousands of sites went fully or partially offline in an outage that lasted 6 hours. Some of the higher-profile victims included:</p><ul><li>ChatGPT and Claude</li><li>Canva, Dropbox, Spotify</li><li>Uber, Coinbase, Zoom</li><li>X and Reddit</li></ul><p>Separately, you may or may not recall that during a different recent outage caused by AWS, Elon Musk noted on his website, X, that AWS is a hard dependency for Signal, meaning an AWS outage could take down the secure messaging service at any moment. 
In response, a dev pointed out that it is the same for X with Cloudflare &#x2013; and so it proved earlier this week, when X was broken by the Cloudflare outage.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!IN2n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9cfc94-1792-4a5e-8fb6-c1815df54ff0_1072x898.png" class="kg-image" alt loading="lazy" width="1072" height="898"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Predicting the future. Source: Mehul Mohan </em></i><a href="https://x.com/mehulmpt/status/1980382080602370144?s=20&amp;ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">on X</em></i></a></figcaption></figure><p>That AWS outage was in the company&#x2019;s us-east-1 region and <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-aws-takes-down-a-good-part?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">took down a good part of the internet</a> last month. 
AWS released incident details three days later &#x2013; unusually speedy for the e-commerce giant &#x2013; although that postmortem was high-level and we never learned <em>exactly</em> what caused AWS&#x2019;s <a href="https://newsletter.pragmaticengineer.com/i/176934094/how-dynamodb-dns-management-happens?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">DNS Enactor</a> service to slow down, triggering an unexpected race condition that kicked off the outage.</p><h3 id="what-happened-this-time-with-cloudflare">What happened this time with Cloudflare?</h3><p>Within hours of mitigating the outage, Cloudflare&#x2019;s CEO Matthew Prince shared an <a href="https://blog.cloudflare.com/18-november-2025-outage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">unusually detailed report </a>of what exactly went wrong. The root cause was to do with propagating a configuration file to Cloudflare&#x2019;s Bot Management module. The file crashed Bot Management, which took Cloudflare&#x2019;s proxy functionality offline.</p><p>Here&#x2019;s a brief overview of how Cloudflare&#x2019;s proxy layer works at a high level. It&#x2019;s the layer that protects the &#x201C;origin&#x201D; resources of customers &#x2013; minimizing network traffic to them by blocking malicious requests and caching static resources in Cloudflare&#x2019;s CDN:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!esOT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F132ad7a8-2c1d-4be1-8174-295941979ceb_1420x1312.png" class="kg-image" alt loading="lazy" width="1420" height="1312"><figcaption><i><em class="italic" style="white-space: pre-wrap;">How Cloudflare&#x2019;s proxy works. 
More details on </em></i><a href="https://blog.cloudflare.com/20-percent-internet-upgrade/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Cloudflare&#x2019;s engineering blog</em></i></a></figcaption></figure><p>Here&#x2019;s how the incident unfolded:</p><p><strong>A database permissions change in </strong><a href="https://en.wikipedia.org/wiki/ClickHouse?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>ClickHouse</strong></a><strong> kicked things off. </strong>Before the permissions changed, all queries to fetch feature metadata (to be used by the Bot Management module) ran only on distributed tables in ClickHouse, in a database called &#x201C;default&#x201D; which contains 60 features.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!NEwO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f62c0a-5772-45a3-9be1-24e7c15c4e7b_1264x264.png" class="kg-image" alt loading="lazy" width="1264" height="264"><figcaption><span style="white-space: pre-wrap;">Before the permissions change: about 60 features were returned and fed to the Bot Management module</span></figcaption></figure><p>Until then, these queries ran using a shared system account. Cloudflare&#x2019;s engineering team wanted to improve system security and reliability, and move from this shared system account to individual user accounts. 
User accounts already had access to another database called &#x201C;r0&#x201D;, so the team made the database permission change for access to r0 to be <em>implicit</em> instead of explicit.</p><p>As a side effect of this, the same query collecting the features to be passed to Bot Management started to fetch from the r0 database, and return many more features than expected:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!p5bm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e62f91e-7078-4b9d-8e2f-3b3fb357aef5_1220x252.png" class="kg-image" alt loading="lazy" width="1220" height="252"><figcaption><span style="white-space: pre-wrap;">After the permissions change: the query did not change but returned twice as many results</span></figcaption></figure><p><strong>The Bot Management module does not allow loading of more than 200 features. </strong>This limit was well above the production usage of 60, and was put in place for performance reasons: the Bot Management module pre-allocates memory for up to 200 features, and it will not operate with more than this number.</p><p><strong>A </strong><a href="https://en.wikipedia.org/wiki/Kernel_panic?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>system panic</strong></a><strong> hit machines served with the incorrect feature file. 
</strong>Cloudflare was nice enough to share the exact code that caused this panic, which was this unwrap() call:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!qih4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8462b639-2c4c-4c8d-91b2-a468f97d7ee4_1606x666.png" class="kg-image" alt loading="lazy" width="1456" height="604"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Source: </em></i><a href="https://blog.cloudflare.com/18-november-2025-outage/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Cloudflare</em></i></a></figcaption></figure><p>What likely happened:</p><ul><li>The append_with_names() function likely checked for a limit of 200 features</li><li>If it saw more than 200 features, it likely returned an error</li><li>&#x2026; and when writing the code, it was not expected that append_with_names() would return an error&#x2026;</li><li>&#x2026; and so .unwrap() panicked and crashed the system!</li></ul><p><strong>Edge nodes started to crash, one by one, seemingly randomly. </strong>The feature file was being generated every 5 minutes, and gradually rolled out to Edge nodes. So, initially, it was only a few nodes that crashed, and then over time, more became non-responsive. At one point, both good and bad configuration files were being distributed, making failed nodes that received the good configuration file start working &#x2013; for a while!</p><h3 id="why-so-long-to-find-the-root-cause">Why so long to find the root cause?</h3><p>It took Cloudflare engineers unusually long &#x2013; 2.5 hours! &#x2013; to figure all this out, and to realize that an incorrect configuration file propagating to Edge servers was to blame for their proxy going down. 
Turns out, an unrelated failure made the Cloudflare team suspect that they were under a coordinated botnet attack, as when a few of the Edge nodes started to go offline, the company&#x2019;s status page did, too:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!Xa8F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F565ff3fa-112f-4500-940a-4f3f241991fd_1999x478.png" class="kg-image" alt loading="lazy" width="1456" height="348"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Cloudflare&#x2019;s status page went offline when the outage started. Source: </em></i><a href="https://blog.cloudflare.com/18-november-2025-outage/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Cloudflare</em></i></a></figcaption></figure><p>The team tried to gather details about the attack, but there was no attack, meaning they wasted time looking in the wrong place. In reality, the status page going down was a coincidence and unrelated to the outage. But it&#x2019;s easy to see why their first reaction was to figure out if there was a distributed denial of service (DDoS) attack.</p><p>As mentioned, it eventually took 2.5 hours to pinpoint the incorrect configuration files as the source of the outage, and another hour to stop the propagation of new files, and create a new and correct file, which was deployed 3.5 hours after the start of the incident. 
Cleanup took another 2.5 hours, and at 17:06 UTC, the outage was resolved, ~6 hours after it started.</p><p>Cloudflare shared a detailed review of the incident and learnings, which can be <a href="https://blog.cloudflare.com/18-november-2025-outage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">read here.</a></p><h3 id="how-did-the-postmortem-come-so-fast">How did the postmortem come so fast?</h3><p>One thing that keeps surprising me about Cloudflare is that they have a very detailed postmortem up less than 24 hours after an incident is resolved. Cofounder and CEO Matthew Prince <a href="https://news.ycombinator.com/user?id=eastdakota&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">explained</a> how this was possible:</p><ul><li>Matthew was part of the outage call.</li><li>After the outage was resolved, he wrote a first version of the incident review, at home. Matthew was in Lisbon, in Cloudflare&#x2019;s European HQ, so this was early evening</li><li>The team circulated a Google Doc with this initial writeup, and questions that needed to be reviewed</li><li>In a few hours, all questions were answered</li><li>Matthew: &#x201C;None of us were happy [about the incident] &#x2014; we were embarrassed by what had happened &#x2014; but we declared it [the postmortem] true and accurate.&#x201D;</li><li>Sent the draft over to the SF team, who did one more sweep, then posted it</li></ul><p>Talk about moving with the speed of a startup, despite being a publicly traded company!</p><h3 id="learnings">Learnings</h3><p>There is much to learn from this incident, such as:</p><p><strong>Be explicit about logging errors when you raise them! </strong>Cloudflare could probably have identified the root cause of this error much faster if the line of code that returned an error also logged the error, and if Cloudflare had alerts set up for when certain errors spiked on its nodes. 
That could easily have shaved an hour or two off the time it took to mitigate.</p><p>Of course, logging errors before throwing them is extra work, but when done with monitoring or log analysis, it can help find the source of errors much faster.</p><p><strong>Global database changes are always risky. </strong>You never know what part of the system you might hit. The incident started with a seemingly innocuous database permissions change that impacted a wide range of queries. Unfortunately, there is no good way to test the impact of such changes (if you know one, please leave a comment below!)</p><p>Cloudflare was making the right kind of change by removing shared system accounts; it&#x2019;s a good direction to go in for security and reliability. It was extremely hard to predict the change would end up taking down a part of their system &#x2013; and the web.</p><p><strong>Two things going wrong at the same time can really throw an engineering team. </strong>If Cloudflare&#x2019;s status page had not gone offline, the engineering team would surely have pinpointed the problem much faster than they did. But in the heat of the moment, it&#x2019;s easy to assume that two small outages are connected, until there&#x2019;s evidence that they&#x2019;re not. Cloudflare is a service that&#x2019;s continuously under attack, so the engineering team can&#x2019;t be blamed for assuming it might be more of the same.</p><p><strong>CDNs are the backbone of the internet, and this outage doesn&#x2019;t change that. </strong>The outage hit lots of large businesses, resulting in lost revenue for many. 
But could affected companies have prepared better for Cloudflare going down?</p><p>The problem is that this is hard: using a CDN means taking on a <em>hard</em> dependency in order to reduce traffic on your own servers (the origin servers), while serving internet users faster and more cheaply:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!54wJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dca2f86-18b2-4ba8-8fd2-bc7236b330db_1194x280.png" class="kg-image" alt loading="lazy" width="1194" height="280"><figcaption><span style="white-space: pre-wrap;">A CDN is a common way to reduce traffic to servers and serve webpages and APIs faster to users</span></figcaption></figure><p>When using a CDN, you publish addresses that point to the CDN&#x2019;s IP or domain. When the CDN goes down, you could start to redirect traffic to your own origin servers (and deal with the traffic spike), or utilize a backup CDN, if you prepared for this eventuality.</p><figure class="kg-card kg-image-card"><img src="https://substackcdn.com/image/fetch/$s_!fj68!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ef266a-4a28-429b-9d01-52a34e03eae0_1248x774.png" class="kg-image" alt loading="lazy" width="1248" height="774"></figure><p>Both of these are expensive to pull off:</p><ul><li>Redirecting to the origin servers likely means needing to suddenly scale up backend infrastructure</li><li>Having a backup CDN means there must be a contract and payment for a CDN partner that will most likely sit idle. 
As and when it is needed, you must switch over and warm up their cache: it&#x2019;s a lot of effort and money to do this!</li></ul><p>A case study in the trickiness of dealing with a CDN going offline is the story of Downdetector, including inside details on why Downdetector went down during Cloudflare&#x2019;s latest outage, and what they learned from it.</p><hr><p><em>This was one out of the five topics covered in this week&#x2019;s The Pulse. </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>The full edition</em></a><em> additionally covers:</em></p><ol><li><strong>Downdetector &amp; the real cost of no upstream dependencies.</strong> During the Cloudflare outage, Downdetector was also unavailable. I got details from the team about why they have a hard dependency on Cloudflare, and why that won&#x2019;t change anytime soon.</li><li><strong>Antigravity: Google&#x2019;s new AI IDE &#x2013; that its devs cannot use. </strong>Google wants to become a serious player in AI coding tools, but Antigravity contains remnants of Windsurf. Interestingly, devs at Google aren&#x2019;t allowed to use Antigravity for work</li><li><strong>Industry pulse.</strong> Gemini 3 launch, Anthropic valued at $350B, Jeff Bezos funds an AI company, and unusually slow headcount growth at startups persists.</li><li><strong>Five AI fakers caught in 1 month by crypto startup. 
</strong>Candidates who fake their backgrounds and change their looks in remote interviews continue to plague companies hiring full-remote &#x2013; especially crypto startups.</li></ol><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>Read the full The Pulse</strong></a></p>]]></content:encoded></item><item><title><![CDATA[Four years on writing a tech book: pitching to a publisher]]></title><description><![CDATA[<p>In 2019, I decided to write a book about software engineering. As an experienced software engineer and manager, I had the topic clear in my head, and assumed the whole project would take between six and 12 months in writing and publishing it.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image.png" class="kg-image" alt loading="lazy" width="1456" height="866" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/image.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/image.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The first proof copy of</span><a href="https://www.engguidebook.com/?ref=blog.pragmaticengineer.com" rel="noreferrer"><span style="white-space: pre-wrap;"> The Software</span></a></figcaption></figure>]]></description><link>https://blog.pragmaticengineer.com/four-years-on-writing-a-tech-book-pitching-to-a-publisher/</link><guid isPermaLink="false">69130a5abb6a4e00013466cc</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Tue, 11 Nov 2025 10:45:43 GMT</pubDate><content:encoded><![CDATA[<p>In 2019, I decided to write a book about software 
engineering. As an experienced software engineer and manager, I had the topic clear in my head, and assumed the whole project would take between six and 12 months in writing and publishing it.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image.png" class="kg-image" alt loading="lazy" width="1456" height="866" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/image.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/image.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/image.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The first proof copy of</span><a href="https://www.engguidebook.com/?ref=blog.pragmaticengineer.com" rel="noreferrer"><span style="white-space: pre-wrap;"> The Software Engineer&#x2019;s Guidebook</span></a><span style="white-space: pre-wrap;"> &#x2013; hence the &#x201C;not for resale&#x201D; markup</span></figcaption></figure><p>In the end, this process took several times longer; 4 years, in fact! Happily, it was worth it: readers&#x2019; feedback about <a href="https://www.engguidebook.com/?ref=blog.pragmaticengineer.com"><strong><u>The Software Engineer&#x2019;s Guidebook</u></strong></a> has been overwhelmingly positive, and on launch, the book became a <a href="https://twitter.com/GergelyOrosz/status/1723205530481729838?ref=blog.pragmaticengineer.com"><u>#1 bestseller</u></a> among all titles in two Amazon markets (the Netherlands and Poland), as well as a top 100-selling book in most Amazon markets. 
In 24 months it sold around 40,000 copies, and was translated into <a href="https://learning.oreilly.com/library/view/guidebook-fur-software/9783960092513/?ref=blog.pragmaticengineer.com"><u>German</u></a>,<a href="https://www.hanbit.co.kr/store/books/look.php?p_code=B2570473158&amp;ref=blog.pragmaticengineer.com"> <u>Korean</u></a>, <a href="https://x.com/GergelyOrosz/status/1936044091009036690?ref=blog.pragmaticengineer.com"><u>Mongolian</u></a> and<a href="https://x.com/GergelyOrosz/status/1973632590541365384?ref=blog.pragmaticengineer.com"> <u>Traditional Chinese</u></a> &#x2013; with the Japanese and Simplified Chinese versions releasing later this month.</p><p>A lot of people ask why I chose to self-publish, and it would be nice to say this was always the goal, but it wasn&#x2019;t! Originally, I wanted to work with a top tech publisher, who would get the book to market fast, and give it a higher profile. This didn&#x2019;t happen, but during the process I learned a lot about how publishing works, how to pitch a book, and how to choose which publishing route might be the right one.&#xA0;</p><p>This article shares my learnings from writing and publishing a book which has done pretty well with readers, and it includes the experience of working with an established publishing house:</p><ol><li>Tech book publishing landscape</li><li>Financials of publishing</li><li>Publishing process and the publisher&#x2019;s role</li><li>My book pitch</li><li>Working with a publisher</li><li>Breaking up with a publisher</li></ol><h2 id="1-tech-book-publishing-landscape">1. Tech book publishing landscape</h2><p>Today, there are reputable book publishers whose titles are good and authoritative, and there are other publishers to whom this doesn&#x2019;t apply. Each publisher also has a subject area: some are mainstream and publish titles about every software engineering area from languages to engineering management. 
Meanwhile, others stick to a topic of expertise they focus on.</p><p>Here&#x2019;s my mental model of the book publishing industry in 2025:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/data-src-image-0145e3b2-dbc1-475d-93bb-160c7e3a3fbe.png" class="kg-image" alt loading="lazy" width="1600" height="1356" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/data-src-image-0145e3b2-dbc1-475d-93bb-160c7e3a3fbe.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/data-src-image-0145e3b2-dbc1-475d-93bb-160c7e3a3fbe.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/data-src-image-0145e3b2-dbc1-475d-93bb-160c7e3a3fbe.png 1600w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Biggest players in the tech book publishing industry, a subjective mental model of course!</em></i></figcaption></figure><h4 id="highly-reputable-mainstream-publishers">Highly reputable mainstream publishers</h4><p>In tech book publishing, three publishing houses really stand out, in my opinion, and form a &#x2018;big three&#x2019; among all players in this sector:&#xA0;</p><ul><li><a href="https://oreilly.com/?ref=blog.pragmaticengineer.com"><strong><u>O&#x2019;Reilly</u></strong></a>: if I had to pick a #1 tech book publisher, it would be O&#x2019;Reilly. 
They publish some of the most referenced books &#x2013; like Designing Data Intensive Applications by Martin Kleppmann, <a href="https://newsletter.pragmaticengineer.com/p/dead-code-getting-untangled-and-coupling?ref=blog.pragmaticengineer.com"><u>Tidy First</u></a> by Kent Beck, <a href="https://newsletter.pragmaticengineer.com/p/the-staff-engineers-path?ref=blog.pragmaticengineer.com"><u>The Staff Engineer&#x2019;s Path</u></a> by Tanya Reilly, and more. The book covers are distinctive, using images of animals.</li><li><a href="https://www.manning.com/?ref=blog.pragmaticengineer.com"><strong><u>Manning</u></strong></a>: a broad range of titles on both specific and general tech topics, which employ historical figures on the covers.</li><li><a href="https://pragprog.com/?ref=blog.pragmaticengineer.com"><strong><u>The Pragmatic Bookshelf</u></strong></a>: also referred to as the &#x201C;Prags.&#x201D; Founded by Andy Hunt and Dave Thomas, the authors of what might be the best-selling tech book ever: The Pragmatic Programmer. Since its founding, The Prags has refused digital rights management (DRM) on their ebooks.</li></ul><h4 id="high-reputable-%E2%80%9Cmainstream%E2%80%9D-publishers-that-are-tough-to-pitch-to">Highly reputable &#x201C;mainstream&#x201D; publishers that are tough to pitch to</h4><p>The publishers in this section have strong reputations, like those above. However, they are harder to pitch to, usually because they publish fewer tech books. I couldn&#x2019;t find an author pitch template, or clear pitching instructions, which contributes to a sense of &#x201C;don&#x2019;t find us, we&#x2019;ll find you&#x201D; among the following publishing houses:&#xA0;</p><ul><li><a href="https://en.wikipedia.org/wiki/Addison-Wesley?ref=blog.pragmaticengineer.com"><strong><u>Addison-Wesley:</u></strong></a> one of the best-known brands in tech. 
It has been an imprint (a trade name under which a publisher releases books) of Pearson since 1988, and is the publisher of many &#x201C;classic&#x201D; book titles like Clean Code by Robert C. Martin, The Pragmatic Programmer by Andy Hunt and Dave Thomas, and some recent ones like Modern Software Engineering by Dave Farley. I couldn&#x2019;t find any way to pitch to this publisher, and new books they publish seem to be by established authors.</li><li><a href="https://www.pearson.com/?ref=blog.pragmaticengineer.com"><strong><u>Pearson</u></strong></a>: This business owns the Addison-Wesley imprint. Recently, it started to publish tech books as &#x201C;Pearson&#x201D; instead, as author Martin Fowler <a href="https://twitter.com/martinfowler/status/1766836423808766003?ref=blog.pragmaticengineer.com"><u>shared</u></a>.</li><li><a href="https://www.wiley.com/en-us?ref=blog.pragmaticengineer.com"><strong><u>Wiley</u></strong></a>: formerly a well-known tech book publisher behind the &#x201C;X for Dummies&#x201D; series. It publishes lots of <a href="https://www.wiley.com/en-nl/etextbooks-and-courseware/computer-science-and-technology?ref=blog.pragmaticengineer.com"><u>computer science textbooks</u></a>, but I can&#x2019;t find recently-published, well-known <em>tech books </em>for software engineers.</li><li><a href="https://www.springer.com/gp?ref=blog.pragmaticengineer.com"><strong><u>Springer</u></strong></a>: another massive publisher for whom tech books are a small part of the business. I couldn&#x2019;t find how to pitch tech books to them.</li><li><a href="https://booksite.mkp.com/?ref=blog.pragmaticengineer.com"><strong><u>Morgan Kaufmann</u></strong></a>: a well-known tech book publisher founded in 1984, and acquired in 2001 by Elsevier. As I understand, these days it prints far fewer technology books, and focuses on academic topics. 
No clear way to pitch to them.</li></ul><h4 id="highly-reputable-%E2%80%9Cniche%E2%80%9D-publishers">Highly reputable &#x201C;niche&#x201D; publishers</h4><p>The following publishers are standout in quality, covering fewer topics than those above.</p><ul><li><a href="https://nostarch.com/?ref=blog.pragmaticengineer.com"><strong><u>No Starch Press</u></strong></a>: &#x201C;The finest in geek entertainment&#x201D; is the tagline, featuring fun visuals, and high-quality content on specific technologies like machine learning, Python, JavaScript, etc.</li><li><a href="https://itrevolution.com/?ref=blog.pragmaticengineer.com"><strong><u>IT Revolution</u></strong></a>: titles for technology leaders: DevOps, technology delivery, workplace culture, and similar. Publisher of The Phoenix Project, Team Topologies, and Accelerate.</li><li><a href="https://www.artima.com/books?ref=blog.pragmaticengineer.com"><strong><u>Artima</u></strong></a>: focuses on Scala.</li><li><a href="https://www.routledge.com/go/crc-press?ref=blog.pragmaticengineer.com"><strong><u>CRC Press</u></strong></a>: publishes on technology, engineering, math, and medicine.</li><li><a href="https://press.stripe.com/?ref=blog.pragmaticengineer.com"><strong><u>Stripe Press</u></strong></a>: &#x201C;works about technological, economic, and scientific advancement.&#x201D;</li><li><a href="https://mitpress.mit.edu/?ref=blog.pragmaticengineer.com"><strong><u>MIT Press</u></strong></a>: &#x201C;a distinctive collection of influential books curated for scholars and libraries worldwide.&#x201D;</li></ul><h4 id="other-mainstream-book-publishers">Other mainstream book publishers</h4><p><a href="https://www.apress.com/?ref=blog.pragmaticengineer.com"><strong><u>Apress</u></strong></a> is a reputable publisher with a lower profile, which publishes on a wide range of topics, from specific technologies and frameworks, to more generic topics on computing. 
Because they publish many books on many topics, they are usually open to pitches.</p><p><a href="https://www.packtpub.com/?ref=blog.pragmaticengineer.com"><strong><u>Packt</u></strong></a>. A tech book publisher with a focus on quantity over quality, it feels to me. There is limited support and feedback for authors, and titles could often use more editing. But also, Packt is likely to say &#x201C;yes&#x201D; to a serious proposal.</p><h2 id="2-financials-of-publishing">2. Financials of publishing</h2><p>Financial matters really come into play when your proposal is accepted by a publisher and you receive a contract offer.</p><p><strong>Advance: $2,000 &#x2013; $5,000. </strong>An advance payment to the writer is a tried and tested way to make them deliver a completed manuscript. It&#x2019;s often paid in chunks: 50% when a milestone is hit, and 50% when a full draft appears.</p><p>The &#x201C;big three&#x201D; publishers typically offer $5,000, usually as a flat, non-negotiable rate; at least, it&#x2019;s what I was offered. Smaller publishers offer closer to $2,000 for more niche books. The advance is non-refundable; even if your book sells zero copies, you keep it. The publisher is making an investment in you, and taking a risk.</p><p><em>As an aside, if you are thinking of writing a book: I offer guest authors in The Pragmatic Engineer Newsletter a payment of $4,000 per article &#x2013; and you can later publish your guest article in a book. Several authors working on their book have written guest articles, such as Lou Franco on </em><a href="https://newsletter.pragmaticengineer.com/p/paying-down-tech-debt?ref=blog.pragmaticengineer.com"><em><u>Paying down tech debt</u></em></a><em> or Apurva Chitnis on </em><a href="https://newsletter.pragmaticengineer.com/p/thriving-as-a-founding-engineer?ref=blog.pragmaticengineer.com"><em><u>Thriving as a founding engineer</u></em></a><em>. 
Writing a guest post can help refine ideas, broaden your reach, and prove helpful when publishing the article.</em></p><h4 id="paperback-royalty-7-15"><strong>Paperback royalty:</strong> 7-15%&#xA0;</h4><p>Royalties are earned on book sales, and taken from the net price of the book. Net price is what a publisher gets after the retailer (e.g. Amazon, or a bookshop) takes their cut. Let&#x2019;s see how it works for a $40 book:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/Screenshot-2025-11-11-at-11.11.21.png" class="kg-image" alt loading="lazy" width="1396" height="1080" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/Screenshot-2025-11-11-at-11.11.21.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/Screenshot-2025-11-11-at-11.11.21.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/Screenshot-2025-11-11-at-11.11.21.png 1396w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">The royalty from a $40 book that has a 10% royalty can be anywhere from $4 to around $1.80, depending on the channel it was sold on. It all depends on how much revenue the publisher received after the sale.</em></i></figcaption></figure><p>It matters financially where your title is purchased; be it an online shop, physical book store, or purchased directly from the publisher. Many tech books are sold on Amazon and online stores. Amazon&#x2019;s 40% cut seems high, but it&#x2019;s actually the lowest among book retailers. Up to 60% is a common cut for a physical bookshop.</p><p>Most publishers offer 10-12.5% royalties, is my understanding, and Packt around 15-20%. 
Keep in mind that brand reputation plays a role; for example, Packt&#x2019;s reputation is less elevated than Manning&#x2019;s, which can make a difference to sales.</p><h4 id="ebook-royalties-10-25">Ebook royalties: 10-25%</h4><p>For ebooks, several publishers pay 25% royalties, but not all. But even with a higher royalty rate, an author might end up making less per sale. For example, on the Kindle platform, the cut for Amazon is high at 65%. Let&#x2019;s look at a $30 ebook with a 20% royalty rate:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/Screenshot-2025-11-11-at-11.12.30.png" class="kg-image" alt loading="lazy" width="1412" height="914" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/Screenshot-2025-11-11-at-11.12.30.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/Screenshot-2025-11-11-at-11.12.30.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/Screenshot-2025-11-11-at-11.12.30.png 1412w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Ebooks are cheaper, but authors can earn more with this royalty structure. Selling the Kindle version is the least profitable because Amazon takes 65% of any sale above $10</em></i></figcaption></figure><p>Ebooks are almost always priced lower than physical books. Sold on Kindle, they generate much less revenue for the author; sold through other channels, they can earn more per copy than the paperback version. <em>I was offered 10% royalties on ebook sales, which is at the low end.</em></p><h4 id="%E2%80%9Cearning-out%E2%80%9D">&#x201C;Earning out&#x201D;&#xA0;</h4><p>When an author needs to pay back an advance before being paid anything, this is called &#x201C;earning out&#x201D;. 
If you get a $5,000 advance for a title costing $40 per hard copy and $25 for the ebook version, and most sales happen on Amazon, it means:</p><ul><li>~2,080 paperback sales on Amazon</li><li>Or ~2,850 Kindle book sales</li><li>Or ~1,250 paperback sales on the publisher website</li></ul><p>The author needs to sell at least 1,000 copies across various platforms to &#x201C;earn out.&#x201D; The good news is that a publisher sends quarterly or annual royalty payments if a book keeps generating revenue, which would effectively be passive income.</p><h4 id="the-prags%E2%80%99-unique-approach">The Prags&#x2019; unique approach</h4><p>One publisher that calculates rates differently is The Pragmatic Bookshelf. Instead of offering a low-digit number on <em>revenue</em>, they offer a 50% split on <em>profit</em>.</p><p>50% on profit sounds much higher than 10% on revenue, right? However, the devil is in the details, because paying on profit means that the upfront publisher costs &#x2013; editors, cover design, printing, distribution, marketing &#x2013; all are deducted before any profit split.</p><p>Authors who have used this approach tell me the numbers end up pretty similar to the revenue model.</p><h4 id="real-world-case-studies-with-actual-earnings">Real-world case studies with actual earnings</h4><p><a href="https://martin.kleppmann.com/2020/09/29/is-book-writing-worth-it.html?ref=blog.pragmaticengineer.com"><u>Designing Data Intensive Applications</u></a> author, Martin Kleppmann, shared the cumulative royalties he made in 6 years. 
The breakdown is interesting; ebook and Safari Online sales generated more revenue for the writer than the print version.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/data-src-image-d53acdad-d884-432a-b9d0-75cb7ea68141.png" class="kg-image" alt loading="lazy" width="1400" height="800" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/data-src-image-d53acdad-d884-432a-b9d0-75cb7ea68141.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/data-src-image-d53acdad-d884-432a-b9d0-75cb7ea68141.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/data-src-image-d53acdad-d884-432a-b9d0-75cb7ea68141.png 1400w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Cumulative royalties for Designing Data Intensive Applications, published by O&#x2019;Reilly. Image source: </em></i><a href="https://martin.kleppmann.com/2020/09/29/is-book-writing-worth-it.html?ref=blog.pragmaticengineer.com"><u><i><em class="italic underline" style="white-space: pre-wrap;">Martin Kleppmann&#x2019;s site</em></i></u></a></figcaption></figure><p><a href="https://rothgar.medium.com/the-economics-of-writing-a-technical-book-689d0c12fe39?ref=blog.pragmaticengineer.com"><u>Cloud Native Infrastructure earnings</u></a>: author Justin Garrison published with O&#x2019;Reilly, and was offered 10% for print and 25% for ebooks (split in half, as he worked with a coauthor). His book sold 1,337 copies in 4 months, and made about $22,000 for the two authors (around $11,000 for Justin). Justin concluded:</p><blockquote>&#x201C;Going into this project I had a rough estimate in my head to make about $2000&#x2013;3000 so this is much better than I expected. 
Set your expectations accordingly.&#x201D;</blockquote><p><strong>Don&#x2019;t forget that publishers are also in this to make a positive return.</strong> This means that it is unlikely for a highly reputable publisher to invest into a book that they do not believe would sell at least a few thousand copies. I don&#x2019;t have the data here: but if I was a publisher, I would reject any book that didn&#x2019;t look like it could hit 1,000 copies sold in the first year of publishing.</p><h2 id="3-the-publishing-process-and-publisher-roles">3. The publishing process, and publisher roles</h2><p>Why does a publisher take so much of the revenue? Part of this is because they do a lot of the work around publishing, and need to hire (and pay!) people for those roles. Here is my understanding of how the publishing process works, based on four months of pitching to publishers; two months of working with one of them; and researching how the rest of the process works:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/data-src-image-c9be774c-598b-4bac-81a7-3562aefcf63a.png" class="kg-image" alt loading="lazy" width="1320" height="1568" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/data-src-image-c9be774c-598b-4bac-81a7-3562aefcf63a.png 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/data-src-image-c9be774c-598b-4bac-81a7-3562aefcf63a.png 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/data-src-image-c9be774c-598b-4bac-81a7-3562aefcf63a.png 1320w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">My understanding of the publishing process, when working with a publisher. 
You probably get to work with quite a few specialized folks!</em></i></figcaption></figure><p>Here are people I worked with, and my experience with them:</p><p><strong>The acquisitions editor. </strong>If you write a technical blog, you might get a reachout from someone called an acquisitions editor, who will ask if you would consider publishing a book. Also, when you submit a pitch to a publisher, chances are that you will first communicate with an acquisitions editor.</p><p>A publisher&#x2019;s goal is to publish books that will be profitable for them. They find authors who could write these books two ways:</p><ul><li>Inbound pitches coming from authors &#x2013; reviewed by editors or acquisitions editors</li><li>External reachouts done by acquisitions editors</li></ul><p>These people need to have a good understanding of what kinds of books sell well at the publisher (and why); what their current catalogue is; what the gaps are; and what competitor publishers are commissioning.</p><p>When I pitched my book to 3 respected publishers, in two cases I talked with (and worked with) the acquisitions editor to improve my pitch. The acquisitions editors were my &#x201C;champions&#x201D; at the publisher. Their goal was to get a pitch that the company <em>would</em> say yes to.</p><p><strong>The development editor </strong>works on the <em>structure</em> of the book. They ask the author to come up with a detailed table of contents &#x2013; in my case, they asked me to estimate even the length of the chapters. They also help develop &#x2013; and maintain &#x2013; the narrative of the book.</p><p>Had I not worked with a publisher, I would have had no appreciation of this &#x201C;high-level editing&#x201D; &#x2013; which, turns out, is key for writing a well-structured tech book!</p><p><strong>The project manager</strong> checks in with timelines, organizes reviews&#x2014;like editorial reviews&#x2014;and helps keep you accountable. 
One of the best things about working with a publisher is that you are on a tight deadline &#x2013; without which it would take you several times longer to publish the book!</p><p><strong>The publisher owns a lot of rights for your book! </strong>One thing that I realized only after signing with a publisher is that while publishers help a lot with writing the book &#x2013; and taking a higher cut is sensible because of this &#x2013; they also hold on to a lot of rights that impact your book! These are all things that you give up, compared with self-publishing. These are:</p><ul><li><strong>Global publishing rights</strong>. Although you are the author of the book &#x2013; and usually hold the copyright to it &#x2013; the publisher owns worldwide publishing rights. This means that they are the only ones who can publish the book, or longer excerpts of it. In practice, this means you need to get permission if you&#x2019;d like to publish some parts of your book on e.g. your blog, or social media. <em>They&#x2019;ll usually grant this as it&#x2019;s good marketing &#x2013; but it&#x2019;s still something you need to ask for, as the author.</em></li><li><strong>Foreign rights. </strong>The publisher will own the publishing right, and will usually be the one selling foreign rights. In theory, this could sound like you are losing out on things. In practice, publishers are much better positioned to sell and administer these rights. Most publishers offer a 50% cut on these rights &#x2013; it&#x2019;s what my publisher offered. <em>Also, the majority of tech books are not translated into other languages &#x2013; a book that &#x201C;only&#x201D; sells 2,000 copies in English is unlikely to sell a significant number in a non-English market!</em></li><li><strong>The cover. 
</strong>The publisher decides what cover they will design, though they tend to check with the author for feedback.</li><li><strong>The title.</strong> One of the surprises for me was how the publisher <em>ultimately</em> decides on the title and subtitle.</li></ul><p>In short: this book is owned by the publisher. You are the author, but they are the only ones who can distribute it. In practice, many authors would prefer to have it this way &#x2013; because all the work related to distributing the book is taken on by the publisher. However, it&#x2019;s good to know that you need to give up all the above when working with a publisher.</p><h2 id="4-my-book-pitch">4. My book pitch</h2><p>My secret hope, back in 2019, was to get a contract with one of the &#x201C;Big 3&#x201D; tech book publishers: O&#x2019;Reilly, Manning or The Prags. I pitched my book to all three: got a &#x201C;no&#x201D; from two, but a &#x201C;yes&#x201D; (and a contract) from one. Here&#x2019;s how I went about my pitch.</p><h4 id="write-a-%E2%80%9Cone-pager%E2%80%9D-about-your-book">Write a &#x201C;one-pager&#x201D; about your book</h4><p>What will this book be about? Who is it for? What will readers take away when reading it? Answer these in a short pitch, <em>before</em> even seeking out publishers. Here&#x2019;s what I put together as my &#x201C;one-pager:&#x201D;</p><h4 id="do-some-market-research">Do some market research</h4><p>What are similar books in the market that would be competing with this book, directly or indirectly? How is this book different from them?</p><p>What is the demographic of people who would be interested in buying this book? Can you estimate how large this crowd is? Realistically, what percentage of this group could be interested in buying the book &#x2013; assuming they know about it? 
<em>Don&#x2019;t forget that publishers will invest in books that can generate decent sales: it&#x2019;s good to do a little research to help confirm your title could be one of these!</em></p><h4 id="shortlist-publishers-you-would-be-interested-working-with">Shortlist publishers you would be interested in working with</h4><p>While there are quite a few publishers out there: what are your top preferences? And what are ones you&#x2019;re willing to consider, even if your &#x201C;top&#x201D; choices turn you away?</p><p>Self-publishing is always an option (I&#x2019;ll cover more on how I went about this in later parts). However, going with a good publisher can significantly speed up your book production, while also improving the quality.</p><h4 id="write-a-draft-table-of-contents-and-a-draft-chapter">Write a draft table of contents and a draft chapter</h4><p>Some publishers will want to look at what a draft chapter will look like &#x2013; but not all of them. Still, I found it helpful to do some writing before submitting to a publisher. If for no other reason, this was to confirm that I&#x2019;d enjoy longform writing!</p><p>I spent about a week putting together a table of contents, and around four months writing drafts of chapters. These chapters turned out to be helpful later on.</p><h4 id="submit-a-tailored-pitch-your-the-publishers">Submit a tailored pitch to your publisher(s)</h4><p>Once you have identified your top publisher choices, submit a pitch. Most book publishers have a pitch document they want you to follow. 
Here are common ones:</p><p><a href="https://www.oreilly.com/work-with-us.html?ref=blog.pragmaticengineer.com"><u>O&#x2019;Reilly&#x2019;s pitch template:</u></a></p><ul><li>Description</li><li>About the topic</li><li>Audience</li><li>Keywords</li><li>Competing titles</li><li>Related O&#x2019;Reilly titles</li><li>Book outline</li><li>Writing schedule</li></ul><p><a href="https://www.manning.com/write-for-us?ref=blog.pragmaticengineer.com"><u>Manning&#x2019;s pitch template:</u></a></p><ul><li>About the author</li><li>About the book topic</li><li>The book plan</li><li>Q&amp;A</li><li>Reader overview</li><li>Book competition</li><li>Book length and illustrations</li><li>Writing schedule</li><li>Table of contents</li></ul><p><a href="https://pragprog.com/publish-with-us/resources/PragProg_Proposal_Template.txt?ref=blog.pragmaticengineer.com"><u>The Pragmatic Bookshelf template:</u></a></p><ul><li>Overview</li><li>Outline</li><li>Bio</li><li>Competing books</li><li>PragProg books</li><li>Market size</li><li>Promotional ideas</li><li>Writing samples</li></ul><p>Most of these templates ask for similar content, so if you have completed one pitch, the others are much easier. Here are some tips I&#x2019;d have for building a pitch.</p><p><strong>Put yourself in the shoes of the publisher. </strong>This book is a <em>huge</em> deal to you: but it&#x2019;s just one of the dozens that the publisher will publish <em>just</em> this year. You want to write an <em>amazing</em> book: but the publisher wants to publish one that <em>will sell</em>.</p><p>And these are major differences! The publisher will care very much about competition for the book, and how their existing titles relate to it. 
Like a VC firm, a publisher will not want to fund two investments competing in the exact same market: so if the publisher recently published a book that is a deep dive on Go, they will almost certainly pass on the next one, no matter how good your pitch is.</p><p><strong>Pitching to several publishers in parallel is totally fine and you should do it! </strong>This is one thing I wish I&#x2019;d done differently.<strong> </strong>In my mind, I was 100% certain that my first publisher-of-choice would jump on the opportunity to publish this book. I thus felt that it would be &#x201C;unfair&#x201D; if I pitched to other publishers, without hearing back.</p><p>In hindsight, as a first-time author, this strategy was a waste of time on my end. Most publishers are unlikely to take a risk on a first-time author with no books published in the past &#x2013; like I was in 2019. And so the likely outcome is rejection in most cases.</p><p>In my case, I spent about two and a half months waiting on the response from this first publisher. My acquisitions editor was championing the book &#x2013; making the case for the publisher to offer a contract &#x2013; but in the end, the publisher chose another book with a similar topic that was in their pipeline. This made perfect business sense for them &#x2013; but for me, it meant I spent months waiting, instead of pitching the book to other publishers!</p><p><strong>My book pitch ended up being a helpful resource on my self-publishing journey. </strong>Even though I did not release with a publisher: pitching to publishers helped the book become an eventual success. It was for these reasons:</p><ul><li><strong>Defining the structure.</strong> I had my table of contents well thought-out by the time I submitted the pitch. 
This structure changed later, but it was a solid start.</li><li><strong>Positioning the book.</strong> I had a good idea of the &#x201C;competitive&#x201D; landscape, and what books my title would &#x201C;go up against.&#x201D; It also helped me focus on how my book is different to what is already out there.</li><li><strong>Forcing me to think about marketing. </strong>The Pragmatic Bookshelf asked for a section on promotional ideas. This forced me to think about where (and how) I would promote the book &#x2013; even before getting into the thick of writing. When going with a publisher, it&#x2019;s safe to assume that the publisher&#x2019;s brand will do some marketing. However, authors will still do the lion&#x2019;s share of marketing &#x2013; and it&#x2019;s good to think about this ahead of time.</li></ul><h2 id="5-working-with-a-publisher">5. Working with a publisher</h2><p>I got lucky with one of the three publishers, in the end. This publisher was looking for a book just like mine, right at that time! What happened was one of their best sellers had to be pulled from publication, for reasons outside the control of the publisher. Apparently, when my pitch arrived, they had just started a search for a book that could plug the hole &#x2013; and they saw my book as a perfect fit for a &#x201C;software career advice&#x201D; book.</p><p>At the time, this felt like great luck. In hindsight, my relationship with the publisher might have soured exactly because they were looking for me to write <em>a specific kind of book</em> that would be similar enough to this old book &#x2013; but I had no intention of doing so. <em>More on how things went sour in the section after this one.</em></p><p>From signing the contract, I worked with a publisher for about a month &#x2013; so I&#x2019;m not exactly the most experienced on this front. 
However, a couple of things stood out as strong positives &#x2013; and things that I &#x201C;lost&#x201D; when deciding to self-publish, in the end.</p><p><strong>Strong pressure to write &#x2013; thanks to the contract. </strong>My contract had pretty strict deadlines included. We signed it on 11 January 2020, and these deadlines were part of the contract:</p><p>&#x201C;The Author shall prepare and deliver to the Publisher a machine-readable electronic copy of the manuscript for the Work, including all its illustrations, code listings, and exercises, as mutually agreed upon by the Publisher and the Author as follows:</p><p>- Not later than March 15, 2020, a partial manuscript for the Work totaling not less than one third of the planned finished Work.</p><p>- Not later than June 1, 2020, a partial manuscript for the Work totaling not less than two thirds of the planned finished Work.</p><p>- Not later than August 15, 2020, a draft of the complete manuscript for the Work suitable for review.</p><p>- Not later than September 1, 2020, the final, revised and complete manuscript for the Work acceptable to the Publisher for publication.&#x201D;</p><p>Talk about pressure! Also, my first payout was tied to reaching the first milestone &#x2013; which was delivering at least a third of the finished work. My publisher also set up regular check-ins to help me stay accountable. And this kind of pressure was good &#x2013; because without it, I would have pushed back writing, or got stuck on relatively trivial parts!</p><h2 id="6-breaking-up-with-the-publisher">6. Breaking up with the publisher</h2><p>While I greatly appreciated that a publisher took a chance on me, lots of things felt wrong from the start. 
A month into working together, I felt that things were getting worse, and not better.</p><p>The small things that I dismissed, in the beginning:</p><ul><li><strong>A (very) opinionated structure.</strong> This publisher had strongly opinionated templates I was told to use for all chapters. They required each chapter to start by stating what the reader will learn, and then to summarize this at the end of the chapter. It wasn&#x2019;t how I imagined my book to be &#x2013; but it didn&#x2019;t seem I had a choice. I figured, I&#x2019;ll give it a go. The publisher knows better after all, as they&#x2019;ve done this hundreds of times. <em>Right</em>?</li><li><strong>Needing to ask for permission to share drafts on social media.</strong> I originally planned to share screenshots of some of the parts I was writing to get feedback as I went &#x2013; and to also increase visibility of the book. I thought that this was a no-brainer. Not only does this kind of &#x201C;early sharing&#x201D; make the book better: it also makes more people excited about the book, leading to more eventual customers. To my surprise, my contact at the publisher said I would need to ask for permission to do this. Permission? For something that will market the book? Yes: because the publisher owns all publishing rights, including for the draft!</li><li><strong>I won&#x2019;t decide on what the title will be.</strong> I had strong opinions about what I&#x2019;d like the book&#x2019;s title to be. My publishing contact also had ideas on what they thought would be good to add to it &#x2013; like introducing the &#x201C;mentoring&#x201D; term either to the title or the subtitle: which was an idea I disliked. As I talked with them, it became clear that the publisher would set the final title: not me. Hmm &#x2013; odd, no? 
It&#x2019;s another reminder that, although it&#x2019;s my book: it&#x2019;s <em>really</em> the publisher&#x2019;s book, and they have the final say on all important decisions.</li><li><strong>Nudges to &#x201C;dumb down&#x201D; the book. </strong>My editor was giving more suggestions on how to edit the content to make it more &#x201C;beginner-friendly&#x201D; and suggested I introduce e.g. &#x201C;Alice and Bob&#x201D; examples to make it easier to digest the contents. <em>One of the recently best-selling books of the publisher heavily used Alice and Bob, and it seems the publisher thought it helped their sales.</em></li></ul><p><strong>The first major editorial review was where I decided we should part ways with the publisher. </strong>About a month-and-a-half in, the publisher pulled together several experienced editors, and offered suggestions on how I could improve the book. The suggestions were these:</p><ul><li><strong>Focus on reader engagement. </strong>Tell stories and develop them with emotion, mystery, aha moments, and unexpected conclusions. Tell the stories from the &quot;we&quot; or &quot;they&quot; perspective -- make stories team-oriented.</li><li><strong>Exercises.</strong> Develop exercises for use within the chapters (not just end) or a story about what happened when one person did the exercise.</li><li><strong>Mini-projects.</strong> Guide readers to discover and come to conclusions on their own (see Donald Saari story in What the Best College Teachers Do). Mini project topics: testing, architectures.</li><li><strong>Word of the day feature.</strong> Example: Dependency injection (what is it)? Scatter these across the book.</li><li><strong>Quotes.</strong> Include quotes from luminaries such as [Well-known-person 1] and [Well-known-person 2] that relate to advice given. Ask other [Publisher] authors to relate experience about how they followed similar advice and were successful.</li><li><strong>Tech map. 
</strong>Create a diagram of the current technology landscape. Example big-picture topics: architecture demystified, distributed systems demystified.</li></ul><p>While I appreciated the suggestions: I <em>hated</em> all of them. I saw what implementing them would do: they would turn this book &#x2013; where I already had reservations about the &#x201C;forced&#x201D; style &#x2013; into something I would <em>not</em> want to read. Much less write!</p><p>I envisioned writing a more matter-of-fact book that doesn&#x2019;t have exercises, &#x201C;mini projects&#x201D; or &#x201C;word of the day&#x201D; gimmicks.</p><p><strong>I sat down to reflect on why I chose to work with a publisher, to start with.</strong> As an author, I&#x2019;m giving up a lot of things: editorial control, the bulk of revenue, all publishing rights&#x2026; and for what? For the publisher to make the process easier, and for the end result book to be better than if I was working alone.</p><p>But I felt that this book would be far <em>worse</em> if I continued with my publisher: and the only way to get it back to what I envisioned was if I spent a lot of time and energy pushing back on them.</p><p>It would cost me less energy to self-publish. So I decided to terminate my agreement because it didn&#x2019;t feel like my publisher was helping me write the book that I wanted to write.</p><p><strong>My publisher was understanding and professional in terminating the contract.</strong> I explained to them that all the feedback suggested they wanted to see a very different book to what I wanted to write. And that, frankly, I am not the author to write <em>that</em> kind of book.</p><p>Truth be told, I was embarrassed that I had wasted their resources &#x2013; working with their development editor and the editing team &#x2013; for these two months. At the same time, I was vocal in telling my editor that I was hesitant about this mandated style. 
I also made the decision that there was no point in continuing at the first <em>formal</em> feedback session. I&#x2019;m not sure I could have come to this conclusion any sooner, as I was still learning how this book publisher worked, up until that point.</p><p>To show how professional this team was, this is the termination letter they sent as a signed PDF:</p><p>&#x201C;This letter is in reference to our Publishing Agreement with you for [what would become The Software Engineer&#x2019;s Guidebook] dated January 11, 2020. By mutual agreement, we are terminating the publishing contract.</p><p>Since no advance was paid to you under the terms of this contract, all rights in the content you originally submitted will hereby return to you and we will consider this matter concluded.</p><p>The decision to cancel a project is never an easy one to make. We thank you for all the efforts on this project that you made and wish you the best in your future endeavors.&#x201D;</p><p><strong>At this point, I had learned enough about publishers and myself to decide: I&#x2019;m doing it by myself. </strong>Having my book accepted by a major publisher gave external validation that there&#x2019;s a strong business case for The Software Engineer&#x2019;s Guidebook. And working with an opinionated publisher &#x2013; and continuously pushing back on styling suggestions &#x2013; made me realize that I already have my own opinionated style that I <em>like</em> using.</p><p>I did lose a very important thing by deciding to self-publish: the accountability of meeting a publishing deadline. Working with the publisher, this book would have been out in fall 2020 or spring 2021. Self-publishing, I launched it in November 2023.</p><p>One of the reasons for publishing my book two years later than it would have taken with a publisher was because I now <em>knew</em> I could no longer rely on a well-known publisher to lend my book their brand. 
For my book to have even a slim chance of being successful: I would have to compensate for the lack of being associated with a publisher, and fill the gap in marketing and awareness, leading up to the book launch.</p><p>Not having a publisher was a reason I started writing <a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><u>The Pragmatic Engineer Newsletter</u></a> in August 2021 (a year-and-a-half after breaking up with this publisher) &#x2013; and the sudden success of this newsletter gave me less time to wrap up the book. At the same time, by the time the book was ready, there were plenty of people who looked forward to reading it: and many of them were already readers of the newsletter!</p><p>I&#x2019;ll cover more about how I went about the actual self-publishing process in a follow-up article, how the book ended up selling, and other learnings. Subscribe <a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><u>to The Pragmatic Engineer</u></a> to get notified when it is out.</p><p></p>]]></content:encoded></item><item><title><![CDATA[The Pulse: Amazon layoffs – AI or economy to blame?]]></title><description><![CDATA[Amazon is doing more mass layoffs, claiming it wants to be more nimble. But are job losses really about US economic fears, and how Amazon’s retail business will be affected?]]></description><link>https://blog.pragmaticengineer.com/the-pulse-amazon-layoffs/</link><guid isPermaLink="false">690ccf1cece43400015a8f22</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 06 Nov 2025 16:40:34 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. 
Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-151?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the below article seven days ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><p><em>Many subscribers expense this newsletter to their learning and development budget. If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/" rel="noopener noreferrer nofollow"><em> an email you could send to your manager</em></a><em>.</em></p><hr><p>Online retail giant Amazon unexpectedly announced 14,000 job cuts earlier last week. The massive round of layoffs at the company follows other mass redundancies in recent years:</p><ul><li><strong>January 2023</strong>: 18,000 people <a href="https://newsletter.pragmaticengineer.com/i/70995338/amazon-to-lay-off-more-people-and-rescind-more-offers?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">laid off</a>.</li><li><strong>March 2023</strong>: another 9,000 people <a href="https://www.businessinsider.com/amazon-layoffs-second-round-9000-job-jobs-2023-3?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">let go</a></li><li><strong>November 2023</strong>: hundreds of people <a href="https://www.businessinsider.com/amazon-layoffs-memo-hundreds-job-cuts-alexa-agi-team-2023-11?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">let go</a> inside the Alexa team, as Amazon was looking to shift Alexa more toward GenAI</li><li><strong>April 2024</strong>: hundreds of people <a 
href="https://www.businessinsider.com/amazon-job-cuts-aws-roles-cloud-computing-division-aws-2024-4?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">let go</a> from AWS</li></ul><p>Software engineers, unfortunately, seem hit hard by the latest layoffs: of 2,300 employees laid off in Washington State, 25% are software engineers, GeekWire <a href="https://www.geekwire.com/2025/amazon-layoffs-hit-software-engineers-hardest-in-washington/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">reports</a>. <em>We can only speculate about the ratio across the rest of the company, but if cuts at HQ are heavy on engineering, then things don&#x2019;t look promising for other locations, sadly.</em></p><p><a href="https://www.aboutamazon.com/news/company-news/amazon-workforce-reduction?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">The memo</a> from Beth Galetti, Senior Vice President of People Experience and Technology, to workers didn&#x2019;t explain much:</p><blockquote>&#x201C;Some may ask why we&#x2019;re reducing roles when the company is performing well. Across our businesses, we&#x2019;re delivering great customer experiences every day, innovating at a rapid rate, and producing strong business results. What we need to remember is that the world is changing quickly. This generation of AI is the most transformative technology we&#x2019;ve seen since the Internet, and it&#x2019;s enabling companies to innovate much faster than ever before (in existing market segments and altogether new ones). We&#x2019;re convinced that we need to be organized more leanly, with fewer layers and more ownership, to move as quickly as possible for our customers and business.&#x201D;</blockquote><p>The statement is utterly confusing, as encapsulated by its message that &#x201C;business is great, but we need to do layoffs&#x201D;. Job cuts usually mean a business is in trouble, which obviously isn&#x2019;t the case for Amazon. 
So, why are these layoffs <em>really</em> happening?</p><h3 id="layoffs-to-boost-efficiency">Layoffs to boost efficiency?</h3><p>The company&#x2019;s memo states:</p><blockquote>&#x201C;We&#x2019;re convinced that we need to be organized more leanly, with fewer layers and more ownership, to move as quickly as possible for our customers and business.&#x201D;</blockquote><p>If this line of reasoning sounds familiar, it&#x2019;s because most of the layoffs in 2023 were justified the same way. The tech industry overhired during the pandemic in 2020-2021, making orgs more bloated and decision-making slower. In February 2023, I reported on <a href="https://newsletter.pragmaticengineer.com/p/the-scoop-38?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">the trend of fewer middle managers</a>, with Meta the first Big Tech giant to reduce its management layers. In 2023, most of Big Tech followed this approach with layoffs or reorgs. Managers acquired more reports, and tech companies cut down the number of layers between the CEO and individual contributors.</p><p>Given Amazon did other massive layoffs in 2023, it&#x2019;s unlikely they missed the industrywide trend for fewer managers. While the current layoffs seem to be targeting managers quite a bit &#x2013; from the Washington State layoffs, 20% of those let go <a href="https://www.geekwire.com/2025/amazon-layoffs-hit-software-engineers-hardest-in-washington/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">are managers</a> &#x2013; there are still more ICs laid off than managers, overall. So, this official explanation doesn&#x2019;t pass my personal &#x201C;smell test&#x201D;.</p><h3 id="layoffs-to-buy-more-gpus">Layoffs to buy more GPUs?</h3><p>The day after its jobs announcement, Amazon had more big news, this time about AI: it unveiled Project Rainier, the largest AI computing platform AWS has ever built. 
It already has 500,000 Trainium2 chips (built by Amazon). This capacity is already 70% larger than any AI computing platform in AWS&#x2019;s history, and Anthropic is using all of it (!!) to train its next models. Below is an image of one of the several Project Rainier data centers packed with Amazon GPUs:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/8e4b376d-4eb8-4c52-bcb4-aa935456e229_1082x1072.webp" class="kg-image" alt loading="lazy" width="1082" height="1072" srcset="https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w600/2025/11/8e4b376d-4eb8-4c52-bcb4-aa935456e229_1082x1072.webp 600w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/size/w1000/2025/11/8e4b376d-4eb8-4c52-bcb4-aa935456e229_1082x1072.webp 1000w, https://storage.ghost.io/c/39/f8/39f85cc7-8637-40fc-a57c-f45754453717/content/images/2025/11/8e4b376d-4eb8-4c52-bcb4-aa935456e229_1082x1072.webp 1082w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">The next generation of Claude models is trained in these data centers. Source: </em></i><a href="https://x.com/ajassy/status/1983616724642730217?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Amazon</em></i></a></figcaption></figure><p>Building data centers is incredibly capital-intensive: Amazon has <a href="https://www.cnbc.com/2025/10/29/amazon-opens-11-billion-ai-data-center-project-rainier-in-indiana.html?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">spent</a> $11B on Project Rainier alone. Even though it is very profitable, Amazon might want to invest <em>more</em> cash than it currently has into building data centers. 
So, one reason for job cuts could be to reallocate financial resources from paying salaries and compensation towards building more data centers.</p><p>Before doing the math, a couple of concepts are important to understand:</p><ul><li><strong>Free cash flow (FCF)</strong>: profit after payment for things like capital expenditure (CapEx), such as financing data centers. If a company wants to operate with as little debt as possible, FCF is usually very important. If Amazon wants to avoid loans and not touch its reserves, then data center investment would come from FCF, reducing FCF further.</li><li><strong>Cash reserves: </strong>a company&#x2019;s <em>liquid</em> reserve investments, usually an accumulation of investments in financial instruments like bonds and securities, or cash deposits.</li></ul><p>Let&#x2019;s run Amazon&#x2019;s numbers:</p><ul><li><strong>Cash reserves: $93B. </strong>This is how much Amazon has in reserve.</li><li><strong>FCF: $32B. </strong>This is the rough free cash flow Amazon has currently, as per its latest quarterly report. This is after deducting <em>current</em> data center investments.</li><li><strong>Savings from layoffs: $2-4B. </strong>This is my estimate of the rough total compensation of 14,000 employees.</li></ul><p>So, the savings from these layoffs wouldn&#x2019;t even pay for half of Project Rainier ($11B in total), and Amazon could easily build 3x Project Rainiers in the next year, without needing to dip into its savings! 
Of course, Amazon has its famous frugality principle, but this massive layoff of 14,000 people won&#x2019;t make a big difference to how much it can invest in data centers; it can already spend much more, if it wants!</p><h3 id="leanness-and-ai-fail-job-cuts-%E2%80%9Csmell-test%E2%80%9D">Leanness and AI fail job cuts &#x201C;smell test&#x201D;</h3><p>It&#x2019;s not only me who doesn&#x2019;t buy the explanation that these layoffs are to streamline the company, or to redirect resources to AI. 
<a href="https://www.linkedin.com/in/arneknudson/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Arne Knudson</a> worked at Amazon for nearly two decades, most recently as a software development manager (SDM), before leaving the company earlier this year. He <a href="https://www.linkedin.com/posts/arneknudson_in-my-18-years-at-amazon-i-went-through-activity-7388737736590909440-wJ1M?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAAAIk0KwBsmE3oBadWSg2ettxmEyKbqZKG34" rel="noopener noreferrer nofollow">shared</a> his analysis, with some insider detail:</p><blockquote>&#x201C;In my 18 years at Amazon, I went through a few layoffs and hiring freezes.<br><br>This is the first time I&#x2019;ve seen multiple years of significant layoffs essentially back-to-back. Even in the depths after the .com bubble, it wasn&#x2019;t this bad. They&#x2019;ve been laying people off now for almost 3 straight years.<br><br><strong>The explanation that this is downsizing after hiring too many at the height of the pandemic doesn&#x2019;t pass the smell test, at least to me. </strong>That was 3 years ago; they&#x2019;re not that dumb to keep those people around for 3 extra years. Those folks were laid off back in &#x2018;22.<br><br><strong>I&#x2019;m also not convinced that this is optimization due to AI.</strong> My degree&#x2019;s in AI, and I worked on AI stuff at Amazon; I don&#x2019;t think there&#x2019;s enough automation yet, and it&#x2019;s not accurate enough yet, to replace 30,000 people. The cost of inaccuracies seems too high. But I could be wrong; maybe they&#x2019;ve gotten their false negative &amp; false positive rates low enough to avoid too many region-wide AWS outages. (Or not.)<br><br>One of the articles I read said this was going to be in HR, and I can tell you as a former manager, my experience working with HR had been steadily worsening over the past 5-7 years. 
They outsourced so much of the work, overworked the people they had, and had such high turnover that I never knew who I was supposed to work with. When I needed to put someone on a performance plan or help a new hire receive some kind of accommodation, it seemed like it was a different person each time. If they really are laying off tens of thousands more HR folks, this is only going to get worse.<br><br><strong>And, I suspect, it means they don&#x2019;t plan on hiring MORE people in any of the business units for a year or more. </strong>So, by the smell-o-meter, this seems more significant than streamlining the workforce, improved AI, and &#x201C;nah, we don&#x2019;t need as many HR folks.&#x201D;</blockquote><h3 id="us-economy-to-blame-for-amazon-layoffs">US economy to blame for Amazon layoffs?</h3><p>It&#x2019;s safe to assume AWS as a business unit is doing just fine, as suggested by Project Rainier&#x2019;s existence and the agenda for building data centers. But how is the e-commerce side of the business performing, and what&#x2019;s its outlook?</p><p>If one business should have its finger on the pulse of the US economy, it&#x2019;s Amazon with its size and self-professed, relentless customer focus, providing a window into people&#x2019;s spending habits across the country. Flashing lights on the dashboard of the national economy may signal tough times ahead in e-commerce, which could be a reason to start cutting costs early.</p><p><strong>There are concerning signs from other sectors about the US economy. </strong>Below is a statement from the CEO of the restaurant chain Chipotle:</p><blockquote>&#x201C;Earlier this year, as consumer sentiment declined sharply, we saw a broad-based pullback in frequency across all income cohorts.<br><br>Since then, the gap has widened, with low to middle-income guests further reducing frequency. We believe that this guest with household income below $100,000, represents about 40% of our total sales. 
And, based on our data, they are <strong>dining out less often due to concerns about the economy, and inflation.</strong><br><br>A particularly challenged cohort is the 25- to 35-year-old age group. We believe that this trend is not unique to Chipotle and is occurring across all restaurants as well as many discretionary categories. This group is facing several headwinds, including unemployment, increased student loan repayment and slower real wage growth. We tend to skew younger and slightly over-indexed to this group relative to the broader restaurant industry&#x201D;.</blockquote><p>Chipotle is saying that everyone is eating out less, particularly 25-35 year olds, because of inflation. If people spend less on Chipotle because of rising prices, then they may also spend less in other areas of their lives for the same reason, including on Amazon.</p><p>In the e-commerce supply chain, there&#x2019;s evidence of this trend, which would mean delivery services like UPS have fewer parcels to deliver. 
Speaking of UPS, two days ago, it <a href="https://nypost.com/2025/10/28/business/ups-axes-48000-workers-in-sweeping-cost-cut-push-sparking-stock-surge/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">announced</a> massive layoffs:</p><blockquote>&#x201C;United Parcel Service (UPS) has slashed 48,000 jobs this year &#x2014; one of the largest single-year reductions by a US company since the pandemic &#x2014; as the package giant scrambled to contain costs and revive its lagging stock price.<br><br>The Atlanta-based delivery behemoth disclosed the reductions Tuesday while reporting third-quarter earnings that beat Wall Street expectations.<br><br>UPS said 34,000 of the cuts hit drivers and warehouse operations, while 14,000 targeted management (...)<br><br>UPS shares jumped nearly 9% in Tuesday afternoon trading, even as the company reported weaker revenue and profits&#x201D;.</blockquote><p>UPS&#x2019;s revenue is down on last year, which suggests that there are, indeed, fewer deliveries (or lower value ones). As with the latest job cuts at Amazon, these drastic layoffs could be explained by a lot of things, most easily by UPS expecting reduced trade in the future.</p><p><strong>If US consumer spending is trending down, then the e-commerce sector will be among the first to feel this. </strong>It could explain why Amazon is making these layoffs now. It can also explain why Google, Meta, and Microsoft might not be seeing their businesses impacted: they&#x2019;re not involved in retail like Amazon is, and the AI sector <em>is</em> very much booming.</p><p>Among all of Big Tech, Amazon is best positioned to detect changes in US consumer spending. Google&#x2019;s and Meta&#x2019;s revenue is more dependent on advertising, and Microsoft&#x2019;s more on enterprise spend. 
Like Amazon, Apple is well placed to feel market changes with its range of smartphones, watches, and other consumer tech.</p><p>I believe Amazon is highly commercially rational, so it&#x2019;s worth understanding the <em>actual</em> reason for its second major round of layoffs in just two years, following deep cuts in 2023. I&#x2019;d put my money on this reason being the economy, and how Amazon probably expects customers to cut back their spending everywhere, including on Amazon.</p><hr><p>This was one of the four topics covered in last week&#x2019;s The Pulse. <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-151?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Read the full article.</a></p><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-150-cursor-and-github-double?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>This week&#x2019;s The Pulse</strong></a> additionally covers:</p><ol><li><strong>Cursor and GitHub double down on agents. </strong>Each company is focusing on agents: Cursor with its multi-agent mode, and GitHub with its &#x201C;Agent HQ.&#x201D; Cursor is increasingly a direct rival to GitHub.</li><li><strong>Industry pulse. </strong>Meta rolls out AI-assisted interviews, Cursor and Windsurf believed to be using Chinese open source AI models, South Korean government pays price of ignoring backup &#x201C;101,&#x201D; startups growing much faster in the US than in Europe, companies using AI tools buy more JIRA seats, a neat uptime service called Updog, and more.</li><li><strong>OpenAI inflating the bubble? </strong>OpenAI signs another massive deal with AWS based on predicted growth, and seeks taxpayer protection to borrow more.</li><li><strong>Large tech companies struggle to build their AI integrations</strong>. Apple admits failure to modernize Siri by paying Google $1B per year for its LLM. 
Perplexity to pay Snap $400M to be its AI search interface.</li><li><strong>How much do Directors of Engineering earn at startups?</strong> Data from Carta says it&#x2019;s more than any other Director: $215-230K at companies valued at $25-250M.</li></ol><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-150-cursor-and-github-double?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Read the full issue here</a></p>]]></content:encoded></item><item><title><![CDATA[Comparing interviews at 8 large tech companies]]></title><description><![CDATA[Puneet Patwari applied to 8 major tech companies, and received 6 offers. He compares his interview experiences at Meta, Amazon, Uber, and 5 other workplaces]]></description><link>https://blog.pragmaticengineer.com/comparing-interviews-at-8-large-tech-companies/</link><guid isPermaLink="false">6903a79017b0a200016fa3a2</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 30 Oct 2025 18:00:42 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one topic from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>The Pulse #149</em></a><em>. Full subscribers received the below article two weeks ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><hr><p><a href="https://www.linkedin.com/in/puneet-patwari/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Puneet Patwari</a> recently accepted an offer to join Atlassian as a Principal Software Engineer. 
In three months, he did more than 60 interviews at 11 companies, he told me &#x2013; while dropping out of 3 more interview processes after accepting the Atlassian offer, including that of Meta. Following that endeavour, he has <a href="https://www.linkedin.com/posts/puneet-patwari_i-interviewed-at-google-uber-walmart-amazon-activity-7379375609354883072-Rkq1/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">compared</a> the interview processes of the largest companies:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!fd6d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F411215af-a63b-411f-9192-d6a7ef71481e_1390x1236.png" class="kg-image" alt loading="lazy" width="1390" height="1236"><figcaption><i><em class="italic" style="white-space: pre-wrap;">What each interview process was like. Source: </em></i><a href="https://www.linkedin.com/posts/puneet-patwari_i-interviewed-at-google-uber-walmart-amazon-activity-7379375609354883072-Rkq1/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Puneet Patwari</em></i></a></figcaption></figure><p>A few more observations that Puneet shared with me:</p><blockquote><strong>Amazon</strong>: the Amazon Hiring Manager round was one of the most unique I ever experienced. We got so engrossed in the discussion that it took 160 minutes instead of the scheduled 60 minutes! We had to take a break in between the interview process.<br><br><strong>Atlassian</strong>: The leadership craft (LC) &amp; values were two interview rounds which were very crucial in determining that I&#x2019;ll be levelled at the Principal level. Of course, the Systems Design interview was also key here. 
Atlassian puts a lot of emphasis on LC for Principal engineers.<br><br><strong>Salesforce</strong>: the system design round was based on the <em>actual</em> job requirement. It was a migration problem where the interviewer wanted to check if I can own a project end-to-end with customers at the centre of it.<br><br><strong>Confluent</strong>: when I say it was the most mentally demanding interview, what I mean is how every skill was tested with two interviews! So 2x data structures and algorithms (DSA), 2x System Design 2x behavioural interview rounds.<br><br><strong>I cannot stress enough how important behavioural interviews are at the Staff+ levels. </strong>Doing well on these interviews were decisive in getting Staff and Principal-level offers. Of course, you needed to do well on coding and systems design: but my sense was that the behavioural parts were make or break for levelling and getting an offer.</blockquote><p>A few things stand out to me from Puneet&#x2019;s account of his interviews at leading tech companies:</p><ul><li><strong>Algorithmic coding interviews are everywhere! </strong>For senior+ positions, you need to get really good at these, including challenging topics like dynamic programming. We cover how to perform well in these in the article, <a href="https://newsletter.pragmaticengineer.com/p/how-to-get-unstuck-during-coding-interviews?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">How experienced engineers get unstuck in coding interviews</a>.</li><li><strong>Interviews are tough and time-consuming. </strong>Even after Puneet had offers, no company shortened their process. Puneet had to decline 3 more interviews &#x2013; including one at Meta &#x2013; because by the time those interviews would have come around, he had already accepted the Atlassian offer.</li><li><strong>In a tough job market, &#x201C;top&#x201D; candidates are still in demand. 
</strong>We&#x2019;ve covered <a href="https://newsletter.pragmaticengineer.com/p/state-of-the-tech-market-in-2025-hiring-managers?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">how challenging the current tech labor market is for jobseekers</a>, but Puneet interviewed at 11 companies and got 6 offers. His applications had to have a lot going for them to pass resume screenings: 10+ years of experience, and working as a Senior Software Engineer at Microsoft. He also showed up <em>really</em> well prepared.</li><li><strong>Bad luck can strike at any time</strong>. Puneet&#x2019;s interview experience at Uber seems to have been a bit unlucky: the interviewer came across as rigid and not open to dialogue. Perhaps they were having a tough day, or wanted to get the interview over with. Or it could be what Steve Yegge describes as the <a href="https://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">interviewer anti-loop</a>.</li></ul><p>Congrats to Puneet for accepting the Atlassian position, and thanks for sharing all these learnings!</p><hr><p>This was one of five topics covered in <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">The Pulse #149</a>. The full edition additionally covers:</p><ul><li><strong>New trend: programming by kicking off parallel AI agents. 
</strong>More devs are experimenting with kicking off coding agents in parallel</li><li><strong>ACP protocol.</strong> A new protocol built by the Zed team, which tries to make it easier to build AI tooling for IDEs than MCP allows</li><li><strong>AI security tooling works surprisingly well?</strong> AI-powered security tools seem good at identifying security flaws in mature open source projects</li><li><strong>Is AI the only engine of US economic growth?</strong> Forty percent of US GDP growth this year is based on AI-related spend, while 60% of venture capital goes into AI. Hopefully, it won&#x2019;t end up as a bubble which bursts like in 2001</li></ul><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Read the full issue here</a>, and check out <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-151?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">today&#x2019;s The Pulse here</a>.</p>]]></content:encoded></item></channel></rss>