<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[The Pragmatic Engineer]]></title><description><![CDATA[Observations across the software engineering industry.]]></description><link>https://blog.pragmaticengineer.com/</link><image><url>https://blog.pragmaticengineer.com/favicon.png</url><title>The Pragmatic Engineer</title><link>https://blog.pragmaticengineer.com/</link></image><generator>Ghost 6.20</generator><lastBuildDate>Wed, 04 Mar 2026 12:01:36 GMT</lastBuildDate><atom:link href="https://blog.pragmaticengineer.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[I replaced a $120/year micro-SaaS in 20 minutes with LLM-generated code]]></title><description><![CDATA[ I used to pay $120/year for a SaaS that hasn’t added new features in four years, and didn’t fix its broken billing system for three years. Using an LLM, I managed to rewrite all the functionality I used to pay for in 20 minutes. Is this bad news for “write once, don’t update later” SaaS?]]></description><link>https://blog.pragmaticengineer.com/i-replaced-a-120-year-micro-saas-in-20-minutes-with-llm-generated-code/</link><guid isPermaLink="false">697ba13c7779050001e3775d</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 29 Jan 2026 18:41:45 GMT</pubDate><content:encoded><![CDATA[<p>I have been sceptical of the manifold claims that software-as-a-service (SaaS) will be killed by LLMs. The theory behind this idea is:</p><ol><li>SaaS is a pure software product. 
People who pay SaaS vendors do so because it&#x2019;s cheaper to buy this software than build it.</li><li>LLMs dramatically reduce the time and cost of building custom software.</li><li>Therefore, most SaaS vendors will go out of business because most companies/teams will prompt an LLM to write the software they need, such as for ticketing, meetings, customer relationship management, etc.</li></ol><p>The reason for my scepticism has been that SaaS such as HR software Workday is&#xA0;<em>more</em>&#xA0;than just software. Workday, for example, keeps up with compliance requirements (e.g., for holiday pay in different countries), guarantees correctness (e.g., payslips that comply with local regulations), and over time the software keeps up to date with changes in the external and internal environments.</p><p><strong>However, this week I had first-hand experience of how ridiculously easy it is now to replace SaaS with LLMs.&#xA0;</strong>On my website &#x2013;&#xA0;<a href="http://pragmaticengineer.com/?ref=blog.pragmaticengineer.com">pragmaticengineer.com</a>&#xA0;&#x2013; I have a testimonials section, which displays real LinkedIn and X posts about this publication. 
It cost $120/year for a small service called&#xA0;<a href="https://shoutout.io/?ref=blog.pragmaticengineer.com">Shoutout.io</a>, and looked like this:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2026/01/image.png" class="kg-image" alt loading="lazy" width="1390" height="1120" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2026/01/image.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2026/01/image.png 1000w, https://blog.pragmaticengineer.com/content/images/2026/01/image.png 1390w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Testimonials, nicely collected and rendered by Shoutout</em></i></figcaption></figure><p>And this is the backend: nothing fancy, just a way to add, edit, reorganize, and delete testimonials.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2026/01/image-1.png" class="kg-image" alt loading="lazy" width="1456" height="922" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2026/01/image-1.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2026/01/image-1.png 1000w, https://blog.pragmaticengineer.com/content/images/2026/01/image-1.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Shoutout&#x2019;s admin interface</span></figcaption></figure><p>I was a customer for four years and logged in perhaps once a year. My latest login was to get an annual invoice for my expenses. Unfortunately, the billing section was broken, so I emailed support and they sent me a broken link instead of the invoice. This was frustrating: why pay for a SaaS with broken billing? 
I couldn&#x2019;t even tell what they would charge me next year.</p><p><strong>So I asked myself if I could rebuild my own use case with an LLM, and do it rapidly.&#xA0;</strong>My use case was much simpler than the SaaS itself:</p><ul><li>Display existing testimonials in a similar way</li><li>Make it easy to add new ones, e.g., store testimonials in some JSON format</li><li>Make it look good</li></ul><p>To my surprise, this whole effort from start to finish took exactly 20 minutes with Codex. The steps I took were straightforward enough:</p><ul><li>Asked Codex to make a plan on how to remove this third-party dependency and host all testimonials in my codebase (a GitHub repo, deployed onto Netlify)</li><li>Tweaked the plan: I pushed for a modular approach where testimonials are in a separate JSON file, and they get generated into HTML with a compile-time build step</li><li>Added this build step both locally and as a build trigger on Netlify</li><li>Tested the solution</li><li>Tweaked the UX and generated a schema</li><li>Deployed it</li></ul><p>The end result is visually the same as before, except I no longer have a third-party dependency rendering all of this!</p><h3 id="what-does-this-mean-for-saas-products-and-software-engineers">What does this mean for SaaS products and software engineers?</h3><p>What it means for software engineers:</p><ul><li><strong>Devs are (probably) a lot more comfortable using the command line for future updates than regular users.&#xA0;</strong>To add a future testimonial, I&#x2019;ll need to turn to my AI agent to insert it in my codebase, and I&#x2019;ll then need to verify that it looks good. This is not a big deal for me, but it might be a dealbreaker for someone not comfortable with verifying the code output of an LLM.</li><li><strong>It&#x2019;s a lot faster for a dev to &#x201C;port&#x201D; a SaaS than for anyone else.&#xA0;</strong>I first told Codex to copy the UI and it got things wrong because it tried to use a flexbox model. 
I had to tell it that this UI layout was not what I wanted, and then make the decision on which framework to use for the UI layout. A non-developer could probably figure all this out, but it would take longer.</li><li><strong>Honestly, it&#x2019;s fun and interesting to rewrite a third-party feature. I recommend it.&#xA0;</strong>Part of why I took on this project is because I expected it to be an interesting challenge. I thought the effort would be more than what it was, and I&#x2019;ve learned more about how well these tools work. I also used Codex in order to experience it more.</li></ul><p>What this could mean for SaaS software:</p><ul><li><strong>Rebuilding a SaaS still feels much harder than rebuilding&#xA0;<em>your specific</em>&#xA0;use case.&#xA0;</strong>I did not &#x201C;rebuild&#x201D; Shoutout in any way. Shoutout has 10x or more features, like adding quotes from 10 different platforms, authentication, billing (which didn&#x2019;t work for me), and more.</li><li><strong>A SaaS that doesn&#x2019;t give ongoing value is at risk of being replaced by customers.&#xA0;</strong>Shoutout doesn&#x2019;t provide ongoing value after it displays my testimonials, and this static nature means it&#x2019;s easy to replace. In contrast, it would be harder to rebuild if I paid for the platform to stay compliant, provide analytics or alerting, and do other real-time things that helped my business.</li><li><strong>Buying and selling SaaS businesses could become less profitable.&#xA0;</strong>The original version of Shoutout that I signed up for in 2021 was built in 2020 by an independent developer. In 2022, this developer&#xA0;<a href="https://www.indiehackers.com/post/my-startup-shoutout-has-been-acquired-0350ae659c?ref=blog.pragmaticengineer.com">sold this micro-SaaS</a>&#xA0;to a product studio. Then, in 2025, Shoutout&#xA0;<a href="https://x.com/davidsonkyle/status/1942207611006542317?s=20&amp;ref=blog.pragmaticengineer.com">was sold</a>&#xA0;again to new developers. 
From my point of view, nothing changed except that the billing system broke. I assume the buyers of this SaaS figured that revenue could keep rising with zero investment. But perhaps at some point that ceases to be true when people get fed up with a broken product and quit &#x2013; especially when doing so is cheaper.</li></ul><p><strong>&#x201C;Broken windows&#x201D; not being fixed is less acceptable than it used to be.&#xA0;</strong>My journey away from Shoutout began with its billing system being broken. For example, below is what I saw when I went to my billing section to see the invoices:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2026/01/image-2.png" class="kg-image" alt loading="lazy" width="1220" height="428" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2026/01/image-2.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2026/01/image-2.png 1000w, https://blog.pragmaticengineer.com/content/images/2026/01/image-2.png 1220w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">A trigger to quit: Billing had been broken since 2023 and was never fixed</span></figcaption></figure><p>As well as this, the customer support sent me a broken link in response to my email. That was enough for me to decide to replace this dependency, and I was surprised by how easy this was with an LLM and knowing what I wanted it to build.&#xA0;<em>By the time customer support sent me a working link two hours later, I had finished migrating off the SaaS.</em></p>]]></content:encoded></item><item><title><![CDATA[The grief when AI writes most of the code]]></title><description><![CDATA[When AI writes almost all code, what happens to software engineering? 
There is grief involved for us developers, that's for sure.]]></description><link>https://blog.pragmaticengineer.com/the-grief-when-ai-writes-most-of-the-code/</link><guid isPermaLink="false">695eab59af96490001536b9c</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Wed, 07 Jan 2026 18:53:57 GMT</pubDate><content:encoded><![CDATA[<p>I&#x2019;m coming to terms with the high probability that AI will write most of&#xA0;<em>my</em>&#xA0;code which I ship to prod, going forward. It already does it faster, and with similar results to if I&#x2019;d typed it out. For languages/frameworks I&#x2019;m less familiar with, it does a better job than me.</p><p>It feels like something valuable is being taken away, and suddenly. It took a&#xA0;<em>lot</em>&#xA0;of effort to get good at coding and to learn how to write code that works, to read and understand complex code, and to debug and fix when code doesn&#x2019;t work as it should. I still remember how daunting my first &#x201C;real&#x201D; programming class was at university (learning C), how lost I felt on my first job with a complex codebase, and how it took years of practice, learning from other devs, books, and blogs, to get better at the craft. Once you&#x2019;re pretty good, you have something that&#x2019;s valuable and easy to validate by writing code that works!</p><p>Some of my best memories of building software are about coding. Being &#x201C;locked in&#x201D; and balancing several ideas while typing them out, of being in the zone, then compiling the code, running it and seeing that &#x201C;<em>YES&#x201D;,</em>&#xA0;it worked as expected!</p><p>It&#x2019;s been a love-hate relationship, to be fair, based on the amount of focus needed to write complex code. 
Then there&#x2019;s all the conflicts that time estimates caused: time passes differently when you&#x2019;re locked in and working on a hard problem.</p><p>Now, all that looks like it will be history.</p><p>I wonder if I&#x2019;ll still get the same sense of satisfaction from the fact that writing complicated code is&#xA0;<em>hard</em>? Yes, AI is convenient, but there&#x2019;s also a loss.</p><p>Or perhaps with AI agents, being &#x201C;in the zone&#x201D; will shift to thinking about higher-level problems, while instructing more complex code to be written?</p><hr><p>This was a section from my analysis piece <a href="https://newsletter.pragmaticengineer.com/p/when-ai-writes-almost-all-code-what?ref=blog.pragmaticengineer.com" rel="noreferrer">When AI writes almost all code, what happens to software engineering?</a>. Read the full one <a href="https://newsletter.pragmaticengineer.com/p/when-ai-writes-almost-all-code-what?ref=blog.pragmaticengineer.com" rel="noreferrer">here</a>.</p>]]></content:encoded></item><item><title><![CDATA[The Pulse: Cloudflare’s latest outage proves dangers of global configuration changes (again)]]></title><description><![CDATA[Deja vu: a large Cloudflare outage caused by an instantly rolled-out global config change – two weeks after a similar problem]]></description><link>https://blog.pragmaticengineer.com/the-pulse-cloudflares-latest-outage/</link><guid isPermaLink="false">69443c5d272393000120055e</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 18 Dec 2025 17:44:21 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-156?ref=blog.pragmaticengineer.com" rel="noreferrer"><em><u>last week&#x2019;s The Pulse</u></em></a><em> issue. 
Full subscribers received the below article seven days ago. If you&#x2019;ve been forwarded this email, you can</em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em> <u>subscribe here</u></em></a><em>.</em></p><p>A mere two weeks after <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Cloudflare suffered a major outage</a> and took down half the internet, the same thing has happened again. Last Friday, 5th December, thousands of sites went down or partially down once more, in a global Cloudflare outage lasting 25 minutes.</p><p>As per last time, Cloudflare was speedy to share <a href="https://blog.cloudflare.com/5-december-2025-outage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">a full postmortem</a> on the same day. It estimated that 28% of Cloudflare&#x2019;s HTTP traffic was impacted. The cause of this latest outage was Cloudflare making a seemingly innocent &#x2013; but <em>global</em> &#x2013; configuration change that went on to take out a good portion of Cloudflare, <em>globally</em>, until being reverted. Here&#x2019;s what happened:</p><ul><li>Cloudflare was rolling out a fix for a nasty React security vulnerability</li><li>The fix caused an error in an internal testing tool</li><li>The Cloudflare team disabled the testing tool with a global killswitch</li><li>As this global configuration change was made, the killswitch unexpectedly caused a bug that resulted in HTTP 500 errors across Cloudflare&#x2019;s network</li></ul><p><strong>In this latest outage, Cloudflare was burnt by yet another global configuration change. </strong>The previous outage <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">in November</a> happened thanks to a global database permissions change. 
In the postmortem of that incident, the Cloudflare team closed with this action item:</p><blockquote>&#x201C;Hardening ingestion of Cloudflare-generated configuration files in the same way we would for user-generated input&#x201D;</blockquote><p>This change would stop Cloudflare&#x2019;s configuration files from propagating immediately to the full network, as they still do now. But giving <em>all</em> global configuration files staged rollouts is a large undertaking that could take months. Evidently, there wasn&#x2019;t time to complete it, and that has come back to bite Cloudflare.</p><p>Unfortunately for Cloudflare, customers are likely to find a second outage unacceptable when its causes are similar to those of one only weeks earlier. If Cloudflare proves unreliable, customers should plan to onboard to <em>backup</em> CDNs at the very least, and a backup CDN vendor will do its best to convince new customers to use it as the primary CDN.</p><p>Cloudflare&#x2019;s value-add rests on rock-solid reliability without customers needing to budget for a backup CDN. Yes, publishing postmortems on the same day as an outage occurs helps restore trust, but that trust will crumble anyway with repeated large outages.</p><p><strong>To be fair, the company is doubling down on implementing staged configuration rollouts. </strong>In its postmortem, Cloudflare is its own biggest critic. CTO Dane Knecht <a href="https://blog.cloudflare.com/5-december-2025-outage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">reflected</a>:</p><blockquote>&#x201C;[Global configuration changes rolling out globally] remains our first priority across the organization. 
In particular, the projects outlined below should help contain the impact of these kinds of changes:<br><br><strong>Enhanced Rollouts &amp; Versioning:</strong> Similar to how we slowly deploy software with strict health validation, data used for rapid threat response and general configuration needs to have the same safety and blast mitigation features. This includes health validation and quick rollback capabilities among other things.<br><br><strong>Streamlined break glass capabilities: </strong>Ensure that critical operations can still be achieved in the face of additional types of failures. This applies to internal services as well as all standard methods of interaction with the Cloudflare control plane used by all Cloudflare customers.<br><br><strong>&#x201C;Fail-Open&#x201D; Error Handling: </strong>As part of the resilience effort, we are replacing the incorrectly applied hard-fail logic across all critical Cloudflare data-plane components. If a configuration file is corrupt or out-of-range (e.g., exceeding feature caps), the system will log the error and default to a known-good state or pass traffic without scoring, rather than dropping requests. Some services will likely give the customer the option to fail open or closed in certain scenarios. This will include drift-prevention capabilities to ensure this is enforced continuously.<br><br>These kinds of incidents, and how closely they are clustered together, are not acceptable for a network like ours&#x201D;.</blockquote><h3 id="global-configuration-errors-often-trigger-large-outages">Global configuration errors often trigger large outages</h3><p>There&#x2019;s a pattern of implicit or explicit global configuration errors causing large outages, and some of the biggest ones in recent years were caused by a single change being rolled out to a whole network of machines:</p><ul><li><strong>DNS and DNS-related systems like BGP:</strong> DNS changes are global by default, so it&#x2019;s no wonder that DNS changes can cause global outages. 
Meta&#x2019;s <a href="https://en.wikipedia.org/wiki/2021_Facebook_outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">7-hour outage in 2021</a> was related to DNS changes (more specifically, Border Gateway Protocol changes.) Meanwhile, the AWS outage in October <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-aws-takes-down-a-good-part?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">started with</a> the internal DNS system.</li><li><strong>OS updates happening at the same time, globally: </strong>Datadog&#x2019;s <a href="https://newsletter.pragmaticengineer.com/p/inside-the-datadog-outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">2023 outage</a> cost the company $5M and was caused by Datadog&#x2019;s Ubuntu machines executing an OS update within the same time window, globally. It caused issues with networking, and it didn&#x2019;t help that Datadog ran its infra on 3 different cloud providers across 3 networks. The same kind of Ubuntu update also <a href="https://newsletter.pragmaticengineer.com/p/why-reliability-is-hard-at-scale?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">caused a global outage</a> for Heroku in 2024.</li></ul><p><strong>Globally replicating configs: </strong><a href="https://newsletter.pragmaticengineer.com/i/168964142/google-cloud-globally-replicating-a-config-triggers-worldwide-outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">in 2024</a>, a configuration policy change was rolled out globally and crashed every Spanner database node straight away. 
As Google concluded in <a href="https://newsletter.pragmaticengineer.com/i/168964142/google-cloud-globally-replicating-a-config-triggers-worldwide-outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">its postmortem</a>: &#x201C;Given the global nature of quota management, this metadata was replicated globally within seconds&#x201D;.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/12/image.png" class="kg-image" alt loading="lazy" width="1456" height="970" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/12/image.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/12/image.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/12/image.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Step 2 &#x2013; replicating a configuration file globally across GCP &#x2013; </em></i><a href="https://newsletter.pragmaticengineer.com/i/168964142/google-cloud-globally-replicating-a-config-triggers-worldwide-outage?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">caused a global outage</em></i></a><i><em class="italic" style="white-space: pre-wrap;"> in 2024</em></i></figcaption></figure><p>Implementing gradual rollouts for <em>all</em> configuration files is a <em>lot</em> of work. It&#x2019;s also invisible labor: when done well, the only sign of its benefits is an absence of incidents, thanks to better infrastructure!</p><p><strong>The largest systems in the world will likely have to implement safer ways to roll out configs &#x2013; but not everybody needs to. 
</strong>Staged configuration rollout doesn&#x2019;t make much sense for smaller companies and products because this infra work slows down product development.</p><p>It doesn&#x2019;t just slow down building, but every deployment, too, and this friction is there by design. As such, staged rollouts don&#x2019;t make much sense unless the stability of mature systems is more important than fast iteration.</p><p>Software engineering is a field where tradeoffs are a fact of life, and universal solutions don&#x2019;t exist. The development approach that worked a year ago for a system with 1/100th of the load and users may not make sense today.</p><p><em>This was one out of the four topics covered in this week&#x2019;s The Pulse. </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-156?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>The full edition</em></a><em> additionally covers:</em></p><ol><li><strong>Industry Pulse.&#xA0;</strong>Poor capacity planning at AWS, Meta moves to a &#x201C;closed AI&#x201D; approach, a looming RAM shortage, early-stage startups hiring slower than before, how long it takes to earn $600K at Amazon and Meta, Apple loses execs to Meta, and more</li><li><strong>How the engineering team at Oxide uses LLMs.&#xA0;</strong>They find LLMs great for reading documents and lightweight research, mixed for coding and code review, and a poor choice for writing documents &#x2013; or any kind of writing, really!</li><li><strong>Linux officially supports Rust in the kernel.&#xA0;</strong>Rust is now a first-class language inside the Linux kernel, eight months after a Linux Foundation Fellow&#xA0;<a href="https://newsletter.pragmaticengineer.com/p/how-linux-is-built-with-greg-kroah?ref=blog.pragmaticengineer.com">predicted</a>&#xA0;more support for Rust. 
A summary of the pros and cons of Rust support for Linux</li></ol><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-156?ref=blog.pragmaticengineer.com" rel="noreferrer"><strong>Read the full The Pulse issue</strong></a><strong>.</strong></p>]]></content:encoded></item><item><title><![CDATA[The Pulse: Could a 5-day RTO be around the corner for Big Tech?]]></title><description><![CDATA[From next February, workers at Instagram must be in the office, five days a week. This makes Meta the second tech giant after Amazon to mandate a 5-day RTO. Will more big companies do the same?]]></description><link>https://blog.pragmaticengineer.com/the-pulse-could-a-5-day-rto-be-around-the-corner-for-big-tech/</link><guid isPermaLink="false">693b1247dd0e8a0001c79f46</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Sat, 13 Dec 2025 15:21:25 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-155?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the below article seven days ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><hr><p>A year ago, Amazon became the first tech giant to bring staff back into the office for the full five days per week. 
Back then, I&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/149104874/what-does-amazons-day-rto-mean-for-tech?ref=blog.pragmaticengineer.com">analyzed</a>&#xA0;the reasons for the change, and whether other workplaces would follow suit by dropping the widespread hybrid policy of 2-3 days/week in the office.</p><p>Now, Meta employees in the Instagram division have become the latest subjects of a full return to the office, following an announcement by the social media platform this week.</p><h3 id="instagram%E2%80%99s-5-day-return-to-office">Instagram&#x2019;s 5-day return to office</h3><p>Instagram employees&#xA0;<a href="https://sources.news/p/instagrams-return-to-office-mandate?ref=blog.pragmaticengineer.com">received the unexpected email on Monday</a>, reports fellow Substacker, Alex Heath, who acquired a copy of the message. It was sent internally by Instagram CEO Adam Mosseri, who wrote:</p><blockquote>&#x201C;<strong>1. Back to the office:</strong>&#xA0;I believe that we are more creative and collaborative when we are together in-person. (...)<br><br><strong>2. Fewer meetings:</strong>&#xA0;We all spend too much time in meetings that are not effective, and it&#x2019;s slowing us down. Every six months, we&#x2019;ll cancel all recurring meetings and only re-add the ones that are absolutely necessary (...)<br><br><strong>3. More demos, less decks:</strong>&#xA0;Most product overviews should be prototypes instead of decks.<br><br><strong>4. 
Faster decision-making:</strong>&#xA0;We&#x2019;re going to have a more formalized unblocking process with DRIs, and I&#x2019;ll be at the priorities progress unblocking meeting every week.&#x201D;</blockquote><p>This decision by Meta affects around a quarter of company staff, and it&#x2019;s hard to imagine other divisions not following Instagram&#x2019;s lead; after all, everything in Mosseri&#x2019;s memo likely applies across the business.</p><p>Five years ago, CEO Mark Zuckerberg predicted 50% of Meta staff would work remotely by now, which didn&#x2019;t happen. Indeed, with Instagram&#x2019;s new 5-day RTO, I&#x2019;d be surprised if 5% of Meta folks work remotely in two years&#x2019; time.</p><p><strong>The reason for Insta&#x2019;s RTO seems rooted in the leadership&#x2019;s belief that in-office is more productive,&#xA0;</strong>as indicated by the top bullet point of Mosseri&#x2019;s message. That message in full:</p><p>&#x201C;I believe that we are more creative and collaborative when we are together in-person. I felt this pre-COVID and I feel it any time I go to our New York office where the in-person culture is strong.</p><p>Starting February 2, I&#x2019;m asking everyone in my rollup based in a US office with assigned desks to come back full time (five days a week). The specifics:</p><ul><li>You&#x2019;ll still have the flexibility to work from home when you need to, since I recognize there will be times you won&#x2019;t be able to come into the office. I trust you all to use your best judgment in figuring out how to adapt to this schedule.</li><li>In the NY office, we won&#x2019;t expect you to come back full time until we&#x2019;ve alleviated the space constraints. We&#x2019;ll share more once we have a better sense of timeline.</li><li>In MPK [Menlo Park, the HQ], we&#x2019;ll move from MPK21 to MPK22 on January 26 so everyone has an assigned desk. 
We&#x2019;re also offering the option to transfer from the MPK to SF office for those people whose commute would be the same or better with that change. We&#x2019;ll reach out directly to those people with more info.</li><li>XFN [cross-functional] partners will continue to follow their own org norms.</li><li>There is no change for employees who are currently remote&#x201D;.</li></ul><p>From what I&#x2019;ve seen of Mosseri from afar, he seems like a pretty straight shooter. It&#x2019;s clear that he feels in-office creates more energy, and in Mosseri&#x2019;s defense, I hear similar from many startup founders and leaders who say remote work causes a bunch of headaches: it&#x2019;s harder to spot motivational problems and performance issues, information travels more slowly, and rallying teams is harder.</p><p><strong>There&#x2019;s no doubt that running a full-remote company is a lot of effort.&#xA0;</strong>There&#x2019;s often-overlooked labor involved in hiring, onboarding, performance management, team celebrations, and even company-wide meetings &#x2013; none of it is easy.</p><p>Linear is a full-remote company with nearly 50 people working there, which&#xA0;<a href="https://linear.app/now/designing-remote-work-at-linear?ref=blog.pragmaticengineer.com">recently published details about how it operates</a>. They&#x2019;re introducing the concept of &#x201C;coworking hubs&#x201D;, flying in teams for in-person events, and holding regular off-sites, while being careful to hire people who fit the culture.</p><p><strong>My feeling is that remote work policies at tech companies are going to become questions of their leaders&#x2019; preferences.&#xA0;</strong>Many devs prefer remote work: there&#x2019;s fewer interruptions, more deep focus, and less commuting. 
Most of us would probably be just as productive &#x2013; and probably more so &#x2013; than when being interrupted in-office.</p><p>Leaders who prefer full-remote can cite flexibility and easier hiring from a larger pool of candidates as clear benefits. Meanwhile, those most comfortable with in-person will always have enough reasons to justify a 5-day RTO, along the lines of Mosseri&#x2019;s reasoning. Advocates of hybrid setups cite balancing of focus time and efficiency.</p><p>In today&#x2019;s job market, any company that pays closer to the top of the market can probably get away with five-days-a-week RTO. Meta is in this space, and although I&#x2019;m sure plenty of devs will dislike the change, the alternative is to go out on the job market, accept a pay cut to join a new company, and start rebuilding your internal network.</p><p>Since we&#x2019;re in the&#xA0;<a href="https://newsletter.pragmaticengineer.com/p/state-of-the-tech-market-in-2025?ref=blog.pragmaticengineer.com">midst of a weird job market</a>, it makes switching jobs more difficult than before, when the job market was very hot. In this respect, Instagram has external conditions on its side. For devs at Meta, one upside is that Big Tech experience&#xA0;<a href="https://newsletter.pragmaticengineer.com/p/tech-jobs-market-2025-part-3?ref=blog.pragmaticengineer.com">opens more doors</a>, even in this tough job market.</p><p>One caveat is that a 5-day RTO is unlikely in places where it&#x2019;s hard to hire the right people. So, AI engineers and those working on AI products should be pretty safe, for instance, because those roles are&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/172584839/ai-engineering-trends?ref=blog.pragmaticengineer.com">incredibly in-demand</a>, as indicated by the&#xA0;<a href="https://newsletter.pragmaticengineer.com/i/165280420/new-trend-higher-base-salaries-for-ai-engineers?ref=blog.pragmaticengineer.com">trend of higher base salaries for AI engineers</a>. 
Based on that, few companies should want to push those workers to quit to join competitors.</p><p><em>Many subscribers expense this newsletter to their learning and development budget. If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/" rel="noopener noreferrer nofollow"><em> an email you could send to your manager</em></a><em>.</em></p>]]></content:encoded></item><item><title><![CDATA[Downdetector and the real cost of no upstream dependencies]]></title><description><![CDATA[During the Cloudflare outage, Downdetector was also unavailable. I got details from the team about why they have a hard dependency on Cloudflare, and why that won’t change anytime soon.]]></description><link>https://blog.pragmaticengineer.com/downdetector-and-the-real-cost-of-no-upstream-dependencies/</link><guid isPermaLink="false">6932a20b097ffa00013da35c</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Fri, 05 Dec 2025 09:14:50 GMT</pubDate><content:encoded><![CDATA[<p><em>The below is one out of five topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>The Pulse #154.</em></a><em> Full subscribers received the below article two weeks ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em><u>subscribe here</u></em></a><em>.</em></p><p><em>Many subscribers expense The Pragmatic Engineer Newsletter to their learning and development budget. 
If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/"><em><u> an email you could send to your manager</u></em></a><em>.</em></p><hr><p>One amusing detail of the <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noreferrer">November 2025 Cloudflare outage</a> is that the realtime outage and monitoring service, Downdetector, went down, revealing a key dependency on Cloudflare. At first, this looks odd; after all, Downdetector is about monitoring uptime, so why would it take on a key dependency like Cloudflare if it means this can happen?</p><p><strong>Downdetector was built multi-region and multi-cloud,</strong>&#xA0;which<strong>&#xA0;</strong>I confirmed by talking with Senior Director of Engineering,&#xA0;<a href="https://x.com/damndhruv?ref=blog.pragmaticengineer.com">Dhruv Arora</a>, at Ookla, the company behind Downdetector. Multi-cloud resilience makes little sense for most products, but Downdetector was built to detect cloud provider outages, as well. And for this, they needed to be multi-cloud!</p><p>Still, Downdetector uses Cloudflare for DNS, Content Delivery (CDN), and Bot Protection. So, why would it take on this one key dependency, as opposed to hosting everything on its own servers?</p><p><strong>A CDN has advantages that are hard to ignore,&#xA0;</strong>such as:</p><ul><li>Drastically lower bandwidth costs &#x2013; assets cached on the CDN are much faster</li><li>Faster load times because assets on a CDN are served from Edge nodes nearer users</li><li>Protection from sudden traffic spikes, as would be common for Downdetector, especially during outages! 
Without a CDN, those spikes could overload their services</li><li>DDoS protection against bad actors trying to take the site offline with a distributed denial of service attack</li><li>Reduced infrastructure requirements, as Downdetector can run on fewer servers</li></ul><p>Downdetector&#x2019;s usage patterns reflect that it&#x2019;s a service very heavily used by consumers whom the business doesn&#x2019;t really monetize (Downdetector is free to use). So, Downdetector could get rid of Cloudflare, but costs would surge, the site would become slower to load, and revenue wouldn&#x2019;t change.</p><p>In the end, Downdetector&#x2019;s dependence on Cloudflare could be a pragmatic choice based on the business model, given how expensive removing its upstream dependency on Cloudflare could get!</p><p>Dhruv confirmed this and shared more about the design choices at Downdetector:</p><blockquote>&#x201C;<strong>Building redundancy at the DNS &amp; CDN layers would require enormous overhead.</strong>&#xA0;This is especially true as Cloudflare&#x2019;s Bot Protection is world-class, and building similar functionality would be a lot of effort. There are hyperscalers [cloud providers] that have this kind of redundancy built in. We will look into what we can do, but with a team size in the double digits, building up a core piece of infra like this is a pretty tall order: not just for us, but for any mid-sized team.<br><br>We&#x2019;ve learned that there are more things that we can improve, for the future. For example, during the outage, the Cloudflare control pane was down, but their API wasn&#x2019;t. So, us having more Infrastructure as Code could have helped bring back Downdetector sooner.<br><br>On our end, we also noticed that the outage wasn&#x2019;t global, so we were able to shift traffic around and reduce the impact.<br><br>One more interesting detail: Cloudflare&#x2019;s Bot Protection went haywire during the outage, and started to block legitimate traffic. 
So, our team had to turn that off temporarily&#x201D;.</blockquote><p>Thanks very much to Dhruv and the Downdetector team for sharing details.</p>]]></content:encoded></item><item><title><![CDATA[A startup in Mongolia translated my book]]></title><description><![CDATA[A 30-person startup called Nasha Tech translated The Software Engineer's Guidebook for the benefit of their company and the Mongolian tech ecosystem.]]></description><link>https://blog.pragmaticengineer.com/traveling-to-mongolia/</link><guid isPermaLink="false">69206cafc3b7150001d419bf</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Fri, 21 Nov 2025 13:47:17 GMT</pubDate><content:encoded><![CDATA[<p>I published <a href="https://www.engguidebook.com/?ref=blog.pragmaticengineer.com" rel="noreferrer">The Software Engineer&apos;s Guidebook</a> two years ago. <em> I shared more details on how I self-published the book, and the learnings from publishing </em><a href="https://newsletter.pragmaticengineer.com/p/the-software-engineers-guidebook?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>in this post.</em></a></p><p>An unexpected highlight of publishing the book was ending up in Mongolia in June of this year, at a small-but-mighty startup called <a href="https://nashatech.com/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Nasha Tech</a>. This was because the startup translated my book into Mongolian. 
Here&apos;s the completed book:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/Screenshot-2025-11-21-at-15.34.01.png" class="kg-image" alt loading="lazy" width="1078" height="1292" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/Screenshot-2025-11-21-at-15.34.01.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/Screenshot-2025-11-21-at-15.34.01.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/Screenshot-2025-11-21-at-15.34.01.png 1078w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The Software Engineer&apos;s Guidebook, in Mongolian. You can </span><a href="https://internom.mn/%D0%B1%D0%B0%D1%80%D0%B0%D0%B0/9789919053185-%D1%81%D0%BE%D1%84%D1%82%D0%B2%D1%8D%D0%B9%D1%80-%D0%B8%D0%BD%D0%B6%D0%B5%D0%BD%D0%B5%D1%80%D0%B8%D0%B9%D0%BD-%D1%85%D3%A9%D1%82%D3%A9%D1%87-%D0%BD%D0%BE%D0%BC?ref=blog.pragmaticengineer.com" rel="noreferrer"><span style="white-space: pre-wrap;">buy this translation here</span></a></figcaption></figure><p>Here&#x2019;s what happened:</p><p>A little over a year ago, a small startup from Mongolia reached out, asking if they could translate the book. I was skeptical it would happen because the unit economics appeared pretty unfavorable. Mongolia&#x2019;s population is 3.5 million; much smaller than other countries where professional publishers had offered to do a translation (Taiwan: 23M, South Korea: 51M, Germany: 84M, Japan: 122M, China: 1.43B people).</p><p>But I agreed to the initiative, and expected to hear nothing back. To my surprise, nine months later the translation was ready, and the startup printed 500 copies on the first run. 
They invited me to a book signing in the capital city of Ulaanbaatar, and soon I was on my way to meet the team, and to understand why a small tech company translated my book!</p><h3 id="japanese-startup-vibes-in-mongolia">Japanese startup vibes in Mongolia</h3><p>The startup behind the translation is called <a href="https://nashatech.com/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Nasha Tech</a>; a mix of a startup and a digital agency. Founded in 2018, its main business has been agency work, mainly for companies in Japan. They are a group of 30 people, mostly software engineers.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/image-1.png" class="kg-image" alt loading="lazy" width="1086" height="1264" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/image-1.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/image-1.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/image-1.png 1086w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Nasha Tech&#x2019;s offices in Ulaanbaatar, Mongolia</span></figcaption></figure><p>Their offices resembled a mansion more than a typical workplace, and everyone takes their shoes off when arriving at work and switches to &#x201C;office slippers&#x201D;. I encountered the same vibe later <a href="https://newsletter.pragmaticengineer.com/i/177384640/cursor-push-for-release?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">at Cursor&#x2019;s headquarters in San Francisco</a>, in the US.</p><p>Nasha Tech found a niche of working for Japanese companies thanks to one of its cofounders studying in Japan, and building up connections while there. Interestingly, another cofounder later moved to Silicon Valley, and advises the company from afar.</p><p><strong>The business builds the &#x201C;Uber Eats of Mongolia&#x201D;. 
</strong>Outside of working as an agency, Nasha Tech builds its own products. The most notable is called TokTok, the &#x201C;UberEats of Mongolia&#x201D;, which is the leading food delivery app in the capital city. The only difference between TokTok and other food delivery apps is scale: the local market is smaller than in some other cities. At a few thousand orders per day, it might not be worthwhile for an international player like Uber or Deliveroo to enter the market.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/image-2.png" class="kg-image" alt loading="lazy" width="1456" height="646" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/image-2.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/image-2.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/image-2.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">The </em></i><a href="https://www.toktok.mn/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">TokTok</em></i></a><i><em class="italic" style="white-space: pre-wrap;"> app: a customer base of 800K, 500 restaurants, and 400 delivery riders</em></i></figcaption></figure><p>The tech stack Nasha Tech typically uses:</p><ul><li>Frontend: React / Next, Vue / Nuxt, TypeScript, Electron, Tailwind, Element UI</li><li>Backend and API: NodeJS (Express, Hono, Deno, NestJS), Python (FastAPI, Flask), Ruby on Rails, PHP (Laravel), GraphQL, Socket, Recoil</li><li>Mobile: Flutter, React Native, Fastlane</li><li>Infra: AWS, GCP, Docker, Kubernetes, Terraform</li><li>AI &amp; ML: GCP Vertex, AWS Bedrock, Elasticsearch, LangChain, Langfuse</li></ul><p>AI tools are very much widespread, and today the team uses Cursor, GitHub Copilot, Claude Code, OpenAI Codex, and Junie by 
Jetbrains.</p><p><strong>I detected very few differences between Nasha Tech and other &#x201C;typical&#x201D; startups I&#x2019;ve visited, in terms of the vibe and tech stack. </strong>Devs working on TokTok were very passionate about how to improve the app and reduce the tech debt accumulated by prioritizing the launch. A difference for me was the language and target market: the main language in the office is, obviously, Mongolian, and the products they build like TokTok also target the Mongolian market, or the Japanese one when working with clients.</p><p>One thing I learned was that awareness about the latest tools has no borders: back in June, a dev at Nasha Tech was already telling me that Claude Code was their daily driver, even though the tool had been released for barely a month at that point!</p><h3 id="why-translate-the-book-into-mongolian">Why translate the book into Mongolian?</h3><p>Nasha Tech was the only non-book publisher to express interest in translating the book. But why did they do it?</p><p>I was told the idea came from software engineer <a href="https://x.com/ssuuribaatar?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Suuribaatar Sainjargal</a>, who bought and enjoyed the English-language version. He <a href="https://x.com/GergelyOrosz/status/1937160382600343964?s=20&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">suggested</a> translating the book so that everyone at the company could read it, not only those fluent in English.</p><p>Nasha Tech actually had some in-house experience of translation. 
A year earlier, in 2024, the company translated Matt Mochary&#x2019;s <a href="https://www.amazon.com/Great-CEO-Within-Tactical-Building-ebook/dp/B07ZLGQZYC?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">The Great CEO Within</a> as a way to uplevel their leadership team, and to help the broader Mongolian tech ecosystem.</p><p>Also, the company&#x2019;s General Manager, <a href="https://www.linkedin.com/in/battsengel/?originalSubdomain=mn&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Batutsengel Davaa</a>, happened to have been involved in translating more than 10 books in a previous role. He took the lead in organizing this work, and here&#x2019;s how the timelines played out:</p><ul><li>Professional translator: 3 months</li><li>Technical editor revising the draft translation: 1 month</li><li>Technical editing #2 by a Support Engineer in Japan: 2 months</li><li>Technical revision: 15 engineers at Nasha Tech revised the book, with a &#x201C;divide and conquer&#x201D; approach: 2 months</li><li>Final edit and print: 1 month</li></ul><p>This was a real team effort. Somehow, this startup managed to produce a high-quality translation in around the same time as it took professional book publishers in my part of the world to do the same!</p><p>A secondary goal that Nasha Tech had was to advance the tech ecosystem in Mongolia. There&#x2019;s understandably high demand for books in the mother tongue; I observed a number of book stands selling these books, and book fairs are also popular. 
The translation of my book has been selling well; you can <a href="https://internom.mn/%D0%B1%D0%B0%D1%80%D0%B0%D0%B0/9789919053185-%D1%81%D0%BE%D1%84%D1%82%D0%B2%D1%8D%D0%B9%D1%80-%D0%B8%D0%BD%D0%B6%D0%B5%D0%BD%D0%B5%D1%80%D0%B8%D0%B9%D0%BD-%D1%85%D3%A9%D1%82%D3%A9%D1%87-%D0%BD%D0%BE%D0%BC?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">buy the book</a> for 70,000 MNT (~$19).</p><h3 id="book-signing-and-the-mongolian-startup-scene">Book signing and the Mongolian startup scene</h3><p>The book launch event was at Mongolia&#x2019;s startup hub, called <a href="https://digitalnomad.itpark.mn/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">IT Park</a>, which offers space for startups to operate in. I met a few working in the AI and fintech spaces &#x2013; and even one startup producing comics.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/image-3.png" class="kg-image" alt loading="lazy" width="1378" height="1184" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/image-3.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/image-3.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/image-3.png 1378w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Book launch event, and meeting startups inside Mongolia&#x2019;s IT Park</span></figcaption></figure><p>I had the impression that the government and private sector are investing heavily in startups, and want to help more companies to become breakout success stories:</p><ul><li><a href="https://digitalnomad.itpark.mn/ds_in_mongolia?ref=blog.pragmaticengineer.com#ds" rel="noopener noreferrer nofollow">IT Park report</a>: the country&#x2019;s tech sector is growing ~20%, year-on-year. 
The <em>combined</em> valuation of all startups in Mongolia is $130M today.<em> It&#x2019;s worth remembering that location is important for startups: being in hubs like the US, UK, and India confers advantages that can be reflected in valuations.</em></li><li><a href="https://www.jica.go.jp/overseas/mongolia/sjp04ove1698/__icsFiles/afieldfile/2024/08/28/Summary.pdf?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Mongolian Startup Ecosystem Report 2023</a>: the average pre-seed valuation of a startup in Mongolia is $170K, seed valuation at $330K, and Series A valuation at $870K. The numbers reflect market size; for savvy investors, this could also be an opportunity to invest early. I met a Staff Software Engineer at the book signing event who is working in Silicon Valley at Google, and invests in and advises startups in Mongolia.</li><li><a href="https://drive.google.com/file/d/1Ath-eOMd4Kr924cq1AkgLekfeJlXCBfd/view?usp=sharing&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Mongolian startup ecosystem Map</a>: better-known startups in the country.</li></ul><p>Two promising startups from Mongolia: <a href="https://chimege.com/en/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Chimege</a> (an AI+voice startup) and <a href="https://and.global/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">AND Global</a> (fintech). Thanks very much to the <a href="https://nashatech.com/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Nasha Tech team</a> for translating the book &#x2013; keep up the great work!</p>]]></content:encoded></item><item><title><![CDATA[The Pulse: Cloudflare takes down half the internet – but shares a great postmortem]]></title><description><![CDATA[A database permissions change ended up knocking Cloudflare’s proxy offline. Pinpointing the root cause was tricky – but Cloudflare shared a detailed postmortem. 
Also: announcing The Pragmatic Summit]]></description><link>https://blog.pragmaticengineer.com/the-pulse-cloudflare-takes-down-half-the-internet/</link><guid isPermaLink="false">691f7b63e9904f00015006db</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 20 Nov 2025 20:36:19 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of five topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com"><em>this week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the below article seven days ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em>subscribe here</em></a><em>.</em></p><p><em>Many subscribers expense this newsletter to their learning and development budget. If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/"><em> an email you could send to your manager</em></a><em>.</em></p><hr><p>Before we start: I&#x2019;m excited to share something new: <strong>The Pragmatic Summit.</strong></p><p>Four years ago, The Pragmatic Engineer started as a small newsletter: me writing about topics relevant for engineers and engineering leaders at Big Tech and startups. 
Fast forward to today, and the newsletter <a href="https://newsletter.pragmaticengineer.com/p/one-million?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">crossed one million readers</a>, and the publication expanded with <a href="https://newsletter.pragmaticengineer.com/podcast?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">a podcast</a> as well.</p><p>One thing that was always missing: meeting in person. Engineers, leaders, founders&#x2014;people who want to meet others in this community, and learn from each other. Until now that is:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/TPS_Social_RegLive_1200x627_110625.png" class="kg-image" alt loading="lazy" width="1200" height="627" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/TPS_Social_RegLive_1200x627_110625.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/TPS_Social_RegLive_1200x627_110625.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/TPS_Social_RegLive_1200x627_110625.png 1200w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The Pragmatic Summit. </span><a href="https://www.pragmaticsummit.com/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><span style="white-space: pre-wrap;">See more details and apply to attend</span></a></figcaption></figure><p>In partnership with <a href="http://statsig.com/pragmatic?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Statsig</a>, I&#x2019;m hosting the first-ever <a href="https://www.pragmaticsummit.com/?utm_source=the-pragmatic-engineer&amp;utm_medium=newsletter&amp;utm_campaign=nov-20-paid-edition" rel="noopener noreferrer nofollow"><strong>Pragmatic Summit</strong></a>. 
Seats are limited, and tickets are priced at $499, covering the venue, meals, and production&#x2014;we&#x2019;re not aiming to make any profit from this event.</p><p><a href="https://www.pragmaticsummit.com/?ref=blog.pragmaticengineer.com">Apply to attend the Summit</a></p><p>I hope to see many of you there!</p><hr><h2 id="cloudflare-takes-down-half-the-internet-%E2%80%93-but-shares-a-great-postmortem">Cloudflare takes down half the internet &#x2013; but shares a great postmortem</h2><p>On Tuesday came another reminder about how much of the internet depends on Cloudflare&#x2019;s content delivery network (CDN), when thousands of sites went fully or partially offline in an outage that lasted 6 hours. Some of the higher-profile victims included:</p><ul><li>ChatGPT and Claude</li><li>Canva, Dropbox, Spotify,</li><li>Uber, Coinbase, Zoom</li><li>X and Reddit</li></ul><p>Separately, you may or may not recall that during a different recent outage caused by AWS, Elon Musk noted on his website, X, that AWS is a hard dependency for Signal, meaning an AWS outage could take down the secure messaging service at any moment. In response, a dev pointed out that it is the same for X with Cloudflare &#x2013; and so it proved earlier this week, when X was broken by the Cloudflare outage.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!IN2n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a9cfc94-1792-4a5e-8fb6-c1815df54ff0_1072x898.png" class="kg-image" alt loading="lazy" width="1072" height="898"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Predicting the future. 
Source: Mehul Mohan </em></i><a href="https://x.com/mehulmpt/status/1980382080602370144?s=20&amp;ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">on X</em></i></a></figcaption></figure><p>That AWS outage was in the company&#x2019;s us-east-1 region and <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-aws-takes-down-a-good-part?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">took down a good part of the internet</a> last month. AWS released incident details three days later &#x2013; unusually speedy for the e-commerce giant &#x2013; although that postmortem was high-level and we never learned <em>exactly</em> what caused AWS&#x2019;s <a href="https://newsletter.pragmaticengineer.com/i/176934094/how-dynamodb-dns-management-happens?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">DNS Enactor</a> service to slow down, triggering an unexpected race condition that kicked off the outage.</p><h3 id="what-happened-this-time-with-cloudflare">What happened this time with Cloudflare?</h3><p>Within hours of mitigating the outage, Cloudflare&#x2019;s CEO Matthew Prince shared an <a href="https://blog.cloudflare.com/18-november-2025-outage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">unusually detailed report </a>of what exactly went wrong. The root cause was to do with propagating a configuration file to Cloudflare&#x2019;s Bot Management module. The file crashed Bot Management, which took Cloudflare&#x2019;s proxy functionality offline.</p><p>Here&#x2019;s a brief overview of how Cloudflare&#x2019;s proxy layer works at a high level. 
It&#x2019;s the layer that protects the &#x201C;origin&#x201D; resources of customers &#x2013; minimizing network traffic to them by blocking malicious requests and caching static resources in Cloudflare&#x2019;s CDN:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!esOT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F132ad7a8-2c1d-4be1-8174-295941979ceb_1420x1312.png" class="kg-image" alt loading="lazy" width="1420" height="1312"><figcaption><i><em class="italic" style="white-space: pre-wrap;">How Cloudflare&#x2019;s proxy works. More details on </em></i><a href="https://blog.cloudflare.com/20-percent-internet-upgrade/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Cloudflare&#x2019;s engineering blog</em></i></a></figcaption></figure><p>Here&#x2019;s how the incident unfolded:</p><p><strong>A database permissions change in </strong><a href="https://en.wikipedia.org/wiki/ClickHouse?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>ClickHouse</strong></a><strong> kicked things off. 
</strong>Before the permissions changed, all queries to fetch feature metadata (to be used by the Bot Management module) would have only been run on distributed tables in ClickHouse, in a database called &#x201C;default&#x201D; which contains 60 features.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!NEwO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f62c0a-5772-45a3-9be1-24e7c15c4e7b_1264x264.png" class="kg-image" alt loading="lazy" width="1264" height="264"><figcaption><span style="white-space: pre-wrap;">Before the permissions change: about 60 features were returned and fed to the Bot Management module</span></figcaption></figure><p>Until then, these queries were running using a shared system account. Cloudflare&#x2019;s engineering team wanted to improve system security and reliability, and move from this shared system account to individual user accounts. User accounts already had access to another database called &#x201C;r0&#x201D;, so the team changed the database permissions to make access to r0 <em>explicit</em> instead of implicit.</p><p>As a side effect of this, the same query collecting the features to be passed to Bot Management started to fetch from the r0 database, and return many more features than expected:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!p5bm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e62f91e-7078-4b9d-8e2f-3b3fb357aef5_1220x252.png" class="kg-image" alt loading="lazy" width="1220" height="252"><figcaption><span style="white-space: pre-wrap;">After the permissions change: the query did not change but returned twice as many results</span></figcaption></figure><p><strong>The Bot Management module does not allow loading of more than 200 features. 
</strong>This limit was well above the production usage of 60, and was put in place for performance reasons: the Bot Management module pre-allocates memory for up to 200 features, and it will not operate with more than this number.</p><p><strong>A </strong><a href="https://en.wikipedia.org/wiki/Kernel_panic?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>system panic</strong></a><strong> hit machines served with the incorrect feature file. </strong>Cloudflare was nice enough to share the exact code that caused this panic, which was this unwrap() function:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!qih4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8462b639-2c4c-4c8d-91b2-a468f97d7ee4_1606x666.png" class="kg-image" alt loading="lazy" width="1456" height="604"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Source: </em></i><a href="https://blog.cloudflare.com/18-november-2025-outage/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Cloudflare</em></i></a></figcaption></figure><p>What likely happened:</p><ul><li>The append_with_names() function likely checked for a limit of 200 features</li><li>If it saw more than 200 features, it likely returned an error</li><li>&#x2026; and when writing the code, it was not expected that append_with_names() would return an error&#x2026;</li><li>&#x2026; and so .unwrap() panicked and crashed the system!</li></ul><p><strong>Edge nodes started to crash, one by one, seemingly randomly. </strong>The feature file was being generated every 5 minutes, and gradually rolled out to Edge nodes. So, initially, it was only a few nodes that crashed, and then over time, more became non-responsive. 
At one point, both good and bad configuration files were being distributed, making failed nodes that received the good configuration file start working &#x2013; for a while!</p><h3 id="why-so-long-to-find-the-root-cause">Why so long to find the root cause?</h3><p>It took Cloudflare engineers unusually long &#x2013; 2.5 hours! &#x2013; to figure all this out, and that an incorrect configuration file propagating to Edge servers was to blame for their proxy going down. Turns out, an unrelated failure made the Cloudflare team suspect that they were under a coordinated botnet attack, as when a few of the Edge nodes started to go offline, the company&#x2019;s status page did, too:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!Xa8F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F565ff3fa-112f-4500-940a-4f3f241991fd_1999x478.png" class="kg-image" alt loading="lazy" width="1456" height="348"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Cloudflare&#x2019;s status page went offline when the outage started. Source: </em></i><a href="https://blog.cloudflare.com/18-november-2025-outage/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Cloudflare</em></i></a></figcaption></figure><p>The team tried to gather details about the attack, but there was no attack, meaning they wasted time looking in the wrong place. In reality, the status page going down was a coincidence and unrelated to the outage. 
But it&#x2019;s easy to see why their first reaction was to figure out if there was a distributed denial of service (DDoS) attack.</p><p>As mentioned, it eventually took 2.5 hours to pinpoint the incorrect configuration files as the source of the outage, and another hour to stop the propagation of new files, and create a new and correct file, which was deployed 3.5 hours after the start of the incident. Cleanup took another 2.5 hours, and at 17:06 UTC, the outage was resolved, ~6 hours after it started.</p><p>Cloudflare shared a detailed review of the incident and learnings, which can be <a href="https://blog.cloudflare.com/18-november-2025-outage/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">read here.</a></p><h3 id="how-did-the-postmortem-come-so-fast">How did the postmortem come so fast?</h3><p>One thing that remains surprising about Cloudflare is how they have a very detailed postmortem up less than 24 hours after an incident is resolved. Cofounder and CEO Matthew Prince <a href="https://news.ycombinator.com/user?id=eastdakota&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">explained</a> how this was possible:</p><ul><li>Matthew was part of the outage call.</li><li>After the outage was resolved, he wrote a first version of the incident review, at home. 
Matthew was in Lisbon, in Cloudflare&#x2019;s European HQ, so this was early evening</li><li>The team circulated a Google Doc with this initial writeup, and questions that needed to be reviewed</li><li>In a few hours, all questions were answered</li><li>Matthew: &#x201C;None of us were happy [about the incident] &#x2014; we were embarrassed by what had happened &#x2014; but we declared it [the postmortem] true and accurate.&#x201D;</li><li>Sent the draft over to the SF team, who did one more sweep, then posted it</li></ul><p>Talk about moving with the speed of a startup, despite being a publicly traded company!</p><h3 id="learnings">Learnings</h3><p>There is much to learn from this incident, such as:</p><p><strong>Be explicit about logging errors when you raise them! </strong>Cloudflare could probably have identified the root cause of this error much faster if the line of code that returned an error also logged the error, and if Cloudflare had alerts set up for when certain errors spiked on its nodes. This could surely have shaved an hour or two off the time it took to mitigate.</p><p>Of course, logging errors before throwing them is extra work, but when done with monitoring or log analysis, it can help find the source of errors much faster.</p><p><strong>Global database changes are always risky. </strong>You never know what part of the system you might hit. The incident started with a seemingly innocuous database permissions change that impacted a wide range of queries. Unfortunately, there is no good way to test the impact of such changes (if you know one, please leave a comment below!)</p><p>Cloudflare was making the right kind of change by removing global system accounts; it&#x2019;s a good direction to go in for security and reliability. It was extremely hard to predict the change would end up taking down a part of their system &#x2013; and the web.</p><p><strong>Two things going wrong at the same time can really throw an engineering team. 
</strong>If Cloudflare&#x2019;s status page had not gone offline, the engineering team would surely have pinpointed the problem much faster than they did. But in the heat of the moment, it&#x2019;s easy to assume that two small outages are connected, until there&#x2019;s evidence that they&#x2019;re not. Cloudflare is a service that&#x2019;s continuously under attack, so the engineering team can&#x2019;t be blamed for assuming it might be more of the same.</p><p><strong>CDNs are the backbone of the internet, and this outage doesn&#x2019;t change that. </strong>The outage hit lots of large businesses, resulting in lost revenue for many. But could affected companies have prepared better for Cloudflare going down?</p><p>The problem is that this is hard: using a CDN means taking on a <em>hard</em> dependency in order to reduce traffic on your own servers (the origin servers), while serving internet users faster and more cheaply:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!54wJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dca2f86-18b2-4ba8-8fd2-bc7236b330db_1194x280.png" class="kg-image" alt loading="lazy" width="1194" height="280"><figcaption><span style="white-space: pre-wrap;">A CDN is a common way to reduce traffic to servers and serve webpages and APIs faster to users</span></figcaption></figure><p>When using a CDN, you propagate addresses that point to that CDN server&#x2019;s IP or domain. 
When the CDN goes down, you could start to redirect traffic to your own origin servers (and deal with the traffic spike), or utilize a backup CDN, if you prepared for this eventuality.</p><figure class="kg-card kg-image-card"><img src="https://substackcdn.com/image/fetch/$s_!fj68!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80ef266a-4a28-429b-9d01-52a34e03eae0_1248x774.png" class="kg-image" alt loading="lazy" width="1248" height="774"></figure><p>Both these are expensive to pull off:</p><ul><li>Redirecting to the origin servers likely means needing to suddenly scale up backend infrastructure</li><li>Having a backup CDN means there must be a contract and payment for a CDN partner which will most likely sit idle. As and when it is needed, you must switch over and warm up their cache: it&#x2019;s a lot of effort and money to do this!</li></ul><p>A case study in the trickiness of dealing with a CDN going offline is the story of Downdetector, including inside details on why Downdetector went down during Cloudflare&#x2019;s latest outage, and what they learned from it.</p><hr><p><em>This was one out of the five topics covered in this week&#x2019;s The Pulse. </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>The full edition</em></a><em> additionally covers:</em></p><ol><li><strong>Downdetector &amp; the real cost of no upstream dependencies.</strong> During the Cloudflare outage, Downdetector was also unavailable. I got details from the team about why they have a hard dependency on Cloudflare, and why that won&#x2019;t change anytime soon.</li><li><strong>Antigravity: Google&#x2019;s new AI IDE &#x2013; that its devs cannot use. </strong>Google wants to become a serious player in AI coding tools, but Antigravity contains remnants of Windsurf. 
Interestingly, devs at Google aren&#x2019;t allowed to use Antigravity for work</li><li><strong>Industry pulse.</strong> Gemini 3 launch, Anthropic valued at $350B, Jeff Bezos funds an AI company, and unusually slow headcount growth at startups persists.</li><li><strong>Five AI fakers caught in 1 month by crypto startup. </strong>Candidates who fake their backgrounds and change their looks in remote interviews continue to plague companies hiring full-remote &#x2013; especially crypto startups.</li></ol><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-154?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>Read the full The Pulse</strong></a></p>]]></content:encoded></item><item><title><![CDATA[Four years on writing a tech book: pitching to a publisher]]></title><description><![CDATA[<p>In 2019, I decided to write a book about software engineering. As an experienced software engineer and manager, I had the topic clear in my head, and assumed the whole project would take between six and 12 months in writing and publishing it.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/image.png" class="kg-image" alt loading="lazy" width="1456" height="866" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/image.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/image.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/image.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The first proof copy of</span><a href="https://www.engguidebook.com/?ref=blog.pragmaticengineer.com" rel="noreferrer"><span style="white-space: pre-wrap;"> The Software</span></a></figcaption></figure>]]></description><link>https://blog.pragmaticengineer.com/four-years-on-writing-a-tech-book-pitching-to-a-publisher/</link><guid 
isPermaLink="false">69130a5abb6a4e00013466cc</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Tue, 11 Nov 2025 10:45:43 GMT</pubDate><content:encoded><![CDATA[<p>In 2019, I decided to write a book about software engineering. As an experienced software engineer and manager, I had the topic clear in my head, and assumed the whole project would take between six and 12 months in writing and publishing it.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/image.png" class="kg-image" alt loading="lazy" width="1456" height="866" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/image.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/image.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/image.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The first proof copy of</span><a href="https://www.engguidebook.com/?ref=blog.pragmaticengineer.com" rel="noreferrer"><span style="white-space: pre-wrap;"> The Software Engineer&#x2019;s Guidebook</span></a><span style="white-space: pre-wrap;"> &#x2013; hence the &#x201C;not for resale&#x201D; markup</span></figcaption></figure><p>In the end, this process took several times longer; 4 years, in fact! Happily, it was worth it: readers&#x2019; feedback about <a href="https://www.engguidebook.com/?ref=blog.pragmaticengineer.com"><strong><u>The Software Engineer&#x2019;s Guidebook</u></strong></a> has been overwhelmingly positive, and on launch, the book became a <a href="https://twitter.com/GergelyOrosz/status/1723205530481729838?ref=blog.pragmaticengineer.com"><u>#1 bestseller</u></a> among all titles in two Amazon markets (the Netherlands and Poland), as well as a top 100-selling book in most Amazon markets. 
In 24 months it sold around 40,000 copies, and was translated into <a href="https://learning.oreilly.com/library/view/guidebook-fur-software/9783960092513/?ref=blog.pragmaticengineer.com"><u>German</u></a>, <a href="https://www.hanbit.co.kr/store/books/look.php?p_code=B2570473158&amp;ref=blog.pragmaticengineer.com"><u>Korean</u></a>, <a href="https://x.com/GergelyOrosz/status/1936044091009036690?ref=blog.pragmaticengineer.com"><u>Mongolian</u></a> and <a href="https://x.com/GergelyOrosz/status/1973632590541365384?ref=blog.pragmaticengineer.com"><u>Traditional Chinese</u></a> &#x2013; with the Japanese and Simplified Chinese versions releasing later this month.</p><p>A lot of people ask why I chose to self-publish, and it would be nice to say this was always the goal, but it wasn&#x2019;t! Originally, I wanted to work with a top tech publisher, who would get the book to market fast, and give it a higher profile. This didn&#x2019;t happen, but during the process I learned a lot about how publishing works, how to pitch a book, and how to choose which publishing route might be the right one.&#xA0;</p><p>This article shares my learnings from writing and publishing a book which has done pretty well with readers, including the experience of working with an established publishing house:</p><ol><li>Tech book publishing landscape</li><li>Financials of publishing</li><li>Publishing process and the publisher&#x2019;s role</li><li>My book pitch</li><li>Working with a publisher</li><li>Breaking up with a publisher</li></ol><h2 id="1-tech-book-publishing-landscape">1. Tech book publishing landscape</h2><p>Today, there are reputable book publishers whose titles are good and authoritative, and there are other publishers to whom this doesn&#x2019;t apply. Each publisher also has a subject area: some are mainstream and publish titles about every software engineering area from languages to engineering management. 
Meanwhile, others stick to a topic of expertise they focus on.</p><p>Here&#x2019;s my mental model of the book publishing industry in 2025:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/data-src-image-0145e3b2-dbc1-475d-93bb-160c7e3a3fbe.png" class="kg-image" alt loading="lazy" width="1600" height="1356" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/data-src-image-0145e3b2-dbc1-475d-93bb-160c7e3a3fbe.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/data-src-image-0145e3b2-dbc1-475d-93bb-160c7e3a3fbe.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/data-src-image-0145e3b2-dbc1-475d-93bb-160c7e3a3fbe.png 1600w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Biggest players in the tech book publishing industry, a subjective mental model of course!</em></i></figcaption></figure><h4 id="highly-reputable-mainstream-publishers">Highly reputable mainstream publishers</h4><p>In tech book publishing, three publishing houses really stand out, in my opinion, and form a &#x2018;big three&#x2019; among all players in this sector:&#xA0;</p><ul><li><a href="https://oreilly.com/?ref=blog.pragmaticengineer.com"><strong><u>O&#x2019;Reilly</u></strong></a>: if I had to pick a #1 tech book publisher, it would be O&#x2019;Reilly. They publish some of the most referenced books &#x2013; like Designing Data Intensive Applications by Martin Kleppmann, <a href="https://newsletter.pragmaticengineer.com/p/dead-code-getting-untangled-and-coupling?ref=blog.pragmaticengineer.com"><u>Tidy First</u></a> by Kent Beck, <a href="https://newsletter.pragmaticengineer.com/p/the-staff-engineers-path?ref=blog.pragmaticengineer.com"><u>The Staff Engineer&#x2019;s Path</u></a> by Tanya Reilly, and more. 
The book covers are distinctive, using images of animals.</li><li><a href="https://www.manning.com/?ref=blog.pragmaticengineer.com"><strong><u>Manning</u></strong></a>: a broad range of titles on both specific and general tech topics, which feature historical figures on the covers.</li><li><a href="https://pragprog.com/?ref=blog.pragmaticengineer.com"><strong><u>The Pragmatic Bookshelf</u></strong></a>: also referred to as the &#x201C;Prags.&#x201D; Founded by Andy Hunt and Dave Thomas, the authors of what might be the best-selling tech book ever: The Pragmatic Programmer. Since its founding, The Prags has refused digital rights management (DRM) on their ebooks.</li></ul><h4 id="high-reputable-%E2%80%9Cmainstream%E2%80%9D-publishers-that-are-tough-to-pitch-to">Highly reputable &#x201C;mainstream&#x201D; publishers that are tough to pitch to</h4><p>The publishers in this section have strong reputations, like those above. However, they are harder to pitch to, usually because they publish fewer tech books. I couldn&#x2019;t find an author pitch template or clear pitching instructions, which contributes to a sense of &#x201C;don&#x2019;t find us, we&#x2019;ll find you&#x201D; among the following publishing houses:&#xA0;</p><ul><li><a href="https://en.wikipedia.org/wiki/Addison-Wesley?ref=blog.pragmaticengineer.com"><strong><u>Addison-Wesley:</u></strong></a> one of the best-known brands in tech. It has been an imprint (a trade name a publisher uses for some of its titles) of Pearson since 1988, and is the publisher of many &#x201C;classic&#x201D; book titles like Clean Code by Robert C. Martin, The Pragmatic Programmer by Andy Hunt and Dave Thomas, and some recent ones like Modern Software Engineering by Dave Farley. I couldn&#x2019;t find any way to pitch to this publisher, and new books they publish seem to be by established authors.</li><li><a href="https://www.pearson.com/?ref=blog.pragmaticengineer.com"><strong><u>Pearson</u></strong></a>: This business owns the Addison-Wesley imprint. 
Recently, it started to publish tech books as &#x201C;Pearson&#x201D; instead, as author Martin Fowler <a href="https://twitter.com/martinfowler/status/1766836423808766003?ref=blog.pragmaticengineer.com"><u>shared</u></a>.</li><li><a href="https://www.wiley.com/en-us?ref=blog.pragmaticengineer.com"><strong><u>Wiley</u></strong></a>: formerly a well-known tech book publisher behind the &#x201C;X for Dummies&#x201D; series. It publishes lots of <a href="https://www.wiley.com/en-nl/etextbooks-and-courseware/computer-science-and-technology?ref=blog.pragmaticengineer.com"><u>computer science textbooks</u></a>, but I can&#x2019;t find recently-published, well-known <em>tech books </em>for software engineers.</li><li><a href="https://www.springer.com/gp?ref=blog.pragmaticengineer.com"><strong><u>Springer</u></strong></a>: another massive publisher for whom tech books are a small part of the business. I couldn&#x2019;t find how to pitch tech books to them.</li><li><a href="https://booksite.mkp.com/?ref=blog.pragmaticengineer.com"><strong><u>Morgan Kaufmann</u></strong></a>: a well-known tech books publisher founded in 1984, and acquired in 2001 by Elsevier. As I understand it, these days it prints far fewer technology books, and focuses on academic topics. 
No clear way to pitch to them.</li></ul><h4 id="highly-reputable-%E2%80%9Cniche%E2%80%9D-publishers">Highly reputable &#x201C;niche&#x201D; publishers</h4><p>The following publishers are standout in quality, covering fewer topics than those above.</p><ul><li><a href="https://nostarch.com/?ref=blog.pragmaticengineer.com"><strong><u>No Starch Press</u></strong></a>: &#x201C;The finest in geek entertainment&#x201D; is the tagline, featuring fun visuals, and high-quality content on specific technologies like machine learning, Python, JavaScript, etc.</li><li><a href="https://itrevolution.com/?ref=blog.pragmaticengineer.com"><strong><u>IT Revolution</u></strong></a>: titles for technology leaders: DevOps, technology delivery, workplace culture, and similar. Publisher of The Phoenix Project, Team Topologies, and Accelerate.</li><li><a href="https://www.artima.com/books?ref=blog.pragmaticengineer.com"><strong><u>Artima</u></strong></a>: focuses on Scala.</li><li><a href="https://www.routledge.com/go/crc-press?ref=blog.pragmaticengineer.com"><strong><u>CRC Press</u></strong></a>: publishes on technology, engineering, math, and medicine.</li><li><a href="https://press.stripe.com/?ref=blog.pragmaticengineer.com"><strong><u>Stripe Press</u></strong></a>: &#x201C;works about technological, economic, and scientific advancement.&#x201D;</li><li><a href="https://mitpress.mit.edu/?ref=blog.pragmaticengineer.com"><strong><u>MIT Press</u></strong></a>: &#x201C;a distinctive collection of influential books curated for scholars and libraries worldwide.&#x201D;</li></ul><h4 id="other-mainstream-book-publishers">Other mainstream book publishers</h4><p><a href="https://www.apress.com/?ref=blog.pragmaticengineer.com"><strong><u>Apress</u></strong></a> is a reputable publisher with a lower profile, which publishes on a wide range of topics, from specific technologies and frameworks, to more generic topics on computing. 
Because they publish many books on many topics, they are usually open to pitches.</p><p><a href="https://www.packtpub.com/?ref=blog.pragmaticengineer.com"><strong><u>Packt</u></strong></a>. A tech book publisher that, it feels to me, focuses on quantity over quality. There is limited support and feedback for authors, and titles could often use more editing. But also, Packt is likely to say &#x201C;yes&#x201D; to a serious proposal.</p><h2 id="2-financials-of-publishing">2. Financials of publishing</h2><p>Financial matters really come into play when your proposal is accepted by a publisher and you receive a contract offer.</p><p><strong>Advance: $2,000 &#x2013; $5,000. </strong>An advance payment to the writer is a tried and tested way to make them deliver a completed manuscript. It&#x2019;s often paid in chunks: 50% when a milestone is hit, and 50% when a full draft appears.</p><p>The &#x201C;big three&#x201D; publishers typically offer $5,000, usually as a flat, non-negotiable rate; at least, it&#x2019;s what I was offered. Smaller publishers offer closer to $2,000 for more niche books. The advance is non-refundable; even if your book sells zero copies, you keep it. The publisher is making an investment in you, and taking a risk.</p><p><em>As an aside, if you are thinking of writing a book: I offer guest authors in The Pragmatic Engineer Newsletter a $4,000 per-article payment &#x2013; and you can later publish your guest article in a book. Several authors working on their books have written guest articles, such as Lou Franco on </em><a href="https://newsletter.pragmaticengineer.com/p/paying-down-tech-debt?ref=blog.pragmaticengineer.com"><em><u>Paying down tech debt</u></em></a><em> or Apurva Chitnis on </em><a href="https://newsletter.pragmaticengineer.com/p/thriving-as-a-founding-engineer?ref=blog.pragmaticengineer.com"><em><u>Thriving as a founding engineer</u></em></a><em>. 
Writing a guest post can help refine ideas, broaden your reach, and prove helpful when publishing the article.</em></p><h4 id="paperback-royalty-7-15"><strong>Paperback royalty:</strong> 7-15%&#xA0;</h4><p>Royalties are earned on book sales, and taken from the net price of the book. Net price is what a publisher gets after the retailer (e.g. Amazon, or a bookshop) takes their cut. Let&#x2019;s see how it works for a $40 book:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/Screenshot-2025-11-11-at-11.11.21.png" class="kg-image" alt loading="lazy" width="1396" height="1080" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/Screenshot-2025-11-11-at-11.11.21.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/Screenshot-2025-11-11-at-11.11.21.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/Screenshot-2025-11-11-at-11.11.21.png 1396w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">The royalty from a $40 book that has a 10% royalty can be anywhere from $4 to around $1.80, depending on the channel it was sold on. It all depends on how much revenue the publisher received after the sale.</em></i></figcaption></figure><p>It matters financially where your title is purchased; be it an online shop, physical book store, or purchased directly from the publisher. Many tech books are sold on Amazon and online stores. Amazon&#x2019;s 40% cut seems high, but it&#x2019;s actually the lowest among book retailers. Up to 60% is a common cut for a physical bookshop.</p><p>Most publishers offer 10-12.5% royalties, is my understanding, and Packt around 15-20%. 
Keep in mind that brand reputation plays a role; for example, Packt&#x2019;s reputation is less elevated than Manning&#x2019;s, which can make a difference to sales.</p><h4 id="ebook-royalties-10-25">Ebook royalties: 10-25%</h4><p>For ebooks, several publishers pay 25% royalties, but not all. However, even with a higher royalty rate, an author might end up making less per sale. For example, on the Kindle platform, the cut for Amazon is high at 65%. Let&#x2019;s look at a $30 ebook with a 20% royalty rate:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/Screenshot-2025-11-11-at-11.12.30.png" class="kg-image" alt loading="lazy" width="1412" height="914" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/Screenshot-2025-11-11-at-11.12.30.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/Screenshot-2025-11-11-at-11.12.30.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/Screenshot-2025-11-11-at-11.12.30.png 1412w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Ebooks are cheaper, but authors can earn more with this royalty structure. Selling the Kindle version is the least profitable because Amazon takes 65% of any sale above $10</em></i></figcaption></figure><p>Ebooks are almost always priced lower than physical books. When sold on Kindle, they generate much less revenue for the author; through other channels, they can earn the author more per copy than the paperback version. <em>I was offered 10% royalties on ebook sales, which is at the low end.</em></p><h4 id="%E2%80%9Cearning-out%E2%80%9D">&#x201C;Earning out&#x201D;&#xA0;</h4><p>When an author&#x2019;s royalties need to cover the advance before any further payment is made, this is called &#x201C;earning out&#x201D;. 
If you get a $5,000 advance for a title costing $40 per hard copy and $25 for the ebook version, and most sales happen on Amazon, earning out takes:</p><ul><li>~2,080 paperback sales on Amazon</li><li>Or ~2,850 Kindle book sales</li><li>Or ~1,250 paperback sales on the publisher website</li></ul><p>The author needs to sell well over 1,000 copies across various platforms to &#x201C;earn out.&#x201D; The good news is that a publisher sends quarterly or annual royalty payments if a book keeps generating revenue, which would effectively be passive income.</p><h4 id="the-prags%E2%80%99-unique-approach">The Prags&#x2019; unique approach</h4><p>One publisher that calculates rates differently is The Pragmatic Bookshelf. Instead of offering a low percentage of <em>revenue</em>, they offer a 50% split on <em>profit</em>.</p><p>50% on profit sounds much higher than 10% on revenue, right? However, the devil is in the details, because paying on profit means that the upfront publisher costs &#x2013; editors, cover design, printing, distribution, marketing &#x2013; are all deducted before any profit split.</p><p>Authors who have used this approach tell me the numbers end up pretty similar to the revenue model.</p><h4 id="real-world-case-studies-with-actual-earnings">Real-world case studies with actual earnings</h4><p><a href="https://martin.kleppmann.com/2020/09/29/is-book-writing-worth-it.html?ref=blog.pragmaticengineer.com"><u>Designing Data Intensive Applications</u></a> author, Martin Kleppmann, shared the cumulative royalties he made in 6 years. 
The breakdown is interesting; ebook and Safari Online sales generated more revenue for the writer than the print version.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/data-src-image-d53acdad-d884-432a-b9d0-75cb7ea68141.png" class="kg-image" alt loading="lazy" width="1400" height="800" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/data-src-image-d53acdad-d884-432a-b9d0-75cb7ea68141.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/data-src-image-d53acdad-d884-432a-b9d0-75cb7ea68141.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/data-src-image-d53acdad-d884-432a-b9d0-75cb7ea68141.png 1400w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Cumulative royalties for Designing Data Intensive Applications, published by O&#x2019;Reilly. Image source: </em></i><a href="https://martin.kleppmann.com/2020/09/29/is-book-writing-worth-it.html?ref=blog.pragmaticengineer.com"><u><i><em class="italic underline" style="white-space: pre-wrap;">Martin Kleppmann&#x2019;s site</em></i></u></a></figcaption></figure><p><a href="https://rothgar.medium.com/the-economics-of-writing-a-technical-book-689d0c12fe39?ref=blog.pragmaticengineer.com"><u>Cloud Native Infrastructure earnings</u></a>: author Justin Garrison published with O&#x2019;Reilly, and was offered 10% for print and 25% for ebooks (split in half, as he worked with a coauthor). His book sold 1,337 copies in 4 months, and made about $22,000 for the two authors (around $11,000 for Justin). Justin concluded:</p><blockquote>&#x201C;Going into this project I had a rough estimate in my head to make about $2000&#x2013;3000 so this is much better than I expected. 
Set your expectations accordingly.&#x201D;</blockquote><p><strong>Don&#x2019;t forget that publishers are also in this to make a positive return.</strong> This means that it is unlikely for a highly reputable publisher to invest in a book that they do not believe would sell at least a few thousand copies. I don&#x2019;t have the data here, but if I were a publisher, I would reject any book that didn&#x2019;t look like it could hit 1,000 copies sold in the first year of publishing.</p><h2 id="3-the-publishing-process-and-publisher-roles">3. The publishing process, and publisher roles</h2><p>Why does a publisher take so much of the revenue? Part of this is because they do a lot of the work around publishing, and need to hire (and pay!) people for those roles. Here is my understanding of how the publishing process works, based on four months of pitching to publishers; two months of working with one of them; and researching how the rest of the process works:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/data-src-image-c9be774c-598b-4bac-81a7-3562aefcf63a.png" class="kg-image" alt loading="lazy" width="1320" height="1568" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/data-src-image-c9be774c-598b-4bac-81a7-3562aefcf63a.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/data-src-image-c9be774c-598b-4bac-81a7-3562aefcf63a.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/data-src-image-c9be774c-598b-4bac-81a7-3562aefcf63a.png 1320w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">My understanding of the publishing process, when working with a publisher. You probably get to work with quite a few specialized folks!</em></i></figcaption></figure><p>Here are people I worked with, and my experience with them:</p><p><strong>The acquisitions editor. 
</strong>If you write a technical blog, you might get a reachout from someone called an acquisitions editor, who will ask if you would consider publishing a book. Also, when you submit a pitch to a publisher, chances are that you will first communicate with an acquisitions editor.</p><p>A publisher&#x2019;s goal is to publish books that will be profitable for them. They find authors who could write these books two ways:</p><ul><li>Inbound pitches coming from authors &#x2013; reviewed by editors or acquisitions editors</li><li>External reachouts done by acquisitions editors</li></ul><p>These people need to have a good understanding of what kinds of books sell well at the publisher (and why); what their current catalogue is; what the gaps are; and what competitor publishers are commissioning.</p><p>When I pitched my book to 3 respected publishers, in two cases I talked with (and worked with) the acquisitions editor to improve my pitch. The acquisitions editors were my &#x201C;champions&#x201D; at the publisher. Their goal was to get a pitch that the company <em>would</em> say yes to.</p><p><strong>The development editor </strong>works on the <em>structure</em> of the book. They ask the author to come up with a detailed table of contents &#x2013; in my case, they asked me to estimate even the length of the chapters. They also help develop &#x2013; and maintain &#x2013; the narrative of the book.</p><p>Had I not worked with a publisher, I would have had no appreciation of this &#x201C;high-level editing&#x201D; &#x2013; which, turns out, is key for writing a well-structured tech book!</p><p><strong>The project manager</strong> checks in with timelines, organizes reviews&#x2014;like editorial reviews&#x2014;and helps keep you accountable. One of the best things about working with a publisher is that you are on a tight deadline&#x2014;without which it would take you several times longer to publish the book!</p><p><strong>The publisher owns a lot of rights for your book! 
</strong>One thing that I realized only after signing with a publisher is that while publishers help a lot with writing the book &#x2013; and taking a higher cut is sensible because of this &#x2013; they also hold on to a lot of rights that impact your book! These are all things that you give up, compared with self-publishing:</p><ul><li><strong>Global publishing rights</strong>. Although you are the author of the book &#x2013; and usually hold the copyright to it &#x2013; the publisher owns worldwide publishing rights. This means that they are the only ones who can publish the book, or longer excerpts of it. In practice, this means you need to get permission if you&#x2019;d like to publish some parts of your book on e.g. your blog, or social media. <em>They&#x2019;ll usually grant this as it&#x2019;s good marketing &#x2013; but it&#x2019;s still something you need to ask for, as the author.</em></li><li><strong>Foreign rights. </strong>The publisher will own the publishing rights, and will usually be the one that sells foreign rights. In theory, this could sound like you are losing out on things. In practice, publishers are much better positioned to sell and administer these rights. Most publishers offer a 50% cut on these rights &#x2013; it&#x2019;s what my publisher offered. <em>Also, the majority of tech books are not translated into other languages &#x2013; a book that &#x201C;only&#x201D; sells 2,000 copies in English is unlikely to sell a significant number in a non-English market!</em></li><li><strong>The cover. </strong>The publisher decides what cover they will design, though they tend to check with the author for feedback.</li><li><strong>The title.</strong> One of the surprises for me was how the publisher <em>ultimately</em> decides on the title and subtitle.</li></ul><p>In short: this book is owned by the publisher. You are the author, but they are the only ones who can distribute it. 
In practice, many authors would prefer to have it this way &#x2013; because all the work related to distributing the book is taken on by the publisher. However, it&#x2019;s good to know that you need to give up all the above when working with a publisher.</p><h2 id="4-my-book-pitch">4. My book pitch</h2><p>My secret hope, back in 2019, was to get a contract with one of the &#x201C;Big 3&#x201D; tech book publishers: O&#x2019;Reilly, Manning or The Prags. I pitched my book to all three: got a &#x201C;no&#x201D; from two, but a &#x201C;yes&#x201D; (and a contract) from one. Here&#x2019;s how I went about my pitch.</p><h4 id="write-a-%E2%80%9Cone-pager%E2%80%9D-about-your-book">Write a &#x201C;one-pager&#x201D; about your book</h4><p>What will this book be about? Who is it for? What will readers take away when reading it? Answer these in a short pitch, <em>before</em> even seeking out publishers. Here&#x2019;s what I put together as my &#x201C;one-pager:&#x201D;</p><h4 id="do-some-market-research">Do some market research</h4><p>Which similar books on the market would be competing with this book, directly or indirectly? How is this book different from them?</p><p>What is the demographic of people who would be interested in buying this book? Can you estimate how large this crowd is? Realistically, what percentage of this group could be interested in buying the book &#x2013; assuming they know about it? <em>Don&#x2019;t forget that publishers will invest in books that can generate decent sales: it&#x2019;s good to do a little research to help confirm your title could be one of these!</em></p><h4 id="shortlist-publishers-you-would-be-interested-working-with">Shortlist publishers you would be interested in working with</h4><p>While there are quite a few publishers out there: what are your top preferences? 
And which ones are you willing to consider, even if your &#x201C;top&#x201D; choices turn you away?</p><p>Self-publishing is always an option (I&#x2019;ll cover more on how I went about this in later parts). However, going with a good publisher can significantly speed up your book production, while also improving the quality.</p><h4 id="write-a-draft-table-of-contents-and-a-draft-chapter">Write a draft table of contents and a draft chapter</h4><p>Some publishers will want to look at what a draft chapter will look like &#x2013; but not all of them. Still, I found it helpful to do some writing before submitting to a publisher. If for no other reason, this was to confirm that I&#x2019;d enjoy longform writing!</p><p>I spent about a week putting together a table of contents, and around four months writing drafts of chapters. These chapters turned out to be helpful later on.</p><h4 id="submit-a-tailored-pitch-your-the-publishers">Submit a tailored pitch to the publisher(s)</h4><p>Once you&#x2019;ve identified your top publisher choices, submit a pitch. Most book publishers have a pitch document they want you to follow. 
Here are common ones:</p><p><a href="https://www.oreilly.com/work-with-us.html?ref=blog.pragmaticengineer.com"><u>O&#x2019;Reilly&#x2019;s pitch template:</u></a></p><ul><li>Description</li><li>About the topic</li><li>Audience</li><li>Keywords</li><li>Competing titles</li><li>Related O&#x2019;Reilly titles</li><li>Book outline</li><li>Writing schedule</li></ul><p><a href="https://www.manning.com/write-for-us?ref=blog.pragmaticengineer.com"><u>Manning&#x2019;s pitch template:</u></a></p><ul><li>About the author</li><li>About the book topic</li><li>The book plan</li><li>Q&amp;A</li><li>Reader overview</li><li>Book competition</li><li>Book length and illustrations</li><li>Writing schedule</li><li>Table of contents</li></ul><p><a href="https://pragprog.com/publish-with-us/resources/PragProg_Proposal_Template.txt?ref=blog.pragmaticengineer.com"><u>The Pragmatic Bookshelf template:</u></a></p><ul><li>Overview</li><li>Outline</li><li>Bio</li><li>Competing books</li><li>PragProg books</li><li>Market size</li><li>Promotional ideas</li><li>Writing samples</li></ul><p>Most of these templates ask for similar content, so if you completed one pitch: the others are much easier. Here are some tips I&#x2019;d have for building a pitch.</p><p><strong>Put yourself in the shoes of the publisher. </strong>This book is a <em>huge</em> deal to you: but it&#x2019;s just one of the dozens that the publisher will publish <em>just</em> this year. You want to write an <em>amazing</em> book: but the publisher wants to publish one that <em>will sell</em>.</p><p>And these are major differences! The publisher will care very much about competition for the book, and how their existing titles relate to them. 
Like a VC firm, a publisher will not want to fund two investments competing in the exact same market. So if the publisher recently published a deep dive on Go, they will almost certainly pass on the next one, no matter how good your pitch is.</p><p><strong>Pitching to several publishers in parallel is totally fine and you should do it! </strong>This is one thing I wish I&#x2019;d done differently. In my mind, I was 100% certain that my first publisher-of-choice would jump on the opportunity to publish this book. I thus felt that it would be &#x201C;unfair&#x201D; if I pitched to other publishers before hearing back.</p><p>In hindsight, as a first-time author, this strategy was a waste of time on my end. Most publishers are unlikely to take a risk on a first-time author with no books published in the past &#x2013; like I was in 2019. And so the likely outcome is rejection in most cases.</p><p>In my case, I spent about two and a half months waiting on the response from this first publisher. My acquisitions editor was championing the book &#x2013; making the case for the publisher to offer a contract &#x2013; but in the end, the publisher chose another book with a similar topic that was in their pipeline. This made perfect business sense for them &#x2013; but it meant I spent months waiting, instead of pitching the book to other publishers!</p><p><strong>My book pitch ended up being a helpful resource on my self-publishing journey. </strong>Even though I did not release with a publisher, pitching to publishers helped the book become an eventual success. It was for these reasons:</p><ul><li><strong>Defining the structure.</strong> I had my table of contents well thought-out by the time I submitted the pitch. 
This structure changed later, but it was a solid start.</li><li><strong>Positioning the book.</strong> I had a good idea of the &#x201C;competitive&#x201D; landscape, and what books my title would &#x201C;go up against.&#x201D; It also helped me focus on how my book is different to what is already out there.</li><li><strong>Forcing me to think about marketing. </strong>The Pragmatic Bookshelf asked for a section on promotional ideas. This forced me to think about where (and how) I would promote the book &#x2013; even before getting into the thick of writing. When going with a publisher, it&#x2019;s safe to assume that the publisher&#x2019;s brand will do some marketing. However, authors will still do the lion&#x2019;s share of marketing &#x2013; and it&#x2019;s good to think about this ahead of time.</li></ul><h2 id="5-working-with-a-publisher">5. Working with a publisher</h2><p>I got lucky with one of the three publishers, in the end. This publisher was looking for a book just like mine, right at that time! What happened was one of their best sellers had to be pulled from publication, for reasons outside the control of the publisher. Apparently, when my pitch arrived, they had just started a search for a book that could plug the hole &#x2013; and they saw my book as a perfect fit for a &#x201C;software career advice&#x201D; book.</p><p>At the time, this felt like great luck. In hindsight, my relationship with the publisher might have soured exactly because they were looking for me to write <em>a specific kind of book</em> that would be similar enough to this old book &#x2013; but I had no intention of doing so. <em>More on how things went sour in the section after this one.</em></p><p>After signing the contract, I worked with the publisher for about a month &#x2013; so I&#x2019;m not exactly the most experienced on this front. 
However, a couple of things stood out as strong positives &#x2013; and things that I &#x201C;lost&#x201D; when deciding to self publish, in the end.</p><p><strong>Strong pressure to write &#x2013; thanks to the contract. </strong>My contract had pretty strict deadlines included. We signed it on 11 January 2020, and these deadlines were part of the contract:</p><p>&#x201C;The Author shall prepare and deliver to the Publisher a machine-readable electronic copy of the manuscript for the Work, including all its illustrations, code listings, and exercises, as mutually agreed upon by the Publisher and the Author as follows:</p><p>- Not later than March 15, 2020, a partial manuscript for the Work totaling not less than one third of the planned finished Work.</p><p>- Not later than June 1, 2020, a partial manuscript for the Work totaling not less than two thirds of the planned finished Work.</p><p>- Not later than August 15, 2020, a draft of the complete manuscript for the Work suitable for review.</p><p>- Not later than September 1, 2020, the final, revised and complete manuscript for the Work acceptable to the Publisher for publication.&#x201D;</p><p>Talk about pressure! Also, my first payout was tied to reaching the first milestone &#x2013; which was delivering at least a third of the finished work. My publisher also set up regular check-ins to help me stay accountable. And this kind of pressure was good &#x2013; because without it, I would have pushed back writing, or got stuck on relatively trivial parts!&#xA0;</p><h2 id="6-breaking-up-with-the-publisher">6. Breaking up with the publisher</h2><p>While I greatly appreciated that a publisher took a chance on me, lots of things felt wrong from the start. 
A month into working together, I felt that things were getting worse, and not better.</p><p>The small things that I dismissed, in the beginning:</p><ul><li><strong>A (very) opinionated structure.</strong> This publisher had strongly opinionated templates I was told to use for all chapters. These required each chapter to start by stating what the reader will learn, and then to summarize this at the end of the chapter. It wasn&#x2019;t how I imagined my book to be &#x2013; but it didn&#x2019;t seem I had a choice. I figured, I&#x2019;ll give it a go. The publisher knows better after all, as they&#x2019;ve done this hundreds of times. <em>Right</em>?</li><li><strong>Needing to ask for permission to share drafts on social media.</strong> I originally planned to share screenshots of some of the parts I was writing, to get feedback as I went &#x2013; and to also increase visibility of the book. I thought that this was a no-brainer. Not only does this kind of &#x201C;early sharing&#x201D; make the book better, it also makes more people excited about the book, leading to more eventual customers. To my surprise, my contact at the publisher said I would need to ask for permission to do this. Permission? For something that will market the book? Yes: because the publisher owns all publishing rights, including for the draft!</li><li><strong>I won&#x2019;t decide on what the title will be.</strong> I had strong opinions about what I&#x2019;d like the book&#x2019;s title to be. My publishing contact also had ideas on what they thought would be good to add to it &#x2013; like introducing the &#x201C;mentoring&#x201D; term either to the title or the subtitle, which was an idea I disliked. As I talked with them, it became clear that the publisher would set the final title: not me. Hmm &#x2013; odd, no? 
It&#x2019;s another reminder that, although it&#x2019;s my book: it&#x2019;s <em>really</em> the publisher&#x2019;s book, and they have the final say on all important decisions.</li><li><strong>Nudges to &#x201C;dumb down&#x201D; the book. </strong>My editor was giving more suggestions on how to edit the content to make it more &#x201C;beginner-friendly&#x201D; and suggested I introduce e.g. &#x201C;Alice and Bob&#x201D; examples to make it easier to digest the contents. <em>One of the recently best-selling books of the publisher heavily used Alice and Bob, and it seems the publisher thought it helped their sales.</em></li></ul><p><strong>The first major editorial review was where I decided we should part ways with the publisher. </strong>About a month-and-a-half in, the publisher pulled together several experienced editors, and offered suggestions on how I could improve the book. The suggestions were these:</p><ul><li><strong>Focus on reader engagement. </strong>Tell stories and develop them with emotion, mystery, aha moments, and unexpected conclusions. Tell the stories from the &quot;we&quot; or &quot;they&quot; perspective -- make stories team-oriented.</li><li><strong>Exercises.</strong> Develop exercises for use within the chapters (not just end) or a story about what happened when one person did the exercise.</li><li><strong>Mini-projects.</strong> Guide readers to discover and come to conclusions on their own (see Donald Saari story in What the Best College Teachers Do). Mini project topics: testing, architectures.</li><li><strong>Word of the day feature.</strong> Example: Dependency injection (what is it)? Scatter these across the book.</li><li><strong>Quotes.</strong> Include quotes from luminaries such as [Well-known-person 1] and [Well-known-person 2] that relate to advice given. Ask other [Publisher] authors to relate experience about how they followed similar advice and were successful.</li><li><strong>Tech map. 
</strong>Create a diagram of the current technology landscape. Example big-picture topics: architecture demystified, distributed systems demystified.</li></ul><p>While I appreciated the suggestions: I <em>hated</em> all of them. I saw what implementing them would do: they would turn this book &#x2013; about which I already had reservations, because of the style &#x201C;forced&#x201D; on me &#x2013; into something I would <em>not</em> want to read. Much less write!</p><p>I envisioned writing a more matter-of-fact book that doesn&#x2019;t have exercises, &#x201C;mini projects&#x201D; or &#x201C;word of the day&#x201D; gimmicks.</p><p><strong>I sat down to reflect on why I chose to work with a publisher, to start with.</strong> As an author, I&#x2019;m giving up a lot of things: editorial control, the bulk of revenue, all publishing rights&#x2026; and for what? For the publisher to make the process easier, and for the end result to be better than if I was working alone.</p><p>But I felt that this book would be far <em>worse</em> if I continued with my publisher: and the only way to get it back to what I envisioned was if I spent a lot of time and energy pushing back on them.</p><p>It would cost me less energy to self-publish. So I decided to terminate my agreement because it didn&#x2019;t feel like my publisher was helping me write the book that I wanted to write.</p><p><strong>My publisher was understanding and professional in terminating the contract.</strong> I explained to them that all the feedback suggested they wanted to see a very different book to what I wanted to write. And that, frankly, I am not the author to write <em>that</em> kind of book.</p><p>Truth be told, I was embarrassed that I had wasted their resources &#x2013; working with their development editor and the editing team &#x2013; for these two months. At the same time, I had been vocal in telling my editor that I was hesitant about this mandated style. 
I also decided, at the first <em>formal</em> feedback session, that there was no point in continuing. I&#x2019;m not sure I could have come to this conclusion any earlier, as I was still learning how this book publisher worked, up until that point.</p><p>To show how professional this team was, this is the termination letter they sent as a signed PDF:</p><p>&#x201C;This letter is in reference to our Publishing Agreement with you for [what would become The Software Engineer&#x2019;s Guidebook] dated January 11, 2020. By mutual agreement, we are terminating the publishing contract.</p><p>Since no advance was paid to you under the terms of this contract, all rights in the content you originally submitted will hereby return to you and we will consider this matter concluded.</p><p>The decision to cancel a project is never an easy one to make. We thank you for all the efforts on this project that you made and wish you the best in your future endeavors.&#x201D;</p><p><strong>At this point, I had learned enough about publishers and myself to decide: I&#x2019;m doing it by myself. </strong>Having my book accepted by a major publisher gave external validation that there&#x2019;s a strong business case for The Software Engineer&#x2019;s Guidebook. And working with an opinionated publisher &#x2013; and continuously pushing back on styling suggestions &#x2013; made me realize that I already have my own opinionated style that I <em>like</em> using.</p><p>I did lose a very important thing by deciding to self-publish: the accountability of meeting a publishing deadline. Working with the publisher, this book would have been out in fall 2020 or spring 2021. Self-publishing, I launched it in November 2023.</p><p>One of the reasons for publishing my book two years later than it would have taken with a publisher was that I now <em>knew</em> I could no longer rely on a well-known publisher to lend my book their brand. 
For my book to have an even slim chance of being successful: I would have to compensate for the lack of being associated with a publisher, and fill the gap in marketing and awareness, leading up to the book launch.</p><p>Not having a publisher was a reason I started writing <a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><u>The Pragmatic Engineer Newsletter</u></a> in August 2021 (a year-and-a-half after breaking up with this publisher) &#x2013; and the sudden success of this newsletter gave me less time to wrap up the book. At the same time, by the time the book was ready, there were plenty of people who looked forward to reading it: and many of them were already readers of the newsletter!</p><p>I&#x2019;ll cover more about how I went about the actual self-publishing process in a follow-up article, how the book ended up selling, and other learnings. Subscribe <a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><u>to The Pragmatic Engineer</u></a> to get notified when it is out.</p><p></p>]]></content:encoded></item><item><title><![CDATA[The Pulse: Amazon layoffs – AI or economy to blame?]]></title><description><![CDATA[Amazon is doing more mass layoffs, claiming it wants to be more nimble. But are job losses really about US economic fears, and how Amazon’s retail business will be affected?]]></description><link>https://blog.pragmaticengineer.com/the-pulse-amazon-layoffs/</link><guid isPermaLink="false">690ccf1cece43400015a8f22</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 06 Nov 2025 16:40:34 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. 
Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-151?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>last week&#x2019;s The Pulse</em></a><em> issue. Full subscribers received the below article seven days ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><p><em>Many subscribers expense this newsletter to their learning and development budget. If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/" rel="noopener noreferrer nofollow"><em> an email you could send to your manager</em></a><em>.</em></p><hr><p>Online retail giant Amazon unexpectedly announced 14,000 job cuts earlier last week. The massive round of layoffs at the company follows other mass redundancies in recent years:</p><ul><li><strong>January 2023</strong>: 18,000 people <a href="https://newsletter.pragmaticengineer.com/i/70995338/amazon-to-lay-off-more-people-and-rescind-more-offers?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">laid off</a>.</li><li><strong>March 2023</strong>: another 9,000 people <a href="https://www.businessinsider.com/amazon-layoffs-second-round-9000-job-jobs-2023-3?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">let go</a></li><li><strong>November 2023</strong>: hundreds of people <a href="https://www.businessinsider.com/amazon-layoffs-memo-hundreds-job-cuts-alexa-agi-team-2023-11?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">let go</a> inside the Alexa team, as Amazon was looking to shift Alexa more toward GenAI</li><li><strong>April 2024</strong>: hundreds of people <a 
href="https://www.businessinsider.com/amazon-job-cuts-aws-roles-cloud-computing-division-aws-2024-4?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">let go</a> from AWS</li></ul><p>Software engineers, unfortunately, seem hit hard by the latest layoffs: of 2,300 employees laid off in Washington State, 25% are software engineers, GeekWire <a href="https://www.geekwire.com/2025/amazon-layoffs-hit-software-engineers-hardest-in-washington/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">reports</a>. <em>We can only speculate about the ratio across the rest of the company, but if cuts at HQ are heavy on engineering, then things don&#x2019;t look promising for other locations, sadly.</em></p><p><a href="https://www.aboutamazon.com/news/company-news/amazon-workforce-reduction?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">The memo</a> from Beth Galetti, Senior Vice President of People Experience and Technology, to workers didn&#x2019;t explain much:</p><blockquote>&#x201C;Some may ask why we&#x2019;re reducing roles when the company is performing well. Across our businesses, we&#x2019;re delivering great customer experiences every day, innovating at a rapid rate, and producing strong business results. What we need to remember is that the world is changing quickly. This generation of AI is the most transformative technology we&#x2019;ve seen since the Internet, and it&#x2019;s enabling companies to innovate much faster than ever before (in existing market segments and altogether new ones). We&#x2019;re convinced that we need to be organized more leanly, with fewer layers and more ownership, to move as quickly as possible for our customers and business.&#x201D;</blockquote><p>The statement is utterly confusing, as encapsulated by its message that &#x201C;business is great, but we need to do layoffs&#x201D;. Job cuts usually mean a business is in trouble, which obviously isn&#x2019;t the case for Amazon. 
So, why are these layoffs <em>really</em> happening?</p><h3 id="layoffs-to-boost-efficiency">Layoffs to boost efficiency?</h3><p>The company&#x2019;s memo states:</p><blockquote>&#x201C;We&#x2019;re convinced that we need to be organized more leanly, with fewer layers and more ownership, to move as quickly as possible for our customers and business.&#x201D;</blockquote><p>If this line of reasoning sounds familiar, it&#x2019;s because most of the layoffs in 2023 were justified the same way. The tech industry overhired during the pandemic in 2020-2021, making orgs more bloated and decision-making slower. In February 2023, I reported on <a href="https://newsletter.pragmaticengineer.com/p/the-scoop-38?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">the trend of fewer middle managers</a>, with Meta the first Big Tech giant to reduce its management layers. In 2023, most of Big Tech followed this approach with layoffs or reorgs. Managers acquired more reports, and tech companies cut down the number of layers between the CEO and individual contributors.</p><p>Given Amazon did other massive layoffs in 2023, it&#x2019;s unlikely they missed the industrywide trend for fewer managers. While the current layoffs seem to be targeting managers quite a bit &#x2013; in the Washington State layoffs, 20% of those let go <a href="https://www.geekwire.com/2025/amazon-layoffs-hit-software-engineers-hardest-in-washington/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">are managers</a> &#x2013; there are still more ICs laid off than managers, overall. So, this official explanation doesn&#x2019;t pass my personal &#x201C;smell test&#x201D;.</p><h3 id="layoffs-to-buy-more-gpus">Layoffs to buy more GPUs?</h3><p>The day after its jobs announcement, Amazon had more big news, this time about AI: it unveiled Project Rainier, the largest AI computing platform AWS has ever built. 
It already has 500,000 Trainium2 chips (built by Amazon). This capacity is already 70% larger than any AI computing platform in AWS&#x2019;s history, and Anthropic is using all of it (!!) to train its next models. Below is an image of one of the several Project Rainier data centers packed with Amazon GPUs:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/11/8e4b376d-4eb8-4c52-bcb4-aa935456e229_1082x1072.webp" class="kg-image" alt loading="lazy" width="1082" height="1072" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/11/8e4b376d-4eb8-4c52-bcb4-aa935456e229_1082x1072.webp 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/11/8e4b376d-4eb8-4c52-bcb4-aa935456e229_1082x1072.webp 1000w, https://blog.pragmaticengineer.com/content/images/2025/11/8e4b376d-4eb8-4c52-bcb4-aa935456e229_1082x1072.webp 1082w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">The next generation of Claude models is trained in these data centers. Source: </em></i><a href="https://x.com/ajassy/status/1983616724642730217?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Amazon</em></i></a></figcaption></figure><p>Building data centers is incredibly capital-intensive: Amazon has <a href="https://www.cnbc.com/2025/10/29/amazon-opens-11-billion-ai-data-center-project-rainier-in-indiana.html?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">spent</a> $11B on Project Rainier alone. Even though it is very profitable, Amazon might want to invest <em>more</em> cash than it currently has into building data centers. 
So, one reason for job cuts could be to reallocate financial resources from paying salaries and compensation towards building more data centers.</p><p>Before doing the math, a couple of concepts are important to understand:</p><ul><li><strong>Free cash flow (FCF)</strong>: profit after payment for things like capital expenditure (CapEx), such as financing data centers. If a company wants to operate with as little debt as possible, FCF is usually very important. If Amazon wants to avoid loans and not touch its reserves, then data center investment would come from FCF, reducing FCF further.</li><li><strong>Cash reserves: </strong>a company&#x2019;s <em>liquid</em> reserve investments, usually an accumulation of investments in financial instruments like bonds and securities, or cash deposits.</li></ul><p>Let&#x2019;s run Amazon&#x2019;s numbers:</p><ul><li><strong>Cash reserves: $93B. </strong>This is how much Amazon has in reserve.</li><li><strong>FCF: $32B. </strong>This is the rough free cash flow Amazon has currently, as per its latest quarterly report. This is after deducting <em>current</em> data center investments.</li><li><strong>Savings from layoffs: $2-4B. </strong>This is my estimate of the rough total compensation of 14,000 employees.</li></ul><p>So, the savings from these layoffs wouldn&#x2019;t even pay for half of Project Rainier ($11B in total), while $32B of free cash flow means Amazon could easily build nearly three Project Rainiers in the next year, without needing to dip into its savings! Of course, Amazon has its famous frugality principle, but this massive layoff of 14,000 people won&#x2019;t make a big difference to how much it can invest in data centers; it can already spend much more, if it wants!</p><h3 id="leanness-and-ai-fail-job-cuts-%E2%80%9Csmell-test%E2%80%9D">Leanness and AI fail job cuts &#x201C;smell test&#x201D;</h3><p>It&#x2019;s not only me who doesn&#x2019;t buy the explanation that these layoffs are to streamline the company, or to redirect resources to AI. 
<a href="https://www.linkedin.com/in/arneknudson/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Arne Knudson</a> worked at Amazon for nearly two decades, most recently as a software development manager (SDM), before leaving the company earlier this year. He <a href="https://www.linkedin.com/posts/arneknudson_in-my-18-years-at-amazon-i-went-through-activity-7388737736590909440-wJ1M?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAAAIk0KwBsmE3oBadWSg2ettxmEyKbqZKG34" rel="noopener noreferrer nofollow">shared</a> his analysis, with some insider detail:</p><blockquote>&#x201C;In my 18 years at Amazon, I went through a few layoffs and hiring freezes.<br><br>This is the first time I&#x2019;ve seen multiple years of significant layoffs essentially back-to-back. Even in the depths after the .com bubble, it wasn&#x2019;t this bad. They&#x2019;ve been laying people off now for almost 3 straight years.<br><br><strong>The explanation that this is downsizing after hiring too many at the height of the pandemic doesn&#x2019;t pass the smell test, at least to me. </strong>That was 3 years ago; they&#x2019;re not that dumb to keep those people around for 3 extra years. Those folks were laid off back in &#x2018;22.<br><br><strong>I&#x2019;m also not convinced that this is optimization due to AI.</strong> My degree&#x2019;s in AI, and I worked on AI stuff at Amazon; I don&#x2019;t think there&#x2019;s enough automation yet, and it&#x2019;s not accurate enough yet, to replace 30,000 people. The cost of inaccuracies seems too high. But I could be wrong; maybe they&#x2019;ve gotten their false negative &amp; false positive rates low enough to avoid too many region-wide AWS outages. (Or not.)<br><br>One of the articles I read said this was going to be in HR, and I can tell you as a former manager, my experience working with HR had been steadily worsening over the past 5-7 years. 
They outsourced so much of the work, overworked the people they had, and had such high turnover that I never knew who I was supposed to work with. When I needed to put someone on a performance plan or help a new hire receive some kind of accommodation, it seemed like it was a different person each time. If they really are laying off tens of thousands more HR folks, this is only going to get worse.<br><br><strong>And, I suspect, it means they don&#x2019;t plan on hiring MORE people in any of the business units for a year or more. </strong>So, by the smell-o-meter, this seems more significant than streamlining the workforce, improved AI, and &#x201C;nah, we don&#x2019;t need as many HR folks.&#x201D;</blockquote><h3 id="us-economy-to-blame-for-amazon-layoffs">US economy to blame for Amazon layoffs?</h3><p>It&#x2019;s safe to assume AWS as a business unit is doing just fine, as suggested by Project Rainier&#x2019;s existence and the agenda for building data centers. But how is the e-commerce side of the business performing, and what&#x2019;s its outlook?</p><p>If one business should have its finger on the pulse of the US economy, it&#x2019;s Amazon with its size and self-professed, relentless customer focus, providing a window into people&#x2019;s spending habits across the country. Flashing lights on the dashboard of the national economy may signal tough times ahead in e-commerce, which could be a reason to start cutting costs early.</p><p><strong>There are concerning signs from other sectors about the US economy. </strong>Below is what the CEO of the restaurant chain Chipotle said:</p><blockquote>&#x201C;Earlier this year, as consumer sentiment declined sharply, we saw a broad-based pullback in frequency across all income cohorts.<br><br>Since then, the gap has widened, with low to middle-income guests further reducing frequency. We believe that this guest with household income below $100,000, represents about 40% of our total sales. 
And, based on our data, they are <strong>dining out less often due to concerns about the economy, and inflation.</strong><br><br>A particularly challenged cohort is the 25- to 35-year-old age group. We believe that this trend is not unique to Chipotle and is occurring across all restaurants as well as many discretionary categories. This group is facing several headwinds, including unemployment, increased student loan repayment and slower real wage growth. We tend to skew younger and slightly over-indexed to this group relative to the broader restaurant industry&#x201D;.</blockquote><p>Chipotle is saying that everyone is eating out less, particularly 25-35 year olds, because of inflation. If people spend less on Chipotle because of rising prices, then they may also spend less in other areas of their lives for the same reason, including on Amazon.</p><p>In the e-commerce supply chain, there&#x2019;s evidence of this trend, which would mean delivery services like UPS have fewer parcels to deliver. 
Speaking of UPS, two days ago, it <a href="https://nypost.com/2025/10/28/business/ups-axes-48000-workers-in-sweeping-cost-cut-push-sparking-stock-surge/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">announced</a> massive layoffs:</p><blockquote>&#x201C;United Parcel Service (UPS) has slashed 48,000 jobs this year &#x2014; one of the largest single-year reductions by a US company since the pandemic &#x2014; as the package giant scrambled to contain costs and revive its lagging stock price.<br><br>The Atlanta-based delivery behemoth disclosed the reductions Tuesday while reporting third-quarter earnings that beat Wall Street expectations.<br><br>UPS said 34,000 of the cuts hit drivers and warehouse operations, while 14,000 targeted management (...)<br><br>UPS shares jumped nearly 9% in Tuesday afternoon trading, even as the company reported weaker revenue and profits&#x201D;.</blockquote><p>UPS&#x2019;s revenue is down on last year, which suggests that there are, indeed, fewer deliveries (or lower value ones). As with the latest job cuts at Amazon, these drastic layoffs could be explained by a lot of things, most easily by UPS expecting reduced trade in the future.</p><p><strong>If US consumer spending is trending down, then the e-commerce sector will be among the first to feel this. </strong>It could explain why Amazon is making these layoffs now. It can also explain why Google, Meta, and Microsoft might not be seeing their businesses impacted: they&#x2019;re not involved in retail like Amazon is, and the AI sector <em>is</em> very much booming.</p><p>Among all of Big Tech, Amazon is best positioned to detect changes in US consumer spending. Google&#x2019;s and Meta&#x2019;s revenue is more dependent on advertising, and Microsoft&#x2019;s more on enterprise spend. 
Like Amazon, Apple is well placed to feel market changes with its range of smartphones and watches, and other consumer tech.</p><p>I believe Amazon is highly commercially rational, so it&#x2019;s worth understanding the <em>actual</em> reason for its second major mass layoffs in just two years, following deep cuts in 2023. I&#x2019;d put my money on this reason being the economy, and how Amazon probably expects customers to cut back their spending everywhere, including on Amazon.</p><hr><p>This was one out of the four topics covered in last week&#x2019;s The Pulse. <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-151?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Read the full article.</a></p><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-150-cursor-and-github-double?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>This week&#x2019;s The Pulse</strong></a> additionally covers:</p><ol><li><strong>Cursor and GitHub double down on agents. </strong>Each company is focusing on agents: Cursor with its multi-agent mode, and GitHub with its &#x201C;Agent HQ.&#x201D; Cursor is increasingly a direct rival to GitHub.</li><li><strong>Industry pulse. </strong>Meta rolls out AI-assisted interviews, Cursor and Windsurf believed to be using Chinese open source AI models, South Korean government pays price of ignoring backup &#x201C;101,&#x201D; startups growing much faster in the US than in Europe, companies using AI tools buy more JIRA seats, a neat uptime service called Updog, and more.</li><li><strong>OpenAI inflating the bubble? </strong>OpenAI signs another massive deal with AWS based on predicted growth, and seeks taxpayer protection to borrow more.</li><li><strong>Large tech companies struggle to build their AI integrations</strong>. Apple admits failure to modernize Siri by paying Google $1B per year for its LLM. 
Perplexity to pay Snap $400M to be its AI search interface.</li><li><strong>How much do Directors of Engineering earn at startups?</strong> Data from Carta says it&#x2019;s more than any other Director: $215-230K at companies valued at $25-250M.</li></ol><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-150-cursor-and-github-double?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Read the full issue here</a></p>]]></content:encoded></item><item><title><![CDATA[Comparing interviews at 8 large tech companies]]></title><description><![CDATA[Puneet Patwari applied to 8 major tech companies, and received 6 offers. He compares his interview experiences at Meta, Amazon, Uber, and 5 other workplaces]]></description><link>https://blog.pragmaticengineer.com/comparing-interviews-at-8-large-tech-companies/</link><guid isPermaLink="false">6903a79017b0a200016fa3a2</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 30 Oct 2025 18:00:42 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one topic from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>The Pulse #149</em></a><em>. Full subscribers received the below article two weeks ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><hr><p><a href="https://www.linkedin.com/in/puneet-patwari/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Puneet Patwari</a> recently accepted an offer to join Atlassian as a Principal Software Engineer. 
In three months, he did more than 60 interviews at 11 companies, he told me &#x2013; while dropping out of 3 more interview processes after accepting the Atlassian offer, including that of Meta. Following that endeavour, he has <a href="https://www.linkedin.com/posts/puneet-patwari_i-interviewed-at-google-uber-walmart-amazon-activity-7379375609354883072-Rkq1/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">compared</a> the interview processes of the largest companies:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://substackcdn.com/image/fetch/$s_!fd6d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F411215af-a63b-411f-9192-d6a7ef71481e_1390x1236.png" class="kg-image" alt loading="lazy" width="1390" height="1236"><figcaption><i><em class="italic" style="white-space: pre-wrap;">What each interview process was like. Source: </em></i><a href="https://www.linkedin.com/posts/puneet-patwari_i-interviewed-at-google-uber-walmart-amazon-activity-7379375609354883072-Rkq1/?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">Puneet Patwari</em></i></a></figcaption></figure><p>A few more observations that Puneet shared with me:</p><blockquote><strong>Amazon</strong>: the Amazon Hiring Manager round was one of the most unique I ever experienced. We got so engrossed in the discussion that it took 160 minutes instead of the scheduled 60 minutes! We had to take a break in between the interview process.<br><br><strong>Atlassian</strong>: The leadership craft (LC) &amp; values were two interview rounds which were very crucial in determining that I&#x2019;ll be levelled at the Principal level. Of course, the Systems Design interview was also key here. 
Atlassian puts a lot of emphasis on LC for Principal engineers.<br><br><strong>Salesforce</strong>: the system design round was based on the <em>actual</em> job requirement. It was a migration problem where the interviewer wanted to check if I can own a project end-to-end with customers at the centre of it.<br><br><strong>Confluent</strong>: when I say it was the most mentally demanding interview, what I mean is how every skill was tested with two interviews! So 2x data structures and algorithms (DSA), 2x System Design 2x behavioural interview rounds.<br><br><strong>I cannot stress enough how important behavioural interviews are at the Staff+ levels. </strong>Doing well on these interviews were decisive in getting Staff and Principal-level offers. Of course, you needed to do well on coding and systems design: but my sense was that the behavioural parts were make or break for levelling and getting an offer.</blockquote><p>A few things stand out to me from Puneet&#x2019;s account of his interviews at leading tech companies:</p><ul><li><strong>Algorithmic coding interviews are everywhere! </strong>For senior+ positions, you need to get really good at these, including challenging topics like dynamic programming. We cover how to perform well in these in the article, <a href="https://newsletter.pragmaticengineer.com/p/how-to-get-unstuck-during-coding-interviews?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">How experienced engineers get unstuck in coding interviews</a>.</li><li><strong>Interviews are tough, and time consuming. </strong>Even after Puneet had offers, no company shortened their process. Puneet had to decline 3 more interviews &#x2013; including one at Meta &#x2013; because by the time the interviews would have come around, he already had an offer he had accepted at Atlassian.</li><li><strong>In a tough job market, &#x201C;top&#x201D; candidates are still in demand. 
</strong>We&#x2019;ve covered <a href="https://newsletter.pragmaticengineer.com/p/state-of-the-tech-market-in-2025-hiring-managers?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">how challenging the current tech labor market is for jobseekers</a>, but Puneet interviewed at 11 companies and got 6 offers. His applications had to have a lot going for them in order to pass the resume screenings: 10+ years of experience, and working as a Senior Software Engineer at Microsoft. He also showed up <em>really</em> well prepared.</li><li><strong>Bad luck can strike at any time</strong>. Puneet&#x2019;s interview experience at Uber seems to have been a bit unlucky: the interviewer presented as rigid and not open to dialogue. Perhaps they were having a tough day, or wanted to get the interview over with. Or it could be what Steve Yegge describes as the <a href="https://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">interviewer anti-loop</a></li></ul><p>Congrats to Puneet for accepting the Atlassian position, and thanks for sharing all these learnings!</p><hr><p>This was one out of five topics covered in <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">The Pulse #149</a>. The full edition additionally covers:</p><ul><li><strong>New trend: programming by kicking off parallel AI agents. 
</strong>More devs are experimenting with kicking off coding agents in parallel</li><li><strong>ACP protocol.</strong> A new protocol built by the Zed team, which tries to make it easier to build AI tooling for IDEs than the MCP protocol allows</li><li><strong>AI security tooling works surprisingly well?</strong> AI-powered security tools seem good at identifying security flaws in mature open source projects</li><li><strong>Is AI the only engine of US economic growth?</strong> Forty percent of US GDP growth this year is based on AI-related spend, while 60% of venture capital goes into AI. Hopefully, it won&#x2019;t end up as a bubble which bursts like in 2001</li></ul><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Read the full issue here</a>, and check out <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-151?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">today&#x2019;s The Pulse here</a>.</p>]]></content:encoded></item><item><title><![CDATA[New trend: programming by kicking off parallel AI agents]]></title><description><![CDATA[More devs are experimenting with kicking off coding agents in parallel]]></description><link>https://blog.pragmaticengineer.com/new-trend-programming-by-kicking-off-parallel-ai-agents/</link><guid isPermaLink="false">6903a70817b0a200016fa357</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 30 Oct 2025 17:58:38 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one topic from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>The Pulse #149</em></a><em>. 
Full subscribers received the below article two weeks ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><hr><p>With agentic command line interfaces like Claude Code, OpenAI Codex, Cursor, and many others going mainstream, I&#x2019;m seeing a trend of more software engineers experimenting with kicking off work with several agents simultaneously on separate tasks.</p><p>I talked with Anthropic engineer Sid Bidasaria about <a href="https://newsletter.pragmaticengineer.com/p/how-claude-code-is-built?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">how Claude Code is built</a>, and at the end of our conversation, he mentioned that he&#x2019;d had a few agents running throughout and that it made him more productive with work. Similarly, software engineer Simon Willison, whom I consider an AI engineering expert, has posted about &#x201C;<a href="https://simonwillison.net/2025/Oct/5/parallel-coding-agents/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">embracing the parallel coding agent lifestyle</a>.&#x201D; He writes:</p><blockquote>&#x201C;For a while now, I&#x2019;ve been hearing from engineers who run multiple coding agents at once&#x2014;firing up several Claude Code or OpenAI Codex instances at the same time, sometimes in the same repo, sometimes against multiple checkouts or git worktrees.<br><br>I was pretty skeptical about this at first. AI-generated code needs to be reviewed, which means the natural bottleneck on all of this is how fast I can review the results. 
It&#x2019;s tough keeping up with just a single LLM given how fast they can churn things out, where&#x2019;s the benefit from running more than one at a time if it just leaves me further behind?<br><br>Despite my misgivings, over the past few weeks I&#x2019;ve noticed myself quietly starting to embrace the parallel coding agent lifestyle.<br><br>I can only focus on reviewing and landing one significant change at a time, but I&#x2019;m finding an increasing number of tasks that can still be fired off in parallel without adding too much cognitive overhead to my primary work.&#x201D;</blockquote><p>Simon <a href="https://simonwillison.net/2025/Oct/5/parallel-coding-agents/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">shares advice about what works for him</a>, with research, maintenance tasks, and directed work all mentioned as use cases.</p><p><strong>It&#x2019;s interesting to consider whether parallel work with agents has the potential to overturn decades of software engineering practices. </strong>Let&#x2019;s assume software engineers who kick off multiple agents at once do become more productive than &#x201C;single-threaded&#x201D; peers who work on one problem at a time. If so, then this practice has a chance to spread, should enough software engineers seek to be more productive &#x2013; or want to avoid being left behind by some colleagues doing more than before.</p><p>But engineering in the pre-AI era was all about being in the flow for many productive engineers. 
A flow state goes something like this:</p><ul><li>Understand the moving parts</li><li>Build a solution, validate it, iterate on it</li><li>When satisfied with how it works, submit a pull request for code review &#x2014; or, if no review is needed, just merge and ship it</li></ul><p>Interrupting this process disrupts the flow state, and it takes time to get back into it: it&#x2019;s why software engineers tend to prioritize focus time, to make progress with coding work.</p><p>Of course, this isn&#x2019;t universal among all highly productive engineers; when I was an engineering manager, the most productive engineers on my team did a lot of context switching and were adept at juggling several things at once. Here&#x2019;s an average-looking day for a senior engineer acting as a tech lead:</p><ul><li><strong>Code reviews. </strong>Arrive at office, go through open code reviews from the previous night</li><li><strong>Coding.</strong> Get some of their own coding work done</li><li><strong>Standup.</strong> The usual</li><li><strong>More coding.</strong> Get the work done. <em>At least, that&#x2019;s the idea. In reality:</em></li><li><strong>Interruptions: </strong>code reviews, requests for help, taps on shoulder. The most productive engineer on a team regularly gets messages requesting code reviews to unblock teammates, or to help someone else who&#x2019;s stuck, or the manager (me &#x2013; sorry!) tapping them on the shoulder for help with something.</li></ul><p><strong>I wonder if senior+ engineers will be &#x201C;naturals&#x201D; at working with parallel AI agents, </strong>based on their existing habits and what they do currently:</p><ul><li>Keep parallel workflows in their heads; e.g., what team members are doing at any one time.</li><li>Code reviews across several workstreams: they&#x2019;re the <em>go-to</em> code reviewer, and usually review all code changes across 2-5 workstreams. 
They may not do the work, but know when it&#x2019;s correct.</li><li>Can handle interruptions: they&#x2019;ve learned how to make progress when their focus is continually being broken.</li><li>Good at directing colleagues: because they&#x2019;re regularly interrupted, they&#x2019;ve also learned how to delegate and explain urgent work to team members.</li><li>Writing skill: these engineers write a lot of code reviews, draw up documents like RFCs that outline work, create tickets to break down projects, and critique colleagues&#x2019; efforts; all this involves communicating effectively in writing.</li></ul><p>With AI agents, the qualities that make a good tech lead are within reach for engineers who want to be more productive. So far, the only people I&#x2019;ve heard are using parallel agents successfully are senior+ engineers.</p><p>Then again, this workflow hasn&#x2019;t stuck with everyone: I asked Flask creator <a href="https://newsletter.pragmaticengineer.com/p/python-go-rust-typescript-and-ai?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Armin Ronacher</a> about his experience with parallel agents. He told me:</p><blockquote>&#x201C;I sometimes kick off parallel agents, but not as much as I used to do.<br><br>The thing is: it&#x2019;s only so much my mind can review!&#x201D;</blockquote><p><strong>But we&#x2019;re in new territory now that any dev can kick off parallel coding with coding agents. </strong>Will it make engineers more productive, or will it just make people <em>feel</em> like they&#x2019;re more productive? Perhaps engineers who do one thing at a time and keep focus will be shown to produce more reliable software, over time. Or maybe it&#x2019;ll turn out that working with parallel agents leads to more issues slipping through and more iterations, which destroys any gains.</p><p>We will find out. 
Personally, I can only see more devs experimenting with parallel agents.</p><p><strong>My sense is that software engineering basics matter more when working with AI agents. </strong>I&#x2019;ve started to use AI agents for my own side projects, with success so far. I do a few things:</p><ul><li>Testing: all side projects have unit tests because I learned to not trust my own work without validation</li><li>Small, descriptive tasks: I give tasks small enough in scope, which I explain, and give examples of</li><li>Refactoring: every third or fourth task is for the agent to refactor some code they wrote (e.g., extract into a method, move to a new class)</li><li>Review: I track what the agent does</li><li>Do small things personally: I keep my IDE open and do anything that&#x2019;s a few lines to change by hand, so I stay aware of the codebase</li></ul><p>I keep hearing the same from other engineers: &#x201C;mandating&#x201D; engineering practices like having the agent pass all tests before continuing leads to better results. This is unsurprising and it&#x2019;s why these practices are getting popular. AI agents are non-deterministic and to some extent unreliable; these practices make them a lot more reliable and usable.</p><hr><p>This was one out of the five topics covered in <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">The Pulse #149</a>. 
The full edition additionally covers:</p><ul><li><strong>ACP protocol.</strong> A new protocol built by the Zed team, which tries to make it easier to build AI tooling for IDEs than the MCP protocol allows</li><li><strong>AI security tooling works surprisingly well?</strong> AI-powered security tools seem good at identifying security flaws in mature open source projects</li><li><strong>Is AI the only engine of US economic growth?</strong> Forty percent of US GDP growth this year is based on AI-related spend, while 60% of venture capital goes into AI. Hopefully, it won&#x2019;t end up as a bubble which bursts like in 2001</li><li><strong>Comparing interviews at 8 large tech companies.</strong> <a href="https://www.linkedin.com/in/puneet-patwari/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Puneet Patwari</a> applied to 8 major tech companies, and received 6 offers. He compares his interview experiences at Meta, Amazon, Uber, and 5 other workplaces</li></ul><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-149-new-trend-programming?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Read the full issue here</a>, and check out <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-151?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">today&#x2019;s The Pulse here</a>.</p>]]></content:encoded></item><item><title><![CDATA[What caused the large AWS outage?]]></title><description><![CDATA[On Monday, a major AWS outage hit thousands of sites & apps, and even a Premier League soccer game. 
An overview of what caused this high-profile, global outage]]></description><link>https://blog.pragmaticengineer.com/aws-outage-us-east-1/</link><guid isPermaLink="false">68fa5697b1a28700018a70af</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 23 Oct 2025 16:26:27 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from today&#x2019;s deepdive into </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-aws-takes-down-a-good-part?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>the recent AWS outage</em></a><em>. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><em>subscribe here</em></a><em>.</em></p><p>Monday was an interesting day: Signal stopped working, Slack and Zoom had issues, and most Amazon services were also down, together with thousands of websites and apps, across the globe. The cause was a 14-hour-long AWS outage in the us-east-1 region.</p><p>Today, we look into what caused this outage.</p><p>To its credit, AWS posted continuous updates throughout the outage. Three days after the incident, they <a href="https://aws.amazon.com/message/101925/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">released</a> a detailed postmortem &#x2013; much faster than <a href="https://newsletter.pragmaticengineer.com/p/three-cloud-providers-three-outages?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">the 4 months it took </a>in 2023 after a similarly large event.</p><p><strong>The latest outage was caused by DynamoDB&#x2019;s DNS failure. 
</strong>DynamoDB is a serverless NoSQL database built for durability and high availability, which promises <strong>99.99%</strong> uptime as its service level agreement (SLA), when set to multi-availability zone (AZ) replication. Basically, when operated in a single region, DynamoDB promises &#x2013; and delivers! &#x2013; very high uptime with low latency. Even better, while the default consistency model for DynamoDB is <a href="https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadConsistency.html?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">eventual consistency</a> (reads might not yet reflect the actual status), reads can also be set to use <a href="https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadConsistency.html?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">strong consistency</a> (guaranteed to return the actual status).</p><p>All these traits make DynamoDB an attractive choice for data storage for pretty much any application, and many of AWS&#x2019;s own services also depend heavily on DynamoDB. Plus, DynamoDB has a track record of delivering on its SLA promises, so the question is often not <em>why</em> to use DynamoDB, but rather, <em>why not to</em> use this highly reliable data storage. Potential reasons for not using it include complex querying, complex data models, or storing large amounts of data when storage costs are not worth it compared to other bulk storage solutions.</p><p>In this outage, DynamoDB went down, and the <strong>dynamodb.us-east-1.amazonaws.com</strong> address returned an empty DNS record. To every service &#x2013; external to AWS or AWS internal &#x2013; it seemed like DynamoDB in this AWS region had disappeared off the face of the earth! 
To understand what happened, we need to look into DynamoDB DNS management.</p><h3 id="how-dynamodb-dns-management-happens">How DynamoDB DNS management happens</h3><p>Here&#x2019;s an overview:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/10/Screenshot-2025-10-23-at-14.30.07.png" class="kg-image" alt loading="lazy" width="1802" height="1408" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/10/Screenshot-2025-10-23-at-14.30.07.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/10/Screenshot-2025-10-23-at-14.30.07.png 1000w, https://blog.pragmaticengineer.com/content/images/size/w1600/2025/10/Screenshot-2025-10-23-at-14.30.07.png 1600w, https://blog.pragmaticengineer.com/content/images/2025/10/Screenshot-2025-10-23-at-14.30.07.png 1802w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">DNS management in DynamoDB. Can you see where a race condition could occur?</span></figcaption></figure><p>How it works:</p><ul><li><strong>DNS Planner:</strong> this service monitors load balancer (LB) health. As you can imagine, DynamoDB runs at a massive scale, and LBs can easily get overloaded or under-utilized. When overloading happens, new LBs need to be added, and when there&#x2019;s underutilization, they need to be removed. The DNS Planner creates DNS plans. Each DNS plan is a set of LBs, with a weight for each that determines how much traffic that LB receives.</li><li><strong>DNS Enactor: </strong>the service responsible for updating the routes in Amazon&#x2019;s DNS service called Route 53. For resiliency, there is one DNS Enactor running in each availability zone (AZ). In us-east-1, there are 3 AZs, and 3x DNS Enactor instances.</li><li><strong>Race conditions are expected:</strong> with 3x parallel DNS Enactors working simultaneously, race conditions are expected. 
The system deals with this by assuming eventual consistency: even if a DNS Enactor updates Route 53 with an &#x201C;old&#x201D; plan, DNS plans are consistent with one another. Plus, updating happens quickly, and DNS Enactors only use the latest plans from the DNS Planner.</li></ul><h3 id="dynamodb-down-for-3-hours">DynamoDB down for 3 hours</h3><p>Several independent events combined to knock DynamoDB&#x2019;s DNS offline:</p><ol><li><strong>High delays on DNS Enactor #1. </strong>Updating DNS took unusually long for one DNS Enactor for some reason. Usually, these updates are rapid, but they weren&#x2019;t on 20 October.</li><li><strong>DNS Planner turns up the pace in churning out DNS plans. </strong>Just as DNS updates turned slow, the DNS planner started to produce new plans at a much higher pace than before.</li><li><strong>DNS Enactor #2 rapidly processes DNS plans. </strong>While DNS Enactor #1 was applying DNS plans at a snail&#x2019;s pace, DNS Enactor #2 was storming through them. As soon as it finished writing these plans to Route 53, it went back to DNS Planner and deleted the old plans.</li></ol><p>These three things pushed the system into an inconsistent state and emptied out DynamoDB DNS:</p><ol><li><strong>DNS Enactor #1 unknowingly uses an old DNS plan. </strong>When DNS Enactor #2 finished applying the newest DNS plan, it went back to DNS Planner and deleted all older plans. Doing so <em>should</em> have meant that other DNS Enactors did not use old plans; but remember, DNS Enactor #1 was slow and still processing through an old plan! As a result, the check by DNS Enactor #1 was stale.</li><li><strong>DNS Enactor #2 detects the old plan being used and clears DNS records. </strong>DNS Enactors have another cleanup check: if they detect an old plan being used, they delete the plan itself. Deleting a plan means removing all IP addresses for the regional endpoints in Route 53. 
So DNS Enactor #2 turned the dynamodb.us-east-1.amazonaws.com DNS empty!</li></ol><p>DynamoDB going down also took down all AWS services dependent on the us-east-1 DynamoDB services. From the AWS postmortem:</p><blockquote>&#x201C;All systems needing to connect to the DynamoDB service in the N. Virginia (us-east-1) Region via the public endpoint immediately began experiencing DNS failures and failed to connect to DynamoDB. This included customer traffic as well as traffic from internal AWS services that rely on DynamoDB. Customers with DynamoDB global tables were able to successfully connect to and issue requests against their replica tables in other Regions, but experienced prolonged replication lag to and from the replica tables in the N. Virginia (us-east-1) Region.&#x201D;</blockquote><p>The DynamoDB outage lasted around 3 hours; I can only imagine AWS engineers scratching their heads and wondering how the DNS records were emptied. Eventually, engineers manually intervened, and brought back DynamoDB. It&#x2019;s worth remembering that bringing up DynamoDB might have included avoiding the <a href="https://newsletter.pragmaticengineer.com/i/168964142/mitigating-the-outage?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">thundering herd issue</a> that is typical of restarting large services.</p><p><strong>To be honest, I&#x2019;m sensing key details were omitted from the postmortem. </strong>Things unmentioned which are key to understanding what really happened:</p><ul><li>Why did DNS Enactor #1 slow down in updating DNS, compared to DNS Enactor #2?</li><li>Why did DNS Enactor #2 delete all DNS records as part of cleanup? This really made no sense, and it feels like there&#x2019;s an underlying reason.</li><li>The race condition of one DNS Enactor being well ahead of others seems to be easy enough to forecast. Was this the first time it happened? 
If not, what happened after previous, similar incidents?</li><li>Most pressingly, how will the team fix this vulnerability which could happen anytime in the future?</li></ul><h3 id="amazon-ec2-down-for-12-more-hours">Amazon EC2 down for 12 more hours</h3><p>With DynamoDB restored, the pain was still not over for AWS. In fact, Amazon EC2&#x2019;s problems just got worse. To understand what happened, we need to understand how EC2 works:</p><ul><li><strong>DropletWorkflow Manager (DWFM) </strong>is the subsystem that manages physical servers for EC2. Think of it as the &#x201C;Kubernetes for EC2.&#x201D; EC2 instances are called &#x201C;droplets.&#x201D;</li><li><strong>Lease</strong>: DropletWorkflow Manager tracks the lease for each droplet (server) so it knows if and when a server is occupied, or can be allocated to an EC2 customer. The DWFM does a status check of the server every few minutes to determine its state.</li></ul><p>State check results are stored in DynamoDB, so the DynamoDB outage caused problems:</p><p><strong>1. Leases started to time out. </strong>With state check results not returning due to the DynamoDB outage, the DropletWorkflow Manager started to mark droplets as not available.</p><p><strong>2. Insufficient capacity errors on EC2: </strong>with most leases timed out, DWFM started to return &#x201C;insufficient capacity error&#x201D; messages to EC2 customers. It <em>thought</em> servers were not available, after all.</p><p><strong>3. DynamoDB&#x2019;s return didn&#x2019;t help: </strong>when DynamoDB came back online, it should have been possible to update the status of droplets. But that didn&#x2019;t happen. From the postmortem:</p><blockquote>&#x201C;Due to the large number of droplets, efforts to establish new droplet leases took long enough that the work could not be completed before they timed out. Additional work was queued to reattempt establishing the droplet lease. 
At this point, DWFM had entered a state of <strong>congestive collapse</strong> and was unable to make forward progress in recovering droplet leases.&#x201D;</blockquote><p>It took engineers 3 more hours to come up with mitigations to get EC2 instance allocation working again.</p><p><strong>Network propagation errors took another 5 hours to fix.</strong> Even when EC2 was looking healthy on the inside, instances could not communicate with the outside world, and congestion built up inside a system called Network Manager. Also from the postmortem:</p><blockquote>&#x201C;Network Manager started to experience increased latencies in network propagation times as it worked to process the backlog of network state changes. While new EC2 instances could be launched successfully, they would not have the necessary network connectivity due to the delays in network state propagation. Engineers worked to reduce the load on Network Manager to address network configuration propagation times and took action to accelerate recovery. By 10:36 AM PT [11 hours after the start of the outage], network configuration propagation times had returned to normal levels, and new EC2 instance launches were once again operating normally.&#x201D;</blockquote><p><strong>Final cleanup took another 3 hours. </strong>After all 3 systems were fixed &#x2013; DynamoDB, EC2&#x2019;s DropletWorkflow Manager and Network Manager &#x2013; there was a bit of cleanup left to do:</p><blockquote>&#x201C;The final step towards EC2 recovery was to fully remove the request throttles that had been put in place to reduce the load on the various EC2 subsystems. As API calls and new EC2 instance launch requests stabilized, at 11:23 AM PDT [12 hours after the outage started] our engineers began relaxing request throttles as they worked towards full recovery. 
At 1:50 PM [14 hours after the outage started], all EC2 APIs and new EC2 instance launches were operating normally.&#x201D;</blockquote><p>Phew &#x2013; that was a lot of work! Props to the AWS team for working through it all in what must have been a stressful night&#x2019;s work. You can <a href="https://aws.amazon.com/message/101925/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">read the full postmortem here</a>, which details the impact on other services like the Network Load Balancer (NLB), Lambda functions, Amazon Elastic Container Service (<strong>ECS</strong>), Elastic Kubernetes Service (<strong>EKS</strong>), <strong>Fargate</strong>, Amazon Connect,<strong> AWS Security Token Service</strong>, and AWS Management Console.</p><hr><p>This was one out of four topics from today&#x2019;s The Pulse, analyzing this large AWS outage. The <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-aws-takes-down-a-good-part?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">full issue additionally covers:</a></p><ol><li><strong>Worldwide impact. </strong>From Ring cameras, Robinhood, Snapchat, and Duolingo, all the way to Substack &#x2013; sites and services went down in their thousands.</li><li><strong>Unexpected AWS dependencies. </strong>Status pages using Atlassian&#x2019;s Statuspage product could not be updated, Eight Sleep mattresses were effectively bricked for users, Postman was unusable, UK taxpayers couldn&#x2019;t access the HMRC portal, and a Premier League game was interrupted.</li><li><strong>Why such dependency on us-east-1?</strong> It feels like half of the internet is on us-east-1 for its low pricing and high capacity. 
Meanwhile, some AWS services are themselves dependent on this region.</li></ol><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-aws-takes-down-a-good-part?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow"><strong>Read the full article here.</strong></a></p>]]></content:encoded></item><item><title><![CDATA[Creative ways to fund open source projects]]></title><description><![CDATA[“Open source maintenance fee” trialed by Wix Toolset, while the creator of uv offers paid, enterprise-only features for larger companies.
]]></description><link>https://blog.pragmaticengineer.com/creative-ways-to-fund-open-source-projects/</link><guid isPermaLink="false">68a74cca66ded000013c61da</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 21 Aug 2025 16:46:22 GMT</pubDate><content:encoded><![CDATA[<p>Every engineer and tech company likes open source projects, but few are willing to pay for them. This is fine for a while &#x2013; enthusiasm to support a project does go a long way &#x2013; but after some time, maintainers of popular projects could burn out or just stop supporting the project. Recently, I came across two creative approaches of maintainers generating revenue for open source projects that also see significant commercial usage.</p><h3 id="open-source-maintenance-fee-an-interesting-experiment">Open source maintenance fee: an interesting experiment</h3><p><a href="https://github.com/wixtoolset?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Wix Toolset</a> is a set of open source tools for creating Windows installers. As with most open source projects, the long-term financial sustainability of the project was uncertain: who would pay for core maintainers to spend time on this project, rather than on their work or other side projects? A project like Wix needs to be continuously updated to support new versions of Windows, and security and other issues must be responded to by maintainers, who also review pull requests.</p><p>The team decided to adopt a relatively new concept called the <a href="https://opensourcemaintenancefee.org/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Open Source Maintenance Fee</a>. How it works:</p><ul><li><strong>The code</strong> is freely available under an open source license</li><li><strong>Support: paid</strong>. Opening issues or commenting on them, or participating in releases or pull requests requires paying the fee</li><li><strong>Binary releases: paid</strong>. 
To download any binary release of the project, one needs to pay the fee. <em>But the binary can be compiled from the code - one just has to build it themselves.</em></li></ul><p>Basically, if an individual or business uses Wix Toolset as part of its revenue-generating activity, and someone from that company wants to ask questions or submit changes, or download a pre-built binary, then it&#x2019;s necessary to become a sponsor and pay:</p><ul><li>$10/month when working at a company of up to 20 people</li><li>$40/month for companies of 20-100 employees</li><li>$60/month for companies with 100+ employees</li></ul><p>So far, the fee seems to be working:<strong> </strong>The project currently has 64 sponsors, including Microsoft (which needs to pay the $60/month fee). This could be generating anything from $640/month up to a few thousand dollars per month, which can be used to cover costs, such as compensating core contributors. As maintainer Rob Mensching <a href="https://x.com/robmen/status/1958562695881568595?ref=blog.pragmaticengineer.com"><u>elaborated</u></a>:</p><p>&#x201C;The maintenance fee is intended to go to those doing maintenance. In my project, there are two of us working to keep the project running. We have contributors who do stuff they find interesting, but none of them do &#x2018;boring but important stuff&#x2019;.</p><p>So far, it is working.&#x201D;</p><p>What I like about this structure is that it removes the burden on maintainers who invest their time, and turns it into an incentive for commercial users to pay for their time.</p><p><strong>The idea of the fee came after observing the XZ Utils supply chain attack incident. 
</strong>As Rob shared on the <a href="https://robmensching.com/blog/posts/2025/02/26/introducing-the-open-source-maintenance-fee/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">backstory of the Open Source Maintenance Fee</a>:</p><blockquote>&#x201C;I touched on it in the video but I felt compelled to do something about the sustainability problems facing Open Source Maintainers. My personal frustration dealing with entitled consumers for the last few years&#x2014;and watching other maintainers go through the same thing&#x2014;hit a breaking point with the XZ Utils incident.<br><br>Two things happened in that incident. First, we saw a maintainer&#x2014;who was vulnerable due to the lack of project sustainability&#x2014;manipulated by attackers to sneak a backdoor into Linux&#x2026; and then nothing changed. Second, in my viral tweet-thread everyone agreed that something should be done. I&#x2019;ve never seen that many people on the internet agree about anything.<br><br>The cognitive dissonance that something should be done but nothing was done, fundamentally bothered me. I couldn&#x2019;t stop thinking about the problem. It wasn&#x2019;t until early July that the Maintenance Fee idea really started to come together.&#x201D;</blockquote><p>And you know what - I agree! As an OSS maintainer, it can feel that users <em>are</em> entitled: demanding a lot, but offering nothing in return. More than a decade ago, I also burnt out maintaining a Windows Phone library called <a href="https://github.com/Adrotator/AdrotatorV2?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">AdRotator</a> that helped users generate revenue for their apps. I released my library thinking it would help others make more money, and that everyone would sort out their own issues. I did not expect to see demands from users that I implement new features, or fix issues they were seeing. 
Back then, I was relieved to hand over maintenance to a fellow contributor who had far more energy to support users with questions and feature requests.</p><p>I <em>naively</em> assumed open source was about offering the source as open, and everyone taking it, and sorting out their issues by themselves. I was not prepared that by releasing something as open source, I&#x2019;d get a bunch of maintenance burden to have to deal with &#x2014; something I never signed up for or asked for.</p><p><strong>I wonder if GitHub&#x2019;s popularity means more projects will implement a similar fee. </strong>GitHub has become the de facto place for open source projects to live: its simple, intuitive UI makes it very easy to open an issue and raise a pull request.</p><p>However, open source maintainers of more popular projects face a steady stream of often low-quality issues raised, and pull requests which aren&#x2019;t ready to be merged. Meanwhile, users expect responses and can get frustrated when issues are not quickly resolved.</p><p>For open source projects with significant commercial usage, an open source maintenance fee could be a viable way to reduce this kind of noise, and generate some revenue for the time maintainers spend on them. For example, the Wix Toolset project currently has <a href="https://github.com/wixtoolset/issues/issues?ref=blog.pragmaticengineer.com"><u>775 open issues</u></a> (!) and another 6,500 <a href="https://github.com/wixtoolset/issues/issues?q=is%3Aissue%20state%3Aclosed&amp;ref=blog.pragmaticengineer.com"><u>closed ones</u></a>. Just keeping on top of the issues raised looks like a lot of work!</p><p>For projects that do not use GitHub, the barrier to entry for creating issues and pull requests is a lot higher. For example, Linux still runs on email mailing lists, and patches need to be sent via email. To contribute to Linux, a lot of upfront work is necessary first. 
<em>We cover more about contributing to Linux in the episode of the Pragmatic Engineer Podcast, </em><a href="https://newsletter.pragmaticengineer.com/p/how-linux-is-built-with-greg-kroah?ref=blog.pragmaticengineer.com"><em><u>How Linux is built with Greg Kroah-Hartman</u></em></a><em>.</em></p><p><strong>A fee like this could help keep open source projects as free and open source software (FOSS).</strong> Someone might argue that the existence of a fee goes against the concept of &#x201C;free&#x201D;. However, the &#x201C;free&#x201D; in FOSS stands for the freedom to:</p><ol><li>Run the program as you wish for any purpose (you can do this!)</li><li>Study how the program works and modify it as you wish (you can do this; just get the code and change it, or fork it!)</li><li>Redistribute copies to help others (you can do this)</li><li>Distribute copies of modified versions (again, you can do this)</li></ol><p>FOSS shouldn&#x2019;t mean that someone works for free in replying to issues and responding to change requests. In the case of Wix Toolset, maintainer Rob Mensching <a href="https://x.com/robmen/status/1958565865131245953?ref=blog.pragmaticengineer.com"><u>worked hard</u></a> with lawyers to ensure that this Open Source Maintenance Fee is fully compliant with FOSS principles and expectations. As Rob <a href="https://x.com/robmen/status/1958565865131245953?ref=blog.pragmaticengineer.com"><u>put it</u></a>:</p><p>&#x201C;OSS does not mean that everything is available for no cost.&#x201D;</p><p><strong>The more open source projects on GitHub are sustainable, the better it will be for the tech ecosystem. 
</strong>An open source project that is sustainable long-term &#x2013; with maintainers having a reasonable workload, keeping up with comments, and releasing new versions when needed &#x2013; is a lot better than one that puts up no such barriers and leads maintainers to throw in the towel and quit.</p><p>Don&#x2019;t forget, any company or individual who does not want to pay this fee but <em>does</em> want to make modifications to the code can go ahead and do it! They just need to create their fork and then can go ahead with the modification. The cost of keeping this fork up-to-date with the project then becomes theirs to pay and it could be more expensive than the $10-60/month fee asked for by the project.</p><p><em>When introducing the fee, the Wix Toolset project had a long discussion; </em><a href="https://github.com/wixtoolset/issues/issues/8974?ref=blog.pragmaticengineer.com"><em><u>read the thread</u></em></a><em> for more details of the thinking behind this move.</em></p><h3 id="enterprise-only-features-how-uv%E2%80%99s-creator-plans-to-make-money">Enterprise-only features: how uv&#x2019;s creator plans to make money</h3><p>In the Python world, the <a href="https://docs.astral.sh/uv/?ref=blog.pragmaticengineer.com"><u>uv</u></a> package manager is surging in popularity: it is ultra-fast and 10-100x speedier than the formerly popular package manager, pip. Part of this speed is thanks to uv being written from scratch, in Rust, for performance. It does lots of clever things like parallel downloading and installing of packages, and maintaining an optimized module cache. uv is so useful that an AI startup told me that moving over to this package manager improved productivity more than any AI tool they trialed. 
From <a href="https://newsletter.pragmaticengineer.com/p/software-engineering-with-llms-in-2025?ref=blog.pragmaticengineer.com"><u>Software engineering with LLMs in 2025, reality check</u></a>:&#xA0;</p><blockquote>&#x201C;An interesting detail emerged when I asked how they would compare the impact of AI tools to other innovations in the field. This engineer said that for their domain, the impact of the uv project manager and ruff linter has been greater than AI tools, since uv made their development experience visibly faster!&#x201D;<br></blockquote><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXe8Uszan5rKrkh-WsB8Edea_VkYB9TnGHWltDpAlJIZCFDh03UCIPbL-FO5VCDUQcYWtqwsEPJJ0PBmsEIvLCOKNhjy23bJYUXClC43hWszlPpWT9qfCAbRh6X-WqLgG4SfGTssKA?key=7_rfa1Xu7ngymzvtx2QcHA" class="kg-image" alt loading="lazy" width="496" height="107"><figcaption><i><em class="italic" style="white-space: pre-wrap;">uv is a lot faster than other package managers. Source: </em></i><a href="https://docs.astral.sh/uv/?ref=blog.pragmaticengineer.com"><u><i><em class="italic underline" style="white-space: pre-wrap;">uv</em></i></u></a></figcaption></figure><p>uv is built by a startup called Astral that raised $4M in seed funding in 2023. With VC funding, Astral has to generate revenue, but how to do this with a free open source package manager?&#xA0;</p><p>We now have an answer:</p><p>Astral created a <em>private</em> package registry called <a href="https://astral.sh/pyx?ref=blog.pragmaticengineer.com"><u>pyx</u></a>, a paid, enterprise-focused package registry. It&#x2019;s like uv, but with additional features for security and GPU support. 
Astral has signed up companies like Ramp and Intercom as customers already.</p><p><strong>This approach of charging for enterprise features is clever</strong> and great news for the open source community: if Astral can get traction for pyx, they could have a business offering standout tools for free to individual developers, and also more advanced paid versions with enterprise features for companies. We have covered the <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-64?ref=blog.pragmaticengineer.com"><u>pressure on commercial open source companies to make money</u></a>, so fingers crossed that Astral succeeds!</p><hr><p>This was an excerpt from <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-143?ref=blog.pragmaticengineer.com" rel="noreferrer">The Pulse #143</a>. The full issue additionally covers:</p><ol><li><strong>Industry pulse. </strong>Microsoft&#x2019;s US compensation bands revealed, a reader gets in touch about Forward Deployed Engineers, why OpenAI has released an open model, Canva trials AI coding tools for interviews, and more.</li><li><strong>Tailwind CSS team burnt by LLM&#x2019;s inconsistent code generation. </strong>A project estimated to take 6 weeks without AI, took 12 weeks with Claude Code.</li><li><strong>Meta and OpenAI in talent &amp; compensation war. </strong>15 years after aggressively poaching engineers from Google, Meta is doing the same to OpenAI, which is responding with six and seven-figure retainer bonuses.</li><li><strong>What people think improves AI applications vs what actually does.</strong> Instead of keeping up with the latest AI headlines, why not talk to users? 
Author Chip Huyen has practical tips for AI engineers.</li></ol><p> <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-143?ref=blog.pragmaticengineer.com" rel="noreferrer"><strong>Read the full issue here</strong></a></p>]]></content:encoded></item><item><title><![CDATA[New trend: extreme hours at AI startups]]></title><description><![CDATA[Pulling 80+ hour work weeks – including weekends – is becoming the norm across AI startups, and is unlikely to stop while AI is so hot.]]></description><link>https://blog.pragmaticengineer.com/new-trend-extreme-hours-at-ai-startups/</link><guid isPermaLink="false">689e0d3f66ded000013c6156</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 14 Aug 2025 16:35:32 GMT</pubDate><content:encoded><![CDATA[<p>Two months after publishing this article, The Wall Street Journal covered the same trend in its article <a href="https://www.wsj.com/tech/ai/ai-race-tech-workers-schedule-1ea9a116?ref=blog.pragmaticengineer.com" rel="noreferrer">AI workers are putting in 100-hour workweeks to win the new tech arms race</a>. If you&apos;d like to keep up-to-date with the tech industry &#x2013; and stay months ahead of mainstream media &#x2013; <a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com" rel="noreferrer">subscribe to The Pragmatic Engineer</a>.</p><hr><p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-142?ref=blog.pragmaticengineer.com" rel="noreferrer"><em><u>this week&#x2019;s The Pulse issue</u></em></a><em>. 
To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em><u>subscribe here</u></em></a><em>.</em></p><p>&#x201C;996&#x201D; stands for &#x201C;from 9am to 9pm, 6 days a week&#x201D;, and used to be a common work pattern at Chinese tech companies until it <a href="https://www.china-briefing.com/news/996-is-ruled-illegal-understanding-chinas-changing-labor-system/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">was officially banned</a> in 2021. Such extreme working hours have long been rejected in the US. They are even outlawed in Europe because excessive hours tend to lead to burnout and other health issues, longer term.</p><p>Despite that, more AI startups are adopting something similar to the 996 work pattern, including Cognition, which expects staff to put in 80+ hours per week. Indeed, the CEO, Scott Wu, was unapologetic about the company&#x2019;s hardcore culture in a post he <a href="https://x.com/ScottWu46/status/1952776198947520659?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">shared</a>:</p><blockquote>&#x201C;People have asked about our culture and recent employee communications. Cognition has an extreme performance culture, and we&#x2019;re upfront about this in hiring so there are no surprises later.<br><br>We routinely are at the office through the weekend and do some of our best work late into the night. 
Many of us literally live where we work.<br><br>We know that people who joined Windsurf didn&#x2019;t expect to join Cognition and while we&#x2019;re proud of how we work, we understand it&#x2019;s not for everyone&#x201D;.</blockquote><p>There are several other cases of AI startups mandating grueling hours for workers:</p><ul><li><strong>Lovable</strong>: job descriptions by the company <a href="https://x.com/antonosika/status/1878525525289009643?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">detail</a> &#x201C;Long hours, high pace. Candidates must thrive under high urgency, with AGI timelines approaching.&#x201D;</li><li><strong>Replit</strong>: CEO, Amjad Masad, <a href="https://x.com/amasad/status/1884511265672028234?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">posted photos</a> of the whole team grinding in the office at midnight, preparing to ship something big.</li><li><strong>xAI:</strong> Zeeshan Pate, an engineer at Elon Musk&#x2019;s AI startup, <a href="https://x.com/zeeshanp_/status/1954838721213382700?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">shared</a> that the team was &#x201C;grinding in the office day and night&#x201D;.</li><li><strong>CodeRabbit</strong>: just today, the team was <a href="https://x.com/harjotsgill/status/1955921756780421286?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">still shipping</a> at 2:30am in the San Francisco headquarters</li><li><strong>Icon</strong> (ad maker): the founder and CEO <a href="https://x.com/kennandavison/status/1899505804677677188?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">shared</a> that the business &#x201C;only hires [the] top 0.01% engineers with no life&#x201D;, and expects them to work all week: &#x201C;why do 6 [days of work] when you can do 7&#x201D;.</li><li><strong>Google&#x2019;s AI unit</strong>: Google cofounder Sergey Brin <a 
href="https://nypost.com/2025/02/28/business/googles-sergey-brin-says-60-hours-per-week-in-office-is-sweet-spot-of-productivity-as-ai-race-heats-up/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">told</a> staff in the tech giant&#x2019;s AI unit: &#x201C;60 hours a week is the sweet spot of productivity&#x201D;.</li></ul><p><strong>These extreme hours are justified by a sprint to achieve Artificial General Intelligence (AGI) in just a matter of months. </strong>That&#x2019;s because plenty of AI professionals believe that when this point is reached, it will be &#x201C;game over&#x201D; for most companies in the segment, with a new, solidified, status quo in place, and AGI being able to improve itself with more resources. From then on, companies with AGI will dominate the industry. This is the incentive for using every means possible (including employees&#x2019; labor) to get to AGI &#x2013; and quickly! <em>Personally, I don&#x2019;t buy this simplistic prediction about AGI and what might happen when or if it&#x2019;s reached, but it is the driving force behind the thinking of many founders with commercial pressures.</em></p><p>It&#x2019;s nearly three years since ChatGPT was released, and there are still no signs of AGI, even though LLMs continuously improve. But what about the exhausting work patterns that are meant to be in place for a few months; could they stay in place for years, and become standard?</p><p><strong>Asking or demanding staff to put in very long hours is a recipe for making the pace of work slow down, and for individuals to burn out. </strong>We live in an economic system in which companies try to &#x201C;extract&#x201D; as much as possible from employees across all industries, so why don&#x2019;t employers in other sectors also make staff work 80-hour weeks?</p><p>In many countries, regulations mandating sensible working conditions are one reason, and unions advocate for this. 
Another is that the downside of long working hours soon becomes visible:</p><ul><li>Staff take more time off for sickness</li><li>Productivity drops</li><li>More employees quit for places with shorter working weeks</li></ul><p>Plus, mandating long working hours automatically excludes many strong potential candidates who&#x2026;</p><ul><li>have family duties outside of working hours</li><li>live further from the office, and for whom the commute would be too long</li><li>prioritize personal, non-work, activities outside of core hours</li></ul><p>Of course, if you hire young professionals, burnout won&#x2019;t occur so fast &#x2013; and some people can work for years like this. Plus, if they don&#x2019;t have families or a busy social life, they may be comfortable with working what looks like &#x201C;crazy&#x201D; hours.</p><p><strong>Another powerful incentive for an AI startup to create a long-hours culture: the promise of generational wealth. </strong>Consider these two questions:</p><ol><li>Would you work 6 days a week at a startup, and pull 80+ hours per week indefinitely, for <em>less</em> than you&#x2019;d earn at a company with a 40-hour work week?</li><li>Would you work 6 days a week and put in 80+ hour weeks for 1-3 years, then walk away with $10M in compensation?</li></ol><p>The answer to #1 is likely &#x201C;no&#x201D;, but the answer to #2 is &#x201C;obviously yes&#x201D; for most people! And the incredible growth in the AI industry means that #2 seems achievable to many. Take Windsurf: just 10 months after launching the Windsurf IDE, 40 employees from the team were acquired by Google. The founders likely made hundreds of millions, and some engineers may have made $10M+ in compensation! That&#x2019;s not bad for a few years&#x2019; work!</p><p>Now, put yourself in the shoes of founders who are set to make not $10M, but a multiple of that. 
For them, it is <em>absolutely</em> worth putting in the long hours and pushing their team to do the same.</p><p>Based on that, we can expect founders to keep pushing staff to spend every waking moment at work, or thinking about work.</p><p><strong>While there&apos;s the promise of making it big with AI startups, these working hours will likely stay.</strong> Within AI startups, founders face pressure to ship fast or be out-executed by rivals. Speed and time-to-market are essential to win in AI, and the most obvious way to attempt to get things done faster is to push people to work more.</p><p>To see the importance of time to market, just look at startups like Magic.dev. A year ago, it raised $515M in funding and <a href="https://magic.dev/blog/100m-token-context-windows?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">claimed</a> it could support 100M token context windows, at a time when most models could not even support 100K. That meant Magic&#x2019;s model would be a 1,000x improvement on mainstream LLMs! However, a year later, Google&#x2019;s Gemini already <a href="https://developers.googleblog.com/en/new-features-for-the-gemini-api-and-google-ai-studio/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">supports</a> 2M tokens, and Claude has <a href="https://www.anthropic.com/news/1m-context?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">added</a> support for 1M tokens. So Magic&#x2019;s lead has been cut to 50-100x &#x2013; which is still considerable, but much reduced.</p><p>At the same time, there have been no updates from Magic, and this 100M context window model is not publicly available for use. 
If the company does not ship a public-facing product soon, they could see mainstream LLMs catch up in context window length, and thereby lose most of their potential customer base in the dev market.</p><p>Still, right now there&#x2019;s a massive business opportunity to make lots of money with AI products, and for early employees to create generational wealth via generous equity &#x2013; <em>if </em>their startup executes well, and is later acquired for a huge sum.</p><p><strong>But long hours alone don&#x2019;t guarantee success, and there are signs of this. </strong>Cognition is proud of its &#x201C;extreme performance culture&#x201D;, but its office culture hasn&#x2019;t quite led to business success: Devin was one of the least-referenced AI tools in <a href="https://newsletter.pragmaticengineer.com/p/the-pragmatic-engineer-2025-survey?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">The Pragmatic Engineer 2025 survey</a>.</p><p>Another sign that grueling hours don&#x2019;t automatically generate success is the ad maker, Icon. Despite demanding 7-day working weeks from staff, the company seems to be pivoting to <a href="https://x.com/JakubSzunyogh/status/1950912600713171090?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">become just another advertising agency</a>, offering to create unlimited versions of ads for $1,000/month. Ad agencies are pretty good lifestyle businesses, but rarely the kind of high-growth ventures valued in the billions!</p><p>I expect that things will <em>eventually</em> return to normal in a few years&apos; time, when every tech company will also be an &#x201C;AI company&#x201D; and it will be business as usual. But for now, the fear of missing out (FOMO) is so strong across the industry that very long workweeks look set to spread.</p><p><strong>Everyone&#x2019;s situation is different, so figure out if you want or need to work extreme hours. 
</strong>But there will be more of a push from AI startups to weed out applicants who are resistant to very long work weeks. Accepting such a position can be an amazing career boost; yes, long hours have many downsides, but you &#x201C;gel&#x201D; better with colleagues, and relationships forged under pressure can last decades. Of course, all the stress can also lead to burnout and health issues.</p><p>In the end, just know that this trend is real and likely to stick around. If your workplace employs a more &#x201C;normal&#x201D; work pattern, it&#x2019;s worth knowing that this is not necessarily something you can take for granted across tech!</p><p><strong>Long hours are not a new thing in tech, and could just as much be about excitement about LLMs.</strong> A now-retired software engineer told me how the above summary brought back memories of doing pretty much the same, 50 years ago:</p><blockquote>&#x201C;Thank you for that brief trip down memory lane. Of course this, current generation is working exhausting hours, this technology is new and exciting.<br><br>My generation did the same thing with computers back in the 1970s....<br><br>Storing cases of beer in the cooling system beneath the raised tiles.<br><br>Eating burgers at 7:00 am having pulled an all nighter after system crashes.<br><br>I suppose it is all new to you youngers!&#x201D;</blockquote><p><em>This was one out of four topics from this week&apos;s The Pulse issue. </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-142?ref=blog.pragmaticengineer.com" rel="noreferrer"><em>Read the full article here.</em></a></p>]]></content:encoded></item><item><title><![CDATA[Cursor makes developers less effective?]]></title><description><![CDATA[A study into the workflows of experienced developers found that devs who use Cursor for bugfixes are around 19% slower than devs who use no AI tools at all. 
One possible takeaway is that AI tools can be harder work than we’re led to believe.]]></description><link>https://blog.pragmaticengineer.com/cursor-makes-developers-less-effective/</link><guid isPermaLink="false">68827513b0013900017c5097</guid><dc:creator><![CDATA[Gergely Orosz]]></dc:creator><pubDate>Thu, 24 Jul 2025 18:09:09 GMT</pubDate><content:encoded><![CDATA[<p><em>Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover Big Tech and startups through the lens of senior engineers and engineering leaders. Today, we cover one out of four topics from </em><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-140?ref=blog.pragmaticengineer.com"><em><u>last week&#x2019;s The Pulse</u></em></a><em> issue. Full subscribers received the below article seven days ago. To get articles like this in your inbox, every week, </em><a href="https://newsletter.pragmaticengineer.com/about?ref=blog.pragmaticengineer.com"><em><u>subscribe here</u></em></a><em>.</em></p><p><em>Many subscribers expense this newsletter to their learning and development budget. If you have such a budget, here&#x2019;s</em><a href="https://blog.pragmaticengineer.com/request-to-expense-the-pragmatic-engineer-newsletter/"><em><u> an email you could send to your manager</u></em></a><em>.</em></p><hr><p>An interesting study has been published by the nonprofit org, Model Evaluation and Threat Research (METR). They recruited 16 experienced developers who worked on large open source repositories, to fix 136 real issues, for pay of $150/hour. Some devs were assigned AI tools to use, and others were not. The study recorded devs&#x2019; screens, and then examined and analyzed 146 hours of footage. 
The <a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">takeaway</a>:</p><blockquote>&#x201C;<strong>Surprisingly, we find that when developers use AI tools, they take 19% longer than without. AI makes them slower</strong>. (...) This gap between perception and reality is striking: developers expected AI to speed them up by 24%, and even after experiencing the slowdown, they still believed AI had sped them up by 20%.&#x201D;</blockquote><p>This result is <em>very</em> surprising! But what is going on? Looking closely at the <a href="https://arxiv.org/abs/2507.09089?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">research paper</a>:</p><p>The research is about Cursor&#x2019;s impact on developer productivity. The AI tool of choice for pretty much all participants was Cursor, using Sonnet 3.5 or 3.7. A total of 44% of developers had never used Cursor before, and most others had used it for up to 50 hours.</p><p><strong>Those using AI spent less time on coding to complete the work &#x2013; but took more time, overall. </strong>They also spent less time on researching and testing. But they took longer on prompting, waiting on the AI, reviewing its output, and on &#x201C;IDE overhead&#x201D; than those not using AI. 
In the end, additional time spent with the AI wiped out the time it saved on coding, research, and testing, the study found.</p><p>It&#x2019;s worth pointing out that this finding applies to all AI tools, and not only to Cursor, which just happens to be the tool chosen for this study.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/07/image-1.png" class="kg-image" alt loading="lazy" width="1456" height="760" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/07/image-1.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/07/image-1.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/07/image-1.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Using AI meant less time spent coding, but the work took longer, overall. Source: </em></i><a href="https://arxiv.org/pdf/2507.09089?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">METR</em></i></a></figcaption></figure><p><strong>Developers are overoptimistic in their estimates about AI&#x2019;s productivity impact </strong>&#x2013; initially, at least. From the paper:</p><blockquote>&#x201C;Both experts and developers drastically overestimate the usefulness of AI on developer productivity, even after they have spent many hours using the tools. This underscores the importance of conducting field experiments with robust outcome measures, compared to relying solely on expert forecasts or developer surveys.&#x201D;</blockquote><p><strong>The one dev who had used Cursor for 50+ hours saw a <em>lot</em> of speedup! </strong>In the study, a single developer had previously used Cursor for more than 50 hours. This dev saw a very impressive 38% increase in speed. 
Then again, a sample size of one is not very representative of a group of 16:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://blog.pragmaticengineer.com/content/images/2025/07/image-2.png" class="kg-image" alt loading="lazy" width="1456" height="767" srcset="https://blog.pragmaticengineer.com/content/images/size/w600/2025/07/image-2.png 600w, https://blog.pragmaticengineer.com/content/images/size/w1000/2025/07/image-2.png 1000w, https://blog.pragmaticengineer.com/content/images/2025/07/image-2.png 1456w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">One developer with 50+ hours of experience on Cursor completed work much faster. Source: </em></i><a href="https://arxiv.org/pdf/2507.09089?ref=blog.pragmaticengineer.com" target="_blank" rel="noopener noreferrer nofollow"><i><em class="italic" style="white-space: pre-wrap;">METR</em></i></a></figcaption></figure><p>Software engineer Simon Willison &#x2013; whom I consider an unbiased expert on AI dev tools &#x2013; <a href="https://news.ycombinator.com/item?id=44523442&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">interprets</a> the survey like this:</p><blockquote>&#x201C;My intuition here is that this study mainly demonstrated that the learning curve of AI-assisted development is high enough that asking developers to bake it into their existing workflows reduces their performance while they climb that learning curve.&#x201D;</blockquote><p>Indeed, he made a similar point <a href="https://newsletter.pragmaticengineer.com/p/ai-tools-for-software-engineers-simon-willison?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">on an episode of the Pragmatic Engineer podcast</a>: &#x201C;you have to put in so much effort to learn, to explore and experiment, and learn how to use it. 
And there&apos;s no guidance.&#x201D;</p><p>In <a href="https://newsletter.pragmaticengineer.com/p/ai-tooling-2024?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">research on AI tools by this publication</a>, based on input from circa 200 software engineers, we found supporting evidence of that: those who hadn&#x2019;t used AI tools for longer than 6 months were more likely to have a negative perception of them. Very common feedback from engineers who didn&#x2019;t use AI tooling was that they&#x2019;d tried it, but it didn&#x2019;t meet expectations, so they stopped.</p><p><strong>The engineer who saw a 38% &#x201C;speed-up&#x201D; versus non-AI devs has an interesting take. </strong>That lone engineer with 50+ hours of Cursor experience is PhD student, <a href="https://scholar.google.com/citations?user=GDm6BIAAAAAJ&amp;hl=en&amp;ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">Quentin Anthony</a>. Here&#x2019;s what he <a href="https://x.com/QuentinAnthon15/status/1943948791775998069?ref=blog.pragmaticengineer.com" rel="noopener noreferrer nofollow">says</a> about the study, and how AI tools impact developer efficiency:</p><blockquote>&#x201C;<strong>1. AI speedup is very weakly correlated to anyone&apos;s ability as a dev.</strong> All the devs in this study are very good. I think it has more to do with falling into failure modes, both in the LLM&apos;s ability and the human&apos;s workflow. I work with a ton of amazing pretraining devs, and I think people face many of the same problems.<br><br>We like to say that LLMs are tools, but treat them more like a magic bullet.<br><br>Literally any dev can attest to the satisfaction of finally debugging a thorny issue. LLMs are a big dopamine shortcut button that may one-shot your problem. Do you keep pressing the button that has a 1% chance of fixing everything? It&apos;s a lot more enjoyable than the grueling alternative, at least to me.<br><br><strong>2. 
LLMs today have super spiky capability distributions.</strong> I think this has more to do with:<br>- what coding tasks we have lots of clean data for<br>- what benchmarks/evals LLM labs are using to measure success.<br><br>As an example, LLMs are all horrible at low-level systems code (GPU kernels, parallelism/communication, etc). This is because their code data is relatively rare, and evaluating model capabilities is hard (I discuss this in more detail <a href="https://github.com/Quentin-Anthony/torch-profiling-tutorial?tab=readme-ov-file&amp;ref=blog.pragmaticengineer.com#footnote-my-own-opinionated-take-on-measuring-throughput" rel="noopener noreferrer nofollow">here</a>).<br><br>Since these tasks are a large part of what I do as a pretraining dev, I know what parts of my work are amenable to LLMs (writing tests, understanding unfamiliar code, etc) and which are not (writing kernels, understanding communication synchronization semantics, etc). I only use LLMs when I know they can reliably handle the task.<br><br>When determining whether some new task is amenable to an LLM, I try to aggressively time-box my time working with the LLM so that I don&apos;t go down a rabbit hole. Again, tearing yourself away from an LLM when &quot;it&apos;s just so close!&quot; is hard!<br><br><strong>3. It&apos;s super easy to get distracted in the downtime while LLMs are generating. </strong>The social media attention economy is brutal, and I think people spend 30 mins scrolling while &quot;waiting&quot; for their 30-second generation.<br><br>All I can say on this one is that we should know our own pitfalls and try to fill LLM-generation time productively:<br>- If the task requires high-focus, spend this time either working on a subtask, or thinking about follow-up questions. 
Even if the model one-shots my question, what else don&apos;t I understand?<br>- If the task requires low-focus, do another small task in the meantime (respond to email/slack, read or edit another paragraph, etc).<br><br>As always, small digital hygiene steps help with this (website blockers, phone on dnd, etc). Sorry to be a grampy, but it works for me :)&#x201D;</blockquote><p>Quentin concludes:</p><blockquote>&#x201C;LLMs are a tool, and we need to start learning its pitfalls and have some self-awareness. A big reason people enjoy Andrej Karpathy&apos;s talks is because he&apos;s a highly introspective LLM user, which he arrived at a bit early due to his involvement in pretraining some of them.<br><br>If we expect to use this new tool well, we need to understand its (and our own!) shortcomings and adapt to them.&#x201D;</blockquote><p><strong>I wonder if context switching could become the Achilles Heel of AI coding tools. </strong>As a dev, the most productive work I do is when I&#x2019;m in &#x201C;the zone&#x201D;, just locked into a problem with no distractions, and when my sole focus is work! I know how expensive it is to get back into the zone after you fall out of it.</p><p>But I cannot stay in the zone when using a time-saving AI coding tool; I need to do something else while code is being generated, so context switches are forced, and each one slows me down. It&#x2019;s a distraction.</p><p>What if the constraint of being &#x201C;in the zone&#x201D; when writing code is a feature, not a bug? And what if <em>experienced</em> devs not using AI tools outperform most others with AI because they consciously stay in &#x201C;the zone&#x201D; and focus more? 
Could those without AI tools have been &#x201C;in the zone&#x201D; and working at a higher performance level than devs forced into repeated context switches by their AI tools?</p><p>There&#x2019;s food for thought here about how time saved on coding doesn&#x2019;t automatically translate into higher productivity when building software.</p><hr><p>This was one out of four topics from last week&apos;s The Pulse. <a href="https://newsletter.pragmaticengineer.com/p/the-pulse-140?ref=blog.pragmaticengineer.com" rel="noreferrer">The full issue</a> also covers:</p><ol><li><strong>Industry pulse. </strong>Why 1.1.1.1 went down for an hour, Microsoft cut jobs to buy more GPUs, Meta&#x2019;s incredible AI data center spend, and the &#x201C;industry-wide problem&#x201D; of fake job candidates from North Korea.</li><li><strong>Windsurf sale: a complicated story of OpenAI, Microsoft, Google, and Cognition. </strong>OpenAI wanted to buy Windsurf but couldn&#x2019;t because of Microsoft. Google then hired the founders and core team of Windsurf, and Cognition (the maker of Devin) bought the rest of the company. It&#x2019;s a weird story that could not happen outside of California &#x2013; thanks to California having a ban on noncompetes.</li><li><strong>Beginning of the end for VC-subsidized tokens? </strong>Cursor angered devs by silently imposing limits on its &#x201C;unlimited&#x201D; tier. Us devs face the reality that LLM usage is getting more expensive &#x2013; and VC funding will probably stop subsidizing the real cost of tokens.</li></ol><p><a href="https://newsletter.pragmaticengineer.com/p/the-pulse-141?ref=blog.pragmaticengineer.com" rel="noreferrer">This week&apos;s The Pulse issue</a> &#x2013; sent out to full subscribers &#x2013; covers:</p><ol><li><strong>Mystery solved about the cause of June 10th outages.&#xA0;</strong>Heroku went down for a day due to an update to the systemd process on Ubuntu Linux. 
Turns out that dozens of other companies including OpenAI, Zapier, and GitLab, were also hit by the same issue, with outages of up to 6 hours.</li><li><strong>Replit AI secretly deletes prod &#x2013; oops!&#xA0;</strong>Cautionary tale of why vibe-coding apps are not yet production-ready, which makes it hard to foresee production-hardened apps being shipped with no software engineering expertise involved.</li><li><strong>Industry pulse.&#xA0;</strong>Fresh details about the Windsurf sale, Zed editor allows all AI functionality to be turned off, government agencies using Microsoft Sharepoint hacked, GitHub releases vibe coding tool, and more.</li><li><strong>Reflections on a year at OpenAI.&#xA0;</strong>Software engineer Calvin French-Owen summarized his impressions of OpenAI, sharing how the company runs on Slack and Azure, capacity planning challenges for OpenAI Codex launch, learnings from working on a large Python codebase, and more.</li></ol><p><a href="https://newsletter.pragmaticengineer.com/s/the-pulse?ref=blog.pragmaticengineer.com" rel="noreferrer"><strong>Read The Pulse issues here.</strong></a></p>]]></content:encoded></item></channel></rss>