tecosystems

The State of Open Source Licensing in 2026

Stephen O'Grady — Wed, 25 Mar 2026 14:59:03 +0000

As far back as 2012, a RedMonk colleague was asserting that we were living in a “post-open source world,” meaning a world in which open source had been so successful that the very things that underpinned it – open source software licenses, for one – were taken for granted and thus, ignored. This hypothesis was supported by various datapoints ranging from GitHub’s acknowledgement that less than 20% of the repositories it hosted carried open source licenses to the repeated poor behavior of large companies that should know better brazenly and cavalierly misusing the term open source.

That unfortunate apathy notwithstanding, assertions that open source “doesn’t really matter” have never been any more defensible than saying climate change doesn’t matter because the average citizen pays it little mind. While the industry is undergoing tectonic shifts due to the incredible advances in AI, and there are multiple lines of potentially intractable questions about how open source licenses apply in this new era, the fact is that open source software is still being produced, and the licensing decisions around it are still being made strategically – see Swamp’s license choice, as but one example.

All of which means that it’s important to periodically take stock of the open source licensing landscape, to step back and evaluate its trends and tease apart what those might suggest about our industry and the choices it’s making moving forward. One of the last times we did this was in 2017, when the best available data was from Black Duck.

To bring that up to date, we’ve compared and contrasted historical data from sources no longer extant (Black Duck), sources that are still available but have undergone significant, disruptive transitions in the dataset (the GitHub Archive), and current, active sources (deps.dev – a superset of software package repositories). A few caveats before proceeding:

As noted above, in the largest single source of this data, the overwhelming majority of projects – likely 80% or more – do not carry licenses and are thus not represented in this analysis. Given that the cost of producing software is approaching zero thanks to step function changes in code assist abilities, this delta will likely only increase.
There is no single source of truth for the industry. Some datasets, as mentioned, are no longer available. Others like the GitHub Archive have undergone changes over the years, making consistent measurement difficult if not impossible. And none of them have insight into the full spectrum of software written and relied upon behind enterprise firewalls.
Much like with RedMonk’s Programming Language Rankings, then, this analysis should be considered an evaluation of the data that’s available rather than a full fidelity representation of the licensing reality.

Those limitations acknowledged, there are nevertheless some interesting takeaways from the data that is available. Here are the key themes worth noting for open source licensing in 2026.

The Rise of Permissive Licensing

As noted here well over a decade ago, the industry has been in the midst of a long term shift away from more restrictive, copyleft-style licenses to more permissive alternatives. The precise ratio of copyleft to permissive licenses has depended on the year and the particular sample surveyed, but every sample has shown a marked shift away from copyleft licenses.

It’s difficult to pinpoint the exact point in time that we moved from a copyleft majority licensing environment to a permissive one, but a reasonable estimate is that the industry unknowingly crossed that chasm at some point between 2014 and 2017.

To some degree, this was inevitable, as copyleft’s absolute dominance from a licensing standpoint can be appropriately understood as being at least in part an artifact of the overwhelming popularity of copyleft projects such as Linux and MySQL. Greater license diversity was always likely, if not inevitable.

One interesting question is whether, as so often happens in industry, the pendulum will begin to swing back towards copyleft. With the caveat that there are odd small sample size issues with recent GitHub Archive data, there is some evidence to suggest this. Permissive licenses have ticked down from a high of 82% in 2022 to 73% in 2025. Again, this could simply be a sampling issue – the packaging data from deps.dev indicates no similar shift – but it’s worth watching if only because of the recent return from a few major projects to the AGPLv3 in particular which will be discussed more momentarily.

License Distribution by Package Ecosystem

Before looking at license traction as a whole, it’s worth examining community-level differences and drivers. This is based on data extracted from deps.dev, and looks at the license distribution across seven major package repositories.

As is evident, these repositories – like the data more broadly – tend to skew permissive. There are notable issues with certain repositories – in NuGet, the .NET package repository, more than half the packages have licenses that don’t map to SPDX identifiers and are thus unclassifiable. The npm package repository, meanwhile, skews heavily towards the ISC license as npm init historically defaulted to it. Maven’s Java focus, meanwhile, left it solidly in Apache’s orbit.

It’s worth noting, however, that these packages generally reflect deployed code, and as will be shown momentarily, the deployed code shows higher rates of permissive license usage than copyleft. That being said, and as we’ll also come back to, npm is overrepresented in this sample, carrying roughly 3X the weight of the other repos combined.
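For readers who want to try this kind of breakdown on their own package lists, here is a minimal sketch of the bucketing step. It is not the method used for the analysis above. The two sets of SPDX identifiers are illustrative and deliberately incomplete, the helper names `classify` and `family_shares` are hypothetical, and the sample declarations are made up rather than drawn from deps.dev.

```python
from collections import Counter

# Illustrative, deliberately incomplete groupings of SPDX identifiers.
# Treating Unlicense as permissive and MPL-2.0 as copyleft is a simplification.
PERMISSIVE = {"MIT", "Apache-2.0", "ISC", "BSD-2-Clause", "BSD-3-Clause", "Unlicense"}
COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only", "AGPL-3.0-only", "LGPL-3.0-only", "MPL-2.0"}


def classify(license_id):
    """Map a declared license string to a coarse family."""
    if license_id in PERMISSIVE:
        return "permissive"
    if license_id in COPYLEFT:
        return "copyleft"
    # Free-text or non-SPDX declarations cannot be bucketed.
    return "unclassified"


def family_shares(declared_licenses):
    """Return each family's share of the packages as a fraction of the total."""
    counts = Counter(classify(lic) for lic in declared_licenses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {family: n / total for family, n in counts.items()}


# Hypothetical declarations, not real deps.dev data.
sample = ["MIT", "ISC", "Apache-2.0", "GPL-2.0-only", "See LICENSE.txt"]
shares = family_shares(sample)
print(shares)
```

The "unclassified" bucket is the important design choice here: a repository where most declarations fall into it, as described for NuGet above, will produce permissive and copyleft shares that say little about the ecosystem as a whole.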

Apache vs MIT

The two primary beneficiaries of the rise of permissive licensing have been the Apache Software License and the MIT License. In part because of its patent termination provisions that attempt to minimize the possibility of patent infringement litigation, Apache has tended to be the preference for commercial open source usage versus the MIT license which does not mention patents at all.

Both licenses dramatically improved their share of usage over the past two decades, but following the creation of the CNCF which favored Apache licenses in 2015 and the introduction of popular Apache licensed projects like TensorFlow in the same year and PyTorch the year after in 2016, Apache’s share of distribution began climbing in 2017 to a peak in this dataset of around 30% in 2022. This corresponded with a slight but noticeable decline in MIT’s share over that time, suggesting that the growth of the one came at least indirectly at the expense of the other.

In 2023, however, this trend reversed in dramatic fashion: Apache usage dropped sharply while MIT spiked. In all likelihood, this is an artifact of oddly smaller sample sizes post-2022, culminating in a much smaller 2025 sample that RedMonk is still seeking explanations for as we perform our regular biannual programming language ranking.

It’s also possible that it’s in part a reflection of the aforementioned outsized significance of JavaScript as a language and npm as a repository, as the latter favors MIT and hosts by far the most packages. Which brings us in turn to:

The Packaging Filter

One interesting exercise to consider is what deltas if any there might be between hosted source code and packaged code about to be deployed. In general, the differences tend to be minimal, but one obvious distinction is in the aforementioned outperformance of ISC. A historical default for npm, it is dramatically over-represented in the deps.dev dataset, being 31X more common there than on GitHub – a natural consequence of npm’s immense presence.

Both GPL licenses, for their part, are much more common on GitHub – 34X more common in the case of version 2 – than in deps.dev, which is predictable given the latter’s overwhelming preference for permissive rather than copyleft licenses.
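Multiples like these are most naturally read as ratios of shares rather than of raw counts, since the two datasets differ enormously in size. A minimal sketch of that arithmetic follows. The function name `representation_ratio` is hypothetical, and the counts are invented purely to make the math easy to follow; they are not the figures behind the 31X or 34X numbers.

```python
def representation_ratio(count_a, total_a, count_b, total_b):
    """How many times more common a license is in dataset A than in dataset B."""
    share_a = count_a / total_a
    share_b = count_b / total_b
    if share_b == 0:
        raise ValueError("license absent from dataset B; ratio is undefined")
    return share_a / share_b


# Hypothetical counts: a 30% share in dataset A against a 1% share in dataset B.
ratio = representation_ratio(count_a=300, total_a=1_000, count_b=10, total_b=1_000)
print(f"about {ratio:.0f}X more common in A than in B")
```

Normalizing by each dataset's total first means a license can be "more common" in the smaller dataset even when its absolute count there is far lower.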

As a side note, while reading this chart: the “Unlicense” above is not a typo for “Unlicensed,” it’s an actual license that says, essentially, users can do whatever they want with the code. It’s more or less public domain for source code.

Source Available Licenses

One last question when surveying the state of licensing in 2026 is the degree to which non-open source, source available or hybrid licenses such as the BSL or SSPL are in circulation. The short answer to this is that these licenses are not measurable in a statistically significant way. They remain extremely uncommon and are not trending.

While they are not relevant statistically, however, they remain for the time being strategically relevant because of the projects that carry them (e.g. MongoDB, Terraform, etc). While there have been notable returns to open source licenses from source available alternatives – Elastic and Redis, for example, returned to an OSI approved license in the AGPL – the long term future of source available licensing will not be determined by a quantitative analysis such as this.

Besieged

Stephen O'Grady — Tue, 10 Feb 2026 15:39:00 +0000

As the last digit on the calendar rolled over from five to six, it took less than a month to realize the coming year was going to be different than the year that preceded it. Arguably the stage was set late last year with the November “inflection point” but with open source AI projects becoming so popular overnight they caused runs on hardware and meaningfully moved the share price of public companies, 2026 is an unambiguously and unapologetically new world.

It can be difficult to recall now, but five years ago when Copilot made its debut, capabilities that now seem basic were mind blowing. Much as the iPhone we now take for granted was once an earth shattering technical achievement, the jumped up autocomplete that was the initial coding assistant tool gave way to models that progressed at shocking rates with capabilities that broadened just as quickly. Early, confident predictions that coding assistants were merely for scaffolding while actually creative code would always be the purview of humans were, in a word, wrong. We’re now living in a world in which a growing number of legitimate developers are discussing and in many cases shipping code that has never been reviewed by a human.

In 2026, coding assistance agents – the software manifestations of coding assistance models – are extraordinarily capable, and only growing more so by the day. Attitudes towards them have therefore been forced to evolve. There is still a wide spectrum of viewpoints on AI, of course, ranging from they’re useless and evil to they’re gods among us.

The baseline assumption from here though is that as of 2026, agents are real and capable of tasks we could not have foreseen even a year or two ago. Capable to the point that they are changing how software development is done, almost certainly permanently.

As Adam Jacob said about using these tools:

If you’ve been reading what I write, it’s not like I’ve been a believer the whole time. But I am today. Because I’m doing it. It’s amazing. We will never go back, as an industry. We will simply use this capability and catapult forward.

The step change in functional ability from the agents of 2021 to the agents of 2026 is worth taking a step back to appreciate, because it represents nothing less than a siege. Or more accurately, multiple, ongoing sieges. Here is a non-exhaustive look at a few of the impacted targets.

Individuals

AI has long been promised to reduce or even eliminate work, but recent research from the Harvard Business Review argues that it does the opposite. This assertion, notably, was quickly seconded by everyone from the industry’s most comprehensive AI researcher to one of its best psychologists. While the article doesn’t cite the Jevons Paradox, it might well have. As AI has made its capabilities more efficient to use, consumption and resource demands have risen as the theory predicts. The import of which is that far from transitioning into an environment in which working hours are reduced due to time saving AI tools, workers instead have, if anything, taken advantage of the time saved not by taking time off but by taking more work on. That is going to require a societal-level adjustment and recalibration.

In specific domains, such as within the narrower scope of developers, AI has had a similarly outsized impact as they are being forced to rethink their role in the grand scheme of things. One of the best working analogies is home construction. Historically, developers have been builders: framing out walls, cutting stringers for stairs and so on. Today, many developers see themselves as more akin to architects, not framing the walls but deciding where they get placed, not stringing the stairs but deciding how high they go. For some, this is enormously empowering. Others are experiencing a profound sense of loss, as the uniqueness and inaccessibility of their skillset is eroded. As one Tweet put it:

I don’t know why this week became the tipping point, but nearly every software engineer I’ve talked to is experiencing some degree of mental health crisis.

Maybe there is no better evidence of a siege than Evan Ratliff’s Shell Game podcast, however. In it, the journalist unleashes a sea of AI bots trained on his own voice and to impersonate him, leveraging the same techniques that scammers and spammers are adopting to attack us. Opportunity and threat are on equal display as we’re besieged by AI clamoring for our attention while we feel pressure to use AI for our own ends, whatever those might be.

Communities

Communities and more particularly open source communities are grappling, meanwhile, with the inevitable implications of AI and its lowered friction to code creation and an inevitably higher volume of traffic from it. Projects are flooded with requests, contributions and issues generated by AI systems, some of which are legitimate and useful, most of which are not. Which is not too different than normal OSS project inputs except in their volume (Mitchell Hashimoto estimates it’s a 10X difference).

This has led some projects like Ghostty above to limit AI-driven contributions, up to and including a ban on would be contributors that don’t respect the policy. Others like Liz Fong-Jones and Adam have considered the possibility of eliminating external contributions entirely. Mitchell has tried to implement a less drastic approach by systematically restricting who can contribute via the Vouch project. For her part, however, Angie Jones argues that such policies are overkill, and instead it’s incumbent on OSS projects to prepare and provide a path for responsible AI-driven contributions.

In any event, there’s little debate that communities are under siege.

Applications

As are applications. Specifically, they’ve been hammered by public markets convinced AI has made them irrelevant. The essence of the trend is summed up by the headline, “‘Get Me Out’: Traders Dump Software Stocks as AI Fears Erupt.” The drivers of this panic are myriad, but most ultimately boil down to the same issue: if code becomes fungible, what are companies that sell code – i.e. software vendors – actually worth?

This whole line of thinking isn’t new. For example, in comments on a podcast in December of 2024, Satya Nadella said:

The notion that business applications exist, that’s probably where they’ll all collapse, right in the agent era.

His actual argument was more nuanced than the “SaaS is Dead” headlines made it seem, but the core hypothesis was clearly and unambiguously bearish for SaaS vendors. An argument that many of today’s sellers of SaaS stocks would understand and agree with, and one that makes sense if you believe that SaaS vendors are primarily selling software. Anyone who has spent any time as a systems integrator, however, would almost certainly argue that software is just part of what is being sold, and in many cases a small part. A few examples:

As others have observed, if you’re buying HR software, you’re also buying domain expertise – and arguably more importantly, liability mitigation – across the globe. Same with accounting, CRM, ERP and more. The app that is built from software is not the real challenge.
That point, as noted, is reasonably well understood and articulated. Less mentioned is the talent pool. If you run packaged applications like Salesforce or Workday, you can hire experienced resources to administer and use that software. If you’ve built your own, as many financial institutions have discovered after building their own internal developer platforms rather than using platforms such as Cloud Foundry or OpenShift, your new hire’s first day will also be their first with your software. That makes hiring more challenging and onboarding and ramp up less efficient, which implies that the operational benefits have to be extensive to offset the HR costs.
Speaking of operations, one of the questions facing those who would replace off the shelf alternatives hasn’t changed in spite of AI’s dramatic reduction in development time: is investing in non-differentiating software worth the opportunity cost that could be spent on software that is differentiating? Is an organization better off recreating a CRM system, in other words, or creating something new for their organization that doesn’t exist? It’s a complicated equation with many variables, obviously, not least of which is the cost of SaaS applications. But on balance, it’s also self-evidently not a simple win for AI.

While the enterprise application market may be besieged, then, it seems more likely that AI will settle into an Amdahl mug role than blow it up entirely. Investors, however, are currently seeing it differently.

Infrastructure

As discussed previously, Gas Town (and now Claude Code, natively) mean that one developer can now magically become 10-20 virtual developers. We know from our experience with open source communities that projects are absolutely not equipped to handle that increased scale. The next question is, is our developer infrastructure?

As it turns out, the answer is no. Our infrastructure is not prepared for that.

Witness, for example, this open letter from eleven open source foundations or package repositories. It documents the “Tragedy of the Commons” problem typical of open source infrastructure, and then goes on to blame AI for making it worse:

The rise of Generative and Agentic AI is driving a further explosion of machine-driven, often wasteful automated usage, compounding the existing challenges.

What was already a problematic situation, in other words, has been made more challenging by the sea of agents currently arriving at their gates.

Economics

Arguing that public markets have been besieged by AI isn’t particularly challenging. Consider the massive capital investments currently being poured into AI related infrastructure, over the rising objections of investors losing patience. Or the fact that AI is massively overrepresented in public markets broadly. And that’s without even getting into the Three-card Monte math of some of the investments in the space. Objectively the industry is in a bubble, and bubbles have only one fate.

But even on a micro, individual scale, the economics are starting to pinch, and that is likely to get worse before it gets better. And to judge by industry chatter and recent vendor briefings, that will be happening soon. For all of the abilities of tools like code assistance, the market realities are beginning to hit home.

This process arguably began this past summer when, in an attempt to control costs, Cursor adjusted its pricing and faced a wide scale backlash. From the conversations RedMonk has been having this year, there’s more of this coming. Companies that focused strictly on capabilities – “free during preview,” expenses be damned – are now facing something of a reckoning.

The economics, meanwhile, are equally problematic for individuals. Much as households are facing multiple bills for different streaming services from Disney to Netflix, many developers feel compelled to subscribe to higher cost models, or even multiple high cost models. Case in point is this developer who was repeatedly locked out because he was consuming $2600 worth of tokens per month; he managed to get it down to ~$100, which incidentally is $100 per month more than developers would have spent on their tooling in the pre-AI world. Here, meanwhile, is someone in management spending $200 per month and budgeting $1-$2K per month per dev on their team. A developer in a local Slack went even further, reacting just this week to a Software Factory post by saying:

The $1k/day/person number jumped out at me, but I suspect that’s going to sound quaint before too long.

AI is a different world, and a much higher cost one at that.

Conclusions

The above, as mentioned, are just a few examples of impacts to this industry. The real world implications are much broader, hence the anxiety, apprehension and fear associated with increased use of AI. Understandably so.

Is the ongoing AI siege all bad, though? Is this likely to end as medieval sieges did – poorly?

First, it’s worth pointing out that new developments in automation are rarely linear or entirely predictable. This chart of bank teller employment pre- and post-ATM introduction from Dr. James Bessen would have been very counter-intuitive at the time. It is only in retrospect that it’s easy to see that the introduction of new ATM fees and the automation of mundane, low value tasks like cash dispensing would allow banks to open many more branches, thereby boosting overall employment for a role whose putative function had been automated out of existence.

Perhaps more importantly, however, for all of their costs, these tools are, or can be, powerful accelerants and enablers for people that dramatically lower the barriers to software development. They have the ability to democratize access to skills that used to be very difficult, or even impossible for some, to acquire. Even a legend of the industry like Grady Booch, who has been appropriately dismissive of AGI claims and is actively disdainful of AI slop, posted recently that he was “gobsmacked” by Claude’s abilities. Booch’s advice to developers alarmed by AI on Oxide’s podcast last week? “Be calm” and “take a deep breath.” From his perspective, having watched and shaped the evolution of the technology first hand over a period of decades, AI is just another step in the industry’s long history of abstractions, and one that will open new doors for the industry.

Lastly, whether one wants those doors opened or not ultimately is irrelevant. AI isn’t going away any more than the automated loom, steam engines or nuclear reactors did. For better or for worse, the technology is here for good. What’s left to decide is how we best maximize its benefits while mitigating its costs. AI is the epitome of “two things can be true.” On the one hand, the economics of AI are likely to get ugly in the near term and as for digesting these tools, as the conclusion of the quote from Adam above that was withheld put it, “It’s going to be an absolute mess while we sort it out.”

On the other, much like the internet before it, the technology has crossed a threshold from “intriguing toy” to “world changing evolutionary wave.” This industry will never be the same.

How well and efficiently it and the society around it decide to balance the costs and benefits, however, will determine how long the siege will carry on, and what’s on the other side.

Disclosure: GitHub (Copilot), Oxide, Red Hat (OpenShift) and Salesforce are RedMonk customers. Anthropic (Claude) and Workday are not currently customers.

The Blood Dimmed Tide of Agents

Stephen O'Grady — Thu, 08 Jan 2026 15:04:30 +0000

Many years ago, a large European bank spoke to analysts after its first transition from purely physical hardware into virtual machines. While expressing overall satisfaction with the move, its enthusiasm was clearly tempered. When pressed on the hesitant endorsement, the bank’s representative stated that virtual machines had, in fact, accomplished all of the desired goals: their infrastructure and usage were far denser, utilization was up, stand up and instantiation times were down and so on.

All of which led to a natural response: “it sounds like there’s a ‘but’ coming.”

To which the executive replied, “virtual machines have been good for us. It’s just that going from managing 500 physical machines to 5,000 virtual machines has been…a challenge.”

This pattern, in which a given unit of technology is decomposed into smaller units of a technology, is one that has played out repeatedly over the last few decades in this industry. One of the more recent examples of this was the trend towards microservices, in which larger APIs were deconstructed into multiple smaller component services in search of more granular control and development velocity.

In every case, this leads to a management challenge. What makes sense on paper and may indeed make sense in practice does not come without an offsetting cost. Just as there are an array of benefits to virtual machines and their successor, containers, or to microservices, it’s important to consider not just the short term advantages to decomposition but the longer term implications of it on a going forward basis. How do you manage thousands of VMs when your existing infrastructure was designed to manage hundreds or even dozens of physical machines? How does your network cope with an exponential increase in the number of active services and endpoints?

These questions seem particularly important in 2026. Last year was in effect the year of the agent. The industry had begun to digest the abilities – and shortcomings, importantly – of existing models and related technologies, and had followed the typical industry progression from a given monolith – all encompassing AI models – to its logical conclusion, decomposed, individual AI services, or agents, that were the functional equivalent of a microservice. Small, independent and with some degree of autonomy, what ultimately came to be described as the “agentic” vision of AI was one describing fleets of individual AI agents operating in concert with one another and various third parties both human and otherwise.

All of which means that the next challenge in front of the AI market is management.

This is already evident to an extent in the domain of code assistance. As code assist progressed from enhanced autocomplete to the autonomous generation that is vibe coding, developers have gradually transitioned from builder to architect. At which point, people began to ask a logical question: if agents are as easily spun up as VMs could be once upon a time, what would happen if we added more builders?

Rather than one agent building code, then, they began to deploy larger and larger numbers of them, with a primary gating factor of token costs. Referred to by many names, “swarm coding” among them, the practice has become increasingly common as developers and their employers deploy teams of coding agents in an effort to improve overall velocity, code quality and to leverage the differing strengths of varying models.

The obvious problem, then, was how to manage these swarms of autonomous coding agents. Enter Gas Town. Written by Steve Yegge (that Steve Yegge) and billed as a “new take on the IDE for 2026,” Gas Town is a way to manage and orchestrate swarms of code assistant agents – up to 20 to 30 of them – much as Kubernetes does the same for fleets of containers.

It’s overkill for many; Gas Town is merely one initial stab at this problem, and Yegge himself goes out of his way to dissuade anyone using fewer than ten agents from using it. And swarms are not the only option for those seeking greater speed and refinement – Ralph Wiggum is an iterative, looped single agent alternative that has proven very popular with developers.
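To make the contrast between the two shapes concrete, here is a toy sketch of a capped swarm next to a looped single agent. This is emphatically not how Gas Town or Ralph Wiggum is implemented. Every name in it (`run_agent`, `run_swarm`, `run_loop`) is hypothetical, and the "agent" is a stand-in that returns a string rather than calling a model.

```python
import asyncio


async def run_agent(task):
    """Stand-in for a real coding agent call; returns a fake result."""
    await asyncio.sleep(0)  # placeholder for model latency
    return f"done: {task}"


async def run_swarm(tasks, max_agents):
    """Run tasks concurrently, never more than max_agents at once."""
    gate = asyncio.Semaphore(max_agents)

    async def worker(task):
        async with gate:
            return await run_agent(task)

    # gather returns results in the same order as the input tasks.
    return await asyncio.gather(*(worker(t) for t in tasks))


async def run_loop(task, is_good_enough, max_iterations):
    """Single-agent alternative: retry one task until a check passes."""
    result = None
    for attempt in range(1, max_iterations + 1):
        result = await run_agent(f"{task} (attempt {attempt})")
        if is_good_enough(result):
            break
    return result


results = asyncio.run(
    run_swarm(["fix tests", "update docs", "refactor parser"], max_agents=2)
)
print(results)
```

The cap on concurrent agents is where the management question lives: it is the knob an operator would turn in response to token costs, and the loop's iteration limit plays the same role for the single-agent approach.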

But it’s clear within the specific domain of code assistance that swarm coding is going to be a primary if not default approach, and it’s equally clear that swarms of agents are going to be deployed across multiple domains in the quote unquote agentic future. If you’re looking for 2026 trends to track, then, how the industry is going to manage the flood of AI agents is a critical question. If it’s not solved, as Yeats said, the blood-dimmed tide will be loosed, and everywhere the ceremony of innocence will be drowned.

GitHub in 2025

Stephen O'Grady — Fri, 07 Nov 2025 15:19:48 +0000

GitHub Copilot was originally released in October 2021, four years ago. So much has happened since, it can be challenging to remember what a revelation it was. As has been discussed previously, it wasn’t that the idea itself was without precedent, but the capabilities, the scope and the scale were without peer. Though the concept of a pair programmer that is available 24 hours a day, seven days a week is table stakes in 2025, in 2021 it was, for many developers, mind blowing.

A little over a year after that, however, OpenAI introduced ChatGPT, the interface to its generative model that could produce code like Copilot, but handle a nearly unlimited list of other tasks – if often imperfectly. Thus opened the era of Large Language Models (LLM), the mercurial era we continue to speedrun through today.

The LLM era has seen models embedded in every software and hardware format in existence, and now is increasingly flowing the other way with software being embedded into models. The large frontier models remain almost infinitely versatile, capable of handling almost any conceivable workload from full real-time speech inputs to text-to-video outputs. The narrower domain of coding assistance, however, opened up widely by Copilot years ago, has likewise evolved rapidly. As it’s evolved, it’s upended core assumptions about the development tools market, most obviously that such tools must be free and would command the loyalty of their users.

The promiscuity of developer tool adoption, in fact, has led to short, accelerating cycles in which a new tool with differentiated capabilities emerges, only to shortly be eclipsed by another new tool with new capabilities and approaches. Copilot had its year before ChatGPT (Nov 2022), which had its year before Cursor (Oct 2023), which had its year before Windsurf (Nov 2024). And that’s without getting into the literally dozens of other tools and approaches on the market, a non-exhaustive list of which would include Aboard, Bob, Bolt, Cline, Claude / Code, Gemini / Jules / CLI, Factory, Kiro, Lovable, Poolside, Replit, Same.dev, vibes.diy, v0 and the list goes on.

Each new tool that has its moment in the sun, however, seems to fly a bit too close to it and inevitably, if potentially temporarily, fall back to earth. Sometimes it’s because of the introduction of another breakthrough tool. Sometimes it’s because the team you’re competing against is working their team to death. And sometimes it’s because the bill for tokens comes due. Regardless, it makes for a market about as predictable as the weather in New England.

Which brings us back to this year’s GitHub Universe, though the event was held on the opposite, and much warmer, coast. There was no launch of the magnitude of Copilot at Universe this year. Though the company behind the scenes is thinking hard about what the future of development looks like, this year’s announcements – from Agent HQ to Code Quality to Mission Control to Plan Mode – were more about raising the capabilities floor than its ceiling. And for those expecting a new Copilot, this might have been a let down. That simple conclusion can obscure some subtle, more important takeaways from this year’s event, however.

First, it’s clear that seven years into its acquisition, Microsoft and GitHub are becoming more closely intertwined. Arguably the clearest sign of this was when CEO Thomas Dohmke departed in August and was not replaced, but even at Universe Microsoft personnel were much more visible and tightly integrated into the event than in years past. A greater Microsoft presence does not come without risks, but it also brings immense resources, operational capabilities and – as always – nearly unlimited enterprise account access. In a market that is moving from FOMO to ROI, those things matter. Early signs as well are that as much as Microsoft is integrating with GitHub, GitHub is likewise integrating with Microsoft.
Second, GitHub is shipping again. Team after team, product after product, GitHub has shifted from a cycle of refinement to one of shipping at velocity. The list of features and enhancements announced at Universe numbered in the hundreds. Some of this is driven by competition with the aforementioned competitors actively unconcerned about burning out their teams, but in many cases it’s simply an executive mandate to ship and ship often. Every software company cycles through periods where they ship and periods where they polish, and in the wake of Universe it’s clear that GitHub is in the former.
Last, there is the landscape. The years since Copilot debuted have been something akin to an industry fever dream, in which an unending flow of seemingly magical new capabilities let loose vast spigots of capital investment. Developers flitted from tool to tool with Bohemian abandon. Enterprises said get AI and we’ll figure out what to do with it later. Looking around at the market, however, it is now later. Budgets matter suddenly. Buyers are shifting their gazes from potential to measurable impact. Procurement is pushing for vendor consolidation. In such a climate, the combination of GitHub and Microsoft represents not only a wide range of technical capability, but also predictability and stability from an enterprise perspective. Frothy, bubbling markets have a tendency to benefit the incumbents, and in the developer tools space no one is more incumbent than GitHub.

GitHub Universe 2025 may not have broken much new ground from a product standpoint, then, but it was nevertheless a crucially important event for understanding where the company and its parent are headed, and as a consequence, where the industry around them is headed. As for Copilot more specifically, you can’t call it a comeback because it’s been here for years, but if GitHub can keep shipping, the code assist landscape will get a lot more interesting, and soon.

Disclosure: AWS (Kiro), GitHub, Google (Gemini et al), IBM (Bob) and Microsoft are RedMonk customers. Aboard, Anthropic, Bolt, Cline, Cursor, Factory, Lovable, OpenAI, Poolside, Replit, Same.dev, Vercel and vibes.diy are not currently customers.

Anthropic, IBM and the Future of the Enterprise AI Market

Stephen O'Grady — Wed, 08 Oct 2025 17:21:31 +0000

In recent years, funding for AI has been a spigot opened wide. Investors threw money at startups in the space, even more money at those providing hardware for those startups and boards the world over directed their enterprises to embrace AI first and ask questions later.

Recently, however, the economics of the space are facing increasing and unprecedented scrutiny. Enterprises, for their part, have transitioned from a FOMO era to an ROI era as my colleague has captured here. Investors, meanwhile, are beginning to question the circular and potentially problematic deals in the space – see recent pieces from Bloomberg or the Financial Times.

From the latter:

OpenAI has signed about $1tn in deals this year for computing power to run its artificial intelligence models, commitments that dwarf its revenue and raise questions about how it can fund them.

AI, then, is facing headwinds it has not experienced in this generation of adoption. Headwinds which make it imperative that it immediately demonstrates utility and returns to justify the investments. One of the important questions, then, is utility and returns for whom?

Over-generalizing, there are two technology markets: consumer and business. Both markets can produce enormously profitable companies; Apple makes a fortune selling to consumers and NVIDIA does the same selling to businesses. Obviously it’s not that black and white, and there’s crossover and bleed between these categories. Some companies successfully sell to both, but typically companies are built to sell to one or the other because they are very different markets. The marketing and sales motions for each are entirely distinct, and the price points and volumes sold differ wildly, as do expectations.

This is relevant for AI because to date, while enterprises have poured money into AI in other categories, the largest models outside of the hyperscalers like Amazon, Google and Microsoft have traditionally been consumer focused. OpenAI’s ChatGPT, for example, is famously the fastest growing consumer product in history.

Historically, that type of growth and usage would be the foundation to a solid, growing business even at modest price points. Google, for one, built an enormous business on large volumes of micro-monetization via advertising at an unprecedented scale. For OpenAI and its peers, however, the consumer business – even at wide scale – is unlikely to be enough to either justify the current valuations or meet its incredible spending commitments. Per the FT piece above:

“OpenAI is in no position to make any of these commitments,” said Gil Luria, analyst at DA Davidson, who added it could lose about $10bn this year.

Even if that estimate is off by a factor of ten, it’s clear that for OpenAI and many of its peers, steep costs are not currently being offset by revenues from the consumer sector.

Which brings us to the business sector, and more specifically, the enterprise. Unlike consumers, enterprises have much less price sensitivity. Whether AI has the ability to increase revenues via new lines of business, decrease costs via increased efficiency or both, the appetite for AI within the enterprise – even with some newfound attention on ROI – is clear. But enterprises, as mentioned, are a fundamentally different target than consumers. And this is particularly true for AI.

Unlike traditional technology infrastructure like servers, storage, databases and so on, AI has serious trust barriers to be overcome. If an enterprise purchases a commercial database, or even effectively rents one as a service, it has a clear expectation of privacy because a database company has no fundamental interest in the data stored within. AI providers, however, have an insatiable appetite for data wherever they can get it, which creates at a minimum the perception of a conflict of interest. Despite assurances from AI vendors that they will not train their models on user data, a trust gap remains. Google search trends data shows a spike in “datacenter” queries shortly after the release of ChatGPT. That seems unlikely to be a coincidence.

Beyond the trust gap, there are many more prosaic challenges facing would-be entrants to the enterprise market. Do you have an enterprise salesforce? Are you on the approved supplier lists for large businesses? Have your products cleared legal and compliance reviews? Do you have a global on-the-ground presence to chase leads worldwide?

Recently, one highly profitable trading firm RedMonk has contact with attempted to engage with an AI code assist vendor with the intention of paying them a substantial amount of money. Several initial inquiries went unanswered. When a reply was finally received and the vendor understood there were questions about compliance and security, they went dark once again. Enterprises tend to not appreciate that kind of behavior.

With that context, there are two obvious questions. First, if you were an AI provider, how would you try to enter the enterprise market? Second, if you had access to the enterprise market with existing model capabilities, but lacked at least a perceived in house top of the line model, how would you make up that deficiency? The answer at TechXchange this week, as it was once upon a time with Microsoft and OpenAI, was a partnership, specifically Anthropic – creators of the well regarded frontier model Claude – and IBM, an arbiter of what technologies to trust for enterprises for over a century.

Partnerships, like any relationship, are only worth the energy that is invested in them. And while the optics of this one were curious, not only with the Anthropic CEO declining to attend TechXchange in person as is custom with major partnerships, but having the requisite video shot in San Francisco rather than New York, the opportunity in front of both parties is clear.

IBM’s – and its subsidiary Red Hat’s – internal AI focus is clearly on small to medium sized models that are lighter weight, cheaper to run and more easily tailored to the types of discrete, specific problems enterprises are looking to solve. The 4.0 Granite models announced at this event speak to that aim, being positioned not around state of the art capabilities, but – potentially more important in the current climate – a blend of reasonable capability, performance and cost effectiveness. For some IBM clients, however, it will be important to be able to tick a box that says “frontier AI model capability,” hence the importance of the Anthropic announcement and the potentially related subsequent bounce in IBM’s stock price.

Anthropic, for its part, is facing many of the same challenges OpenAI is. It has well regarded, incredibly capable models with widespread adoption. The models are extraordinarily expensive to develop and run, however, and the company is unlikely to recoup these costs, let alone turn a profit, by monetizing consumers alone. Which means that the business and more specifically enterprise sectors are a compelling revenue opportunity. That opportunity, however, is dramatically easier to access via a partner, and whatever else may be said about IBM in 2025, it remains a trusted ambassador within large enterprises globally. Not that this is Anthropic’s first attempt at this route to market – Amazon and Google have both already made large, splashy investments in the startup – but if a relationship with IBM can shortcut Claude’s path to legitimate enterprise adoption even at the margins, that’s an opportunity worth pursuing.

Time will tell about the prospects for this particular partnership specifically and the wider AI market generally, but in the wake of this week’s announcement it seems like a clear opportunity for both parties. What’s left is to see what they make of it.

Disclosure: Amazon, Google, IBM and Microsoft are RedMonk clients. Anthropic, Apple, OpenAI and NVIDIA are not currently clients.

DocumentDB and the Future of Open Source

Stephen O'Grady — Tue, 02 Sep 2025 16:44:28 +0000

Daniel Case, CC BY-SA 3.0, via Wikimedia Commons

It can be difficult to remember in the wake of PostgreSQL’s ongoing renaissance, but MySQL was for decades the default open source relational database, which in those days meant it was the default open source database. While the former’s evolution from Ingres was unfolding slowly over the late 1980’s and early 1990’s, MySQL was developed in 1994, released in 1995 and capitalized on the growing popularity of open source to become nearly ubiquitous. It was the default database in the most popular open source tech stack of the time, LAMP, and most developer texts assumed MySQL as the database.

While MySQL and PostgreSQL were both open source databases, however, their development models were quite distinct. MySQL was built on a dual license model, which allows customers to buy their way out of the licensing terms of the GPL they cannot or choose not to comply with. To be able to issue a dual license, however, MySQL had to own copyright to all of the code outright. In practice, this meant that the development burden was almost entirely on MySQL, because few commercial organizations are willing to write code just so another commercial entity can exclusively monetize it. This was and is the single entity open source model.

PostgreSQL, by contrast, was originally released not under a more restrictive copyleft license like the GPL, but the vanity PostgreSQL license with terms similar to the permissive MIT. Because the license imposed essentially no restrictions on usage of the project, any and all parties were free to use, modify and distribute the software as they saw fit – even if that meant closing the project off and making it proprietary. This licensing choice involved a trade off. On the one hand, no one vendor could exclusively monetize the software, but on the other, it allowed for the growth of a wider ecosystem with more participants in the project. Over time, PostgreSQL became a canonical example along with others like Linux of a multi-entity open source project, where multiple parties – even competitors – come together to collaborate on a project collectively.

For the better part of a decade after MySQL’s release, and for a wide variety of reasons, it dominated PostgreSQL from a visibility standpoint. Since that time, however, its dominance has steadily eroded as Google Trends documents here.

Eroded to the point, in fact, that MySQL has been surpassed by PostgreSQL at least in public visibility over the last five years.

MySQL is still an enormously popular database, to be clear. But it is no longer the dominant project, nor the default. It has ceded that title to PostgreSQL. Where, for example, did Databricks and Snowflake turn when their growth from data lake to more broadly capable data platforms required relational database capabilities? To PostgreSQL vendors – Neon and Crunchy Data respectively. When developers are building applications today, more often than not the assumption is that the database will be PostgreSQL, not MySQL.

There are many reasons for this change in fortunes, and no single project characteristic is responsible for it. But PostgreSQL advocates and vendors typically point to the advantages of its broader community, which again is enabled by its liberal license. Multi-entity open source means more commercial options, which large enterprises favor because competition limits vendors’ leverage with respect to pricing and it limits their risk to vendor behavior and changes. It can also mean broader project support and faster innovation, as evidenced by the speed at which PostgreSQL has been able to adapt to emerging market demands by adding the ability to handle new workloads from JSON to vector.

At present, then, the market is favoring a multi-entity solution for its relational database needs. Redis, meanwhile, dominated the in-memory key-value category for years, but a licensing change created a rift in that community and opened a path for a multi-entity Redis alternative in Valkey. The jury is still out on which path the industry will choose in that case.

All of which brings us to DocumentDB.

Not, oddly enough, the AWS product of that name. Microsoft, apparently, has its own project called DocumentDB which it has donated to the Linux Foundation. This raises an immediate and obvious question about the DocumentDB trademark, but without any answers available that will have to be set aside for the moment other than to note that the LF now owns it.

Microsoft’s DocumentDB is a project built, in essence, to layer MongoDB API compatibility on to a PostgreSQL database. Unlike MongoDB, which is licensed under the non-open source Server Side Public License (SSPL), DocumentDB has been released under an MIT license. That licensing choice, along with its donation to a neutral foundation, is presumably why it was able to attract support from AWS, Cockroach, Crunchy, Google, Supabase, Yugabyte and others joining Microsoft in the launch.

This is not, of course, the first attempt at providing a MongoDB compatible alternative database. Microsoft’s Cosmos DB has been offering this for years, as has AWS with DocumentDB and more recently Google announced MongoDB compatibility for its Firestore product. Percona, meanwhile, offers vanilla MongoDB support. Lastly FerretDB, for its part, has also loudly and prominently positioned itself as a drop-in MongoDB alternative, to the extent that the latter party took exception and sued the former.

Prior to April 2021, MongoDB’s claims likely would have included copyright violations, but in the wake of Google v Oracle, those charges would seem unlikely to be sustained. Instead, MongoDB has alleged patent infringement, misuse of its trademark and unfair competition. As that litigation is still pending, there’s not much to take away from it and it could well be a simple tactic rather than a genuine attempt to prove the claims.

What we know, however, is that MongoDB – unlike some other examples here like Redis – is a single entity project. MongoDB is responsible for the entirety of its codebase, which is why it was able to unilaterally relicense the project from the AGPL to the SSPL in 2018. The SSPL, in fact, is an attempt to preserve MongoDB’s exclusivity by increasing the friction to offering the database as a service.

On the one hand, all of this attention and competition can in some sense be regarded as flattering to MongoDB. Out of all of the possible database projects and document databases, the industry has de facto agreed on MongoDB’s API as the industry standard. It is also possible that being that de facto standard will increase – potentially dramatically – the size of the addressable market, thereby offering MongoDB a larger opportunity to target.

On the other hand, the company clearly believes, as many of its database peer companies do judging by the licensing trends, that exclusivity is key to its present and future success. What’s difficult to see, however, is how that’s achievable in a world in which the Supreme Court has strongly suggested that copyright does not apply to APIs and all three large hyperscalers, the largest open source foundation and a selection of startups all feel comfortable from a legal standpoint publicly invoking MongoDB’s name.

One potential response might be found in the consumer packaged goods sector. It has been common practice for years to have a store brand, and a premium brand. In many cases they are the same, or at least a highly similar, product, and yet the premium brands survive because consumers are willing to pay a premium for an item perceived to be higher value.

The biggest and best opportunity for this model in the technology sector, coincidentally, also came from Microsoft. Many years ago, it was argued in this space that Microsoft – whose .NET stack and C# language were highly regarded technically but virtually non-existent outside of its own Windows ecosystem – could and should have given both of those products a chance to better compete via the Mono project. Originally the brainchild of Ximian, Miguel de Icaza and Nat Friedman’s open source startup, Mono was an alternative, open source friendly .NET runtime. Simply by offering the project a patent amnesty, and thereby removing the Damoclean sword of potential IP litigation, Microsoft could have overnight given itself a credible .NET story for the flood of new Linux servers arriving in market every day. Importantly, it could also have sold effectively against the alternative stack by positioning itself as the premium brand to the store brand. Unfortunately, however, this model was never tested as open source itself was anathema to Microsoft at the time, and a third rail issue that would never go anywhere strategically.

However MongoDB the company navigates the DocumentDB project announcement, it will be worth tracking more broadly the performance of single entity open source projects versus the growing popularity of multi-entity alternatives. Project and product selection is rarely if ever likely to come down to that single characteristic and product decisions are inherently more complicated, but the industry – buyers and sellers of technology alike – is increasingly investing into open source communities that are licensed in such a way that they cannot be controlled by one single entity. What the returns on those investments will be in future will have an enormous impact on the direction of the industry and the health of open source itself.

Disclosure: AWS, Crunchy Data, Google, Microsoft, MongoDB, Oracle (MySQL) and Percona are RedMonk customers. Cockroach, Databricks/Neon, Redis, Supabase and Yugabyte are not currently customers.

The Cyber Resilience Act: A Five Alarm Fire

Stephen O'Grady — Wed, 06 Aug 2025 13:34:13 +0000

On October 21, 2016, CNN’s website was knocked offline. So were the BBC’s and the Guardian’s. Amazon, Etsy and Shopify too, along with Quora, Reddit, and Twitter – among others. Huge swaths of the internet were taken down by a series of attacks on the DNS provider Dyn. These Distributed Denial of Service (DDoS) attacks were conducted by legions of bots, which is to say tens of thousands of internet connected devices that had been compromised by malware called Mirai. It wasn’t an army of servers, or at least not just servers: it included cameras, printers, routers and even baby monitors. All of these devices from the Internet of Things (IoT) had been harnessed and retasked by Mirai, in this case in service of making a substantial portion of the internet unavailable.

The early versions of Mirai principally relied on unchanged default settings; later variations more directly attacked known vulnerabilities in the software running these internet enabled devices.

This attack was possible for two basic reasons.

First, and most obviously, software is hard to secure and protect. All software is vulnerable. Even the best and most dedicated software authors in the world are not able to produce perfectly secure software.
Making matters more complicated was (and is) the fact that there was little if any economic incentive for many providers of these IoT devices to protect them as experts like Bruce Schneier have been saying for years. It is hard to write secure software and keep it up to date and patched, and activities that are hard to do are by definition expensive. Nor were there any real penalties for not investing in producing secure artifacts, which is why most IoT vendors slapped together some cheap hardware and software, shipped it and called it good.

Even for those vendors that might feel obligated either ethically or financially to try and maintain their devices over time there was another critical problem: they didn’t write some or even most of the software on the devices they shipped – open source developers did. This meant that in many cases the device manufacturers did not understand in any meaningful way the software they relied on. It also meant that vendors often had no practical mechanism to get a given piece of open source patched short of begging – or worse, badgering – the developers in question. Developers who were already burning out in huge numbers because of unreasonable demands from users of the project to fix critical issues for free.

The industry reality, therefore, had (and has) a lot in common with a house of cards and was perfectly captured by Randall Munroe here. The European Union, to its credit, looked at this and said “maybe we shouldn’t be building on a house of cards.” Thus was the mandate for the Cyber Resilience Act (CRA) born.

The CRA’s initial remit was to try and tackle the IoT problem. But then came Log4Shell. Present from 2013 on, the critical vulnerability in Log4j wasn’t discovered and disclosed until November of 2021. Then, thanks to the ubiquity of Log4j, all hell broke loose. Jen Easterly, the director of the United States Cybersecurity and Infrastructure Security Agency (CISA), called the exploit “one of the most serious I’ve seen in my entire career, if not the most serious.”

In the wake of the catastrophic incident, the EU decided to reconsider and revise the CRA’s mandate. No longer would it be strictly focused on software powering IoT devices. Instead it would encompass software, full stop.

On paper, the CRA makes sense. The world of software, after all, cannot be built indefinitely on a foundation that includes Munroe’s “project thanklessly maintained by a single developer from Nebraska since 2003” as a load bearing component. Change was and is necessary, as are new incentives – and penalties.

The question isn’t, therefore, whether or not something like the CRA is necessary and inevitable. The question is whether or not the CRA as it is written today is the appropriate tool for the job. And after multiple briefings on the subject, it seems safe to say that the jury is still very much out on that subject.

The CRA introduces a broad set of mandates for parties involved in the production of software, with the specific responsibilities varying depending on role. Among the many goals and requirements of the CRA are:

Shipping software free from known exploitable vulnerabilities
Shipping in a secure by default state
Ensuring that vulnerabilities can be patched easily
Making security updates available “for a minimum of 10 years or the remainder of the support period, whichever is longer”
Notifications of actively exploited vulnerabilities within 24 hours and general vulnerabilities within 72 hours both of which should include plans for mitigation and workarounds

The good news is that the CRA in 2025 is much improved from its initial incarnations in 2022. Those were characterized by one participant in the discussions as a “near-death experience for open source.” Indeed, without some of the current carveouts for open source developers, one plausible and even likely outcome would have been geo-fracturing of open source, specifically via the introduction of licenses that prohibited the usage of projects within the EU’s borders. That horrific outcome is potentially still on the table, but we’ll come back to that later.

For now it’s enough to know that open source foundations like Apache, Eclipse, Linux, Mozilla and many others all lobbied hard on behalf of open source and its developers to try and protect them from some of the act’s more far reaching requirements and penalties – it’s almost, as an aside, as if open source foundations could be considered helpful. Their collective efforts have tempered some of the CRA’s more problematic provisions from an open source perspective, but questions still remain, and a host of implementation details and specifications have yet to be finalized so evaluating the act is challenging.

As an example, many of these details are currently being worked on in the Open Regulatory Compliance Working Group (ORC WG), which has a very useful FAQ here. It defines the new term “open source steward,” codified for the first time in this document, and its responsibilities. When it comes to discussing how a steward can demonstrate that it’s met its reporting and attestation obligations, or what happens if it does not, the FAQ’s answer is “No answer yet.” Likewise, it’s clear that if you’re the creator of an open source project but do not monetize it, you are under no obligations under the CRA. If you monetize it, though, the act’s implications are less clear, specifically the thresholds at which obligations are triggered, as the current wording includes legally vague conditions like “the intention of making a profit.” Mere contributors to open source, at least, are specifically exempt from CRA requirements.

If the implications for open source are less dire than in the initial draft, there is one thing that is inarguable: the CRA is a veritable five alarm fire for manufacturers.

As described above, manufacturers currently rely heavily on a wide variety of open source projects to produce devices of all shapes and sizes. From databases to operating systems to runtimes, open source is the foundation on which everything from baby monitors to cars to lunar rovers rests. Effectively zero manufacturers have commercial relationships with the producer of every project, framework or library they’re incorporating, which means that they are likely to struggle with some of the reporting and attestation requirements. And the penalties if they fall short are quite severe.

The penalty for non-compliance, for example, is a fine of up to 15,000,000 EUR or up to 2.5% of its total worldwide annual turnover for the preceding financial year, whichever is higher. The penalty for incomplete or inaccurate information, meanwhile, is a fine of up to 5,000,000 EUR or up to 1% of its total worldwide annual turnover for the preceding financial year, whichever is higher.
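
To make the “whichever is higher” arithmetic concrete, here is a minimal sketch in Python. It uses only the caps quoted above; the tier labels, the function name and the sample turnover figures are hypothetical, and it illustrates the ceiling on a fine rather than how any regulator would actually set one.

```python
from decimal import Decimal

# Caps as quoted in the text: (fixed cap in EUR, share of worldwide annual turnover).
# The tier labels are hypothetical names chosen for this sketch.
PENALTY_TIERS = {
    "non_compliance": (Decimal("15000000"), Decimal("0.025")),
    "inaccurate_information": (Decimal("5000000"), Decimal("0.01")),
}


def max_fine_eur(tier: str, annual_turnover_eur: Decimal) -> Decimal:
    """Return the upper bound on a fine: the higher of the fixed cap
    and the given share of the preceding year's worldwide turnover."""
    fixed_cap, turnover_share = PENALTY_TIERS[tier]
    return max(fixed_cap, annual_turnover_eur * turnover_share)


# Hypothetical manufacturers: one with 100M EUR turnover, one with 2B EUR.
small = max_fine_eur("non_compliance", Decimal("100000000"))
large = max_fine_eur("non_compliance", Decimal("2000000000"))
```

On these numbers the percentage only overtakes the fixed figure once turnover passes 600,000,000 EUR for non-compliance (15,000,000 / 0.025) and 500,000,000 EUR for inaccurate information (5,000,000 / 0.01). Below those thresholds the fixed figure is the ceiling, which is why the exposure is material even for relatively small manufacturers.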

Based on the financial downsides here, optimistic observers are concluding that the CRA could be a game changer in open source economics. As companies digest the potential penalties involved, they will be obligated to establish commercial relationships with open source projects they currently rely on at no cost. That means more money going from vendors relying on open source to those producing it, which would be a boon for developers. It also raises the question of what happens to product prices when manufacturers are compelled to pay for software they have to date consumed at no cost, but that’s outside the scope of this exercise.

Pessimists evaluating the potential impacts of the CRA on open source software, on the other hand, see a world in which greater commercial interest and monetization is more than offset by a sea of manufacturers flooding maintainers of popular projects with requests – or more likely, given the penalties involved, demands – for project related services to meet the CRA mandated reporting obligations. It also could result in less open source software overall as manufacturers bring some software back in house, or it could produce the aforementioned geo-fracturing as open source developers who do not wish to have anything to do with the CRA either abandon their projects – which the aforementioned ORC WG FAQ felt compelled to discourage – or attempt to prohibit usage of their software within the EU’s jurisdiction.

The CRA’s requirements notably do not take effect until 2027, which is perhaps why it has received so little attention to date. But given the scale and scope of the effort required to comply here, which dwarf those made for GDPR and are perhaps more comparable to Y2K remediation efforts, any manufacturer not already planning for the CRA is behind and likely to be facing a mad scramble in the years ahead.

AI Tooling, Evolution and The Promiscuity of Modern Developers

Stephen O'Grady — Wed, 09 Jul 2025 13:05:24 +0000

Zhixin Sun, Fangchen Zhao, Han Zeng, Cui Luo, Heyo Van Iten, Maoyan Zhu, CC BY 4.0, via Wikimedia Commons

Historically, there have been two constants with developer tools. First, that their users were loyal to them. This was in part attributable to simple baby duck syndrome, but there were practical considerations as well such as keybindings and shortcuts. Many developers were unwilling to invest in retraining their muscle memory to an entirely different context and set of commands, and thus stuck with their text editor or IDE of choice even as a given tool aged and began to lag from a feature standpoint. There were exceptions, of course: Sublime Text attracted a sizable number of former Emacs and vi users, as did VS Code after it. But in general, migrations away from popular tools were the exceptions that proved the rule.

The second thing that has been a given for developer tools is that they were free. Those with back problems or who need reading glasses to read a menu might object citing examples like Borland, but the reality is that it’s been decades since developers needed to find real budget to access quality developer tooling. The aforementioned Emacs and vi text editors were free, as was VS Code, and IDEs would soon follow. The open sourcing of NetBeans in June of 2000 by Sun was followed by the formation of the consortium a year later by IBM that eventually became the Eclipse Foundation, which meant developers not only had a free IDE at their disposal, but a choice between them. This industry trend was so strong, in fact, that up until very recently it was effectively impossible for startups in the developer tooling space to attract venture funding. When the competition is both high quality and free, the returns on invested capital are far from certain.

Fast forward to four years ago last month.

On June 29, 2021 – a little less than a year and a half before OpenAI launched ChatGPT – GitHub introduced a brand new product called Copilot. Driven by talent acquired with the Semmle team, Copilot was regarded as a revelation at the time. There were precedents, to be sure: Tabnine, for one, has roots going back to 2013. But the combination of Copilot’s unrestricted access to the corpus of data that is GitHub and its timing proved transformational. With AI just beginning to accelerate in the wake of the publication of Google’s “Attention is All You Need” paper – which introduced the transformer architecture that so much of today’s AI relies on – Copilot lit the software world on fire, triggering dreams of hyper-productive software engineers while simultaneously stoking fears of wide scale developer unemployment.

GitHub was the first company in decades to buck precedent, and prove conclusively that it was possible – given the right product – to charge for developer tooling. Within two years, in fact, it was a $100M ARR business – an unimaginable figure given how reluctant developers and enterprises alike had been to pay literally anything for the primary tool of a developer’s trade.

If the second foundational assumption was shattered, however, surely the first would hold. It was widely assumed that developers, who already had a long term affinity for GitHub itself, would demonstrate their characteristic loyalty to the coding assistant they had first imprinted on.

Except they did not, and do not.

What GitHub did in two years, in fact, Cursor did in 12 months: a year in, the company – with reportedly zero marketing – hit a hundred million run rate. Many of its users were former – and potentially future, as we’ll come back to – Copilot users.

Almost everything we knew, then – or thought we knew – about the developer tools space turned out to be wrong. Promiscuity has replaced loyalty, but the good news is that the budget is no longer anchored to zero dollars. There is no single reason for these developments, as there are a number of contributing factors.

Arguably the most important is the degree to which AI is inherently a transformational technology. With its ability to ingest, process and act on natural language, for example, inputs are often now a prompt or a spec – neither of which depends much on keybindings or shortcuts. AI has also heralded a massive era of experimentation as vendors and projects seek creative new ways to apply the technology to the task of developing software. There is, at present, no consensus, no dominant approach, and there may never be. Some developers prefer the more free-wheeling “vibe code” approach via prompts; others prefer the more deliberate spec driven development – in many respects emulating the divide between authors of fiction who go by the seat of their pants versus those who rely on predetermined plots. In some cases developers want to be gradually stepped through proposed steps or changes; in others they just want the machine to come back when it’s done. Some tools merely propose changes; others, like the recently relaunched Aboard, will combine development with a full stack including a database. Some tools retain the UI elements of traditional IDEs; others are nothing more than a text field.

In short, we are in the midst of a Cambrian explosion of developer tools, and a dizzying array of approaches are currently being tested for their evolutionary fitness. Consider even an abbreviated, absolutely non-exhaustive list of related tools: Aboard, Bolt, Cline, Copilot, Cursor, ChatGPT / Codex, Claude / Code, Gemini / CLI, Factory, Lovable, Poolside, Replit, Same.dev, vibes.diy, v0, watsonX and Windsurf. Not all of these will succeed, and indeed some argue that all of these are doomed because the economic footing they’re built on is fatally unsound. That argument is built on two core assumptions, however: that vendor costs will never come down and that user costs can not be raised – neither of which seems entirely safe. More likely is that some of these options emerge and fundamentally change the way the industry builds software moving forward. Others, meanwhile, will be abandoned as dead ends in a manner consistent with both biological evolution and technical innovation. Developers, regardless, are far more willing to experiment with new tools, because they are fundamentally differentiated from one another in ways that past generations of developer tools have typically not been.

Also fueling the willingness to flit from tool to tool are cost and token inventory concerns. While developers are now objectively willing to pay for tools, they still appreciate free tiers and will use up whatever resources are made available to them at no cost. For paid plans, meanwhile, developers are frequently outstripping their allotted inventory of tokens at their given paid tier, and simply move on to the next tool with available credits. Whatever their tooling preferences, therefore, in some cases costs lead them to deploy multiple distinct developer tools in development on the same application – again, a practice which would have been unthinkable even a few short years ago.

Lastly, there is the firmly ingrained idea amongst developers that these tools are useful. The organizational metrics may argue otherwise, but a clear majority of individual developers feel more productive. There are, again, many who would argue that the tools are inherently unusable, inherently uneconomic, inherently immoral or some combination of all three. But that is, at this point, the minority opinion and as stated eloquently here it seems extremely unlikely the tools will be uninvented. As such, developers will both keep using the tools and be willing – if reluctantly – to pay for them, likely even if the costs escalate, which creates the worrying potential for a greater economic divide. To that end, my colleague recently reported that, in contrast to some skeptical enterprises that balked at the idea of paying $20-$40 a month per developer, a developer acquaintance of his stated that anyone not spending hundreds of dollars a month on AI tooling was not a serious developer. Which, again, raises the spectre of haves and have nots. A world in which the best developer tools cost nothing was a world with fewer barriers to entry, after all.

But if it’s clear that the rules have changed with respect to developer tools, the implications of this are more opaque. A few conclusions suggest themselves, however.

It’s Not Too Late: if it wasn’t too late for Cursor in the wake of Copilot’s explosive growth, it’s not too late for the next Cursor. Which, given that that was Windsurf, which was valued at $3B, seems to demonstrate the point adequately. There will remain opportunities for these businesses for the foreseeable future. There is room for experimentation, for user acquisition and for vendors that charge for developer tools. So while rumors suggest as one example that AWS has a new tool on the way – purportedly called Kiro – its window would still be open. There is also, importantly, opportunity for products that have “lost” users to gain them back, as developer promiscuity cuts both ways.
Partnerships Will be Important: underappreciated currently, at least as far as enterprise usage is concerned, is the white space that remains. All of the tools take novel approaches to accelerating software development. The majority, however, are narrowly focused on some aspect of the application development process, and like GitHub once upon a time, leave anything else as a problem to be solved by some combination of customers and partners. Tool vendors and downstream build, test, observation and deployment targets alike would be wise to start integrating to close the gaps in their developer experience. Importantly, however, this will require AI companies willing to engage with third parties, something many of them have seemed too busy to do to date.
Lack of an Approach Consensus Will Slow Enterprise Adoption: speaking of gaps in the developer experience, after the publication of the linked piece, RedMonk heard from dozens of organizations who perceived the same issue and were attacking the problem with a new product and/or company. The challenge was that they all took slightly different approaches to addressing the developer experience gap. As a result, enterprises struggled to compare the proverbial apples to oranges and the market for tools that would impact the problem lagged. AI tooling will be less susceptible to this, but the sheer variety of different approaches will suppress adoption to a degree as businesses are forced to wade through the variety of approaches the tools represent in an effort to decide which will be most impactful for their particular needs.

Evolution, at its core, is always a messy, non-linear process, and AI tooling will be no exception. But it inexorably hammers, reshapes and refines models, producing output that is ever more fit for purpose. And in that, too, AI will be no exception.

Disclosure: AWS (Kiro), GitHub (Copilot), Google (Gemini / CLI) and IBM (watsonX) are RedMonk customers. Aboard, Anthropic (Claude), Bolt, Cline, Cursor, Factory, Lovable, OpenAI (ChatGPT), Poolside, Replit, Same.dev, Tabnine, vibes.diy, Vercel, and Windsurf are not currently RedMonk customers.

The RedMonk Programming Language Rankings: January 2025

Stephen O'Grady — Wed, 18 Jun 2025 17:10:20 +0000

This iteration of the RedMonk Programming Language Rankings is brought to you by Amazon Web Services. AWS manages a variety of developer communities where you can join and learn more about building modern applications in your preferred language.

Even by our standards, dropping the Q1 programming language rankings the same month we run the Q3 numbers is quite the delay. While the usual travel and school vacation delays have applied, however, the drawn out process in this case is deliberate on our part. As has been discussed in recent iterations of these rankings, the arrival of AI has had a significant and accelerating impact on Stack Overflow, which comprises one half of the data used to both plot and rank languages twice a year.

My colleague Rachel has been studying this impact in detail and has more on it here, but for our purposes it’s enough to know that Stack Overflow’s value from an observational standpoint is not what it once was, and that has a tangible impact as we’ll see. Still to be determined on our end is whether Stack Overflow should continue to be used, and if not what a reasonable alternative might be. Stay tuned for more details on that front when we get to the Q3 rankings, which will presumably be in Q1 of next year.

In the meantime, however, as a reminder, this work is a continuation of the work originally performed by Drew Conway and John Myles White late in 2010. While the specific means of collection has changed, the basic process remains the same: we extract language rankings from GitHub and Stack Overflow, and combine them for a ranking that attempts to reflect both code (GitHub) and discussion (Stack Overflow) traction. The idea is not to offer a statistically valid representation of current usage, but rather to correlate language discussion and usage in an effort to extract insights into potential future adoption trends.

Our Current Process

The data source used for the GitHub portion of the analysis is the GitHub Archive. We query languages by pull request in a manner similar to the one GitHub used to assemble the State of the Octoverse. Our query is designed to be as comparable as possible to the previous process.

Language is based on the base repository language. While this continues to have the caveats outlined below, it does have the benefit of cohesion with our previous methodology.
We exclude forked repos.
We use the aggregated history to determine ranking (though based on the table structure changes this can no longer be accomplished via a single query.)
For Stack Overflow, we simply collect the required metrics using their useful data explorer tool.
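
The combination step described above can be sketched roughly as follows. This is an illustration only: the averaging of the two ranks is an assumption made for the example (the exact weighting is not spelled out here), the function name `combine_rankings` is hypothetical, and the input ranks are invented. Languages missing from either source are dropped, and tied languages share a rank in the same way the published list does (7, 7, 9).

```python
from statistics import mean


def combine_rankings(github: dict[str, int], stack_overflow: dict[str, int]) -> list[tuple[int, str]]:
    """Combine two per-source rankings into one list, sharing ranks on ties."""
    # Only languages observable in both sources are eligible.
    shared = github.keys() & stack_overflow.keys()

    # Assumption: the combined score is the mean of the two ranks (lower is better).
    scores = {lang: mean((github[lang], stack_overflow[lang])) for lang in shared}

    # Sort by score, then by name, so the output is deterministic.
    ordered = sorted(scores, key=lambda lang: (scores[lang], lang))

    combined = []
    previous_score = None
    rank = 0
    for position, lang in enumerate(ordered, start=1):
        # Competition ranking: tied languages share a rank, and the next
        # distinct score skips ahead to its position in the list.
        if scores[lang] != previous_score:
            rank = position
            previous_score = scores[lang]
        combined.append((rank, lang))
    return combined


# Hypothetical input ranks, invented purely for illustration.
github_ranks = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
stack_overflow_ranks = {"A": 1, "B": 3, "C": 2, "D": 4}

for rank, lang in combine_rankings(github_ranks, stack_overflow_ranks):
    print(rank, lang)
```

In this toy input, "E" appears in only one source and is excluded, while "B" and "C" end up with identical scores and share a rank.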

With that description out of the way, please keep in mind the other usual caveats.

To be included in this analysis, a language must be observable within both GitHub and Stack Overflow. If a given language is not present in this analysis, that’s why.
No claims are made here that these rankings are representative of general usage more broadly. They are nothing more or less than an examination of the correlation between two populations we believe to be predictive of future use, hence their value.
There are many potential communities that could be surveyed for this analysis. GitHub and Stack Overflow are used here first because of their size and second because of their public exposure of the data necessary for the analysis. We encourage, however, interested parties to perform their own analyses using other sources.
All numerical rankings should be taken with a grain of salt. We rank by numbers here strictly for the sake of interest. In general, the numerical ranking is substantially less relevant than the language’s tier or grouping. In many cases, one spot on the list is not distinguishable from the next. The separation between language tiers on the plot, however, is generally representative of substantial differences in relative popularity.
In addition, the further down the rankings one goes, the less data is available to rank languages by. Beyond the top tiers of languages, depending on the snapshot, the amount of data to assess is minute, and the actual placement of languages becomes less reliable the further down the list one proceeds.
Languages that have communities based outside of Stack Overflow such as Mathematica will be under-represented on that axis. It is not possible to scale a process that measures one hundred different community sites, both because many do not have public metrics available and because measuring different community sites against one another is not statistically valid.

With that, here is the first quarter plot for 2025.

1 JavaScript
2 Python
3 Java
4 PHP
5 C#
6 TypeScript
7 CSS
7 C++
9 Ruby
10 C
11 Swift
12 Go
12 R
14 Shell
14 Kotlin
14 Scala
17 Objective-C
18 PowerShell
19 Rust
20 Dart

If you’re tracking from our last iteration of the rankings – and Rachel has the entire history of the Top 20 rankings charted here – the only change within our Top 20 languages is Dart dropping from a tie with Rust at 19 into sole possession of 20. In the decade and a half that we have been ranking these languages, this is by far the least movement within the top 20 that we have seen. While this is to some degree attributable to a general stasis that has settled over the rankings in recent years, the extraordinary lack of movement is likely also in part a manifestation of Stack Overflow’s decline in query volume. As that long time developer site sees fewer questions, it becomes less impactful in terms of driving volatility on its half of the rankings axis, and potentially less suggestive of trends moving forward. As mentioned above, we’re not yet at a point where Stack Overflow’s role in our rankings has been deprecated, but the conversations at least are happening behind the scenes.

With that, some results of note:

TypeScript (6): even acknowledging the general lack of movement within the Top 20, it’s notable that TypeScript has effectively stalled just outside the Top 5. On the one hand, it can piggyback on the ubiquity of JavaScript while offering important safety provisions, but on the other, it has a reputation of not scaling particularly well. This reputation, in fact, has led Microsoft to reimplement the TypeScript compiler and tools in Go. The question now is whether this reimplementation will lead to greater performance, leading to greater adoption and more usage, or whether the fact that Microsoft felt it needed to be reimplemented in the first place could throw shade on the language. It will be interesting to watch, assuming we have data enough to observe any potential impact.
Kotlin (14) / Scala (14): both the JVM-based languages held their gains from our last ranking and it’s unclear what their prospects are for moving up more significantly. In 2015, when Go entered our rankings at 17, Scala was at 14 and jumped up briefly to 11 two years later. In 2023, however, Go passed Scala – having already been ranked above Kotlin – and has maintained that role ever since. And with Go finding new fans in companies like Microsoft and Rust making gains among other server side workloads, particularly those with security concerns, Kotlin and Scala’s growth paths are not assured.
Dart (20) / Rust (19): while Dart technically dropped one spot, that far down the rankings the actual differences are marginal at best. These two languages, which have little to nothing in common and are aimed at very different users and workloads, have tended to move in lockstep and this quarter’s run does not represent much of an exception in that regard.
Ballerina (64) / Bicep (79) / Grain / Moonbit / Zig (86): among the “languages we’re paying attention to” set, there was little more movement than within our Top 20, and for the most part movement among them was down. Grain and Moonbit remained unranked, while Ballerina dropped from 61 to 64, Bicep dropped from 78 to 79. Zig, however, did manage to jump, if only one spot from 87 to 86 – it probably does not hurt that Mitchell Hashimoto is a major fan. It is worth noting for these emerging languages, however, that they may be disproportionately impacted by Stack Overflow’s decline. In every case where the languages are ranked, they perform better within our GitHub rankings than they do within Stack Overflow’s: 62 vs 66 for Ballerina, 69 vs 73 for Bicep and 70 vs 83 for Zig. In Zig’s case in particular, then, it is possible that faster growth in code as measured by GitHub is being dragged down by the steep decline in query volume on Stack Overflow. Which is yet another reason why we’re carefully evaluating our options moving forward, but in the meantime we’ll keep all of these languages on our “to watch” list.

Credit: My colleague Rachel Stephens wrote the queries that are responsible for the GitHub axis in these rankings. She is also responsible for the query design for the Stack Overflow data.

Beyond Code: APIs as the Next OSS Battleground

Stephen O'Grady — Mon, 09 Jun 2025 19:55:36 +0000

On August 13th, 2010, Oracle sued Google over copyright and patent infringement claims relating to the reimplementation of the Java runtime within its Android platform. The suit took over a decade to resolve, and had several major twists and turns, but ultimately the Supreme Court decided in Google’s favor on April 5th, 2021. Among the items at stake in this trial was the question of whether APIs were copyrightable, which is another way of saying the immediate future of the technology industry hung in the balance.

In its decision, the Supreme Court did not declare APIs immune from copyright, but rather held that Google’s use of the Java APIs constituted fair use. While it was not a total victory for those who would see APIs explicitly walled off from such concerns, it significantly raised the bar for legal challenges based on competitive usage of APIs. This was immediately relevant, as a loss would have almost certainly led to a widespread chilling effect across APIs industry-wide.

But Google v Oracle is also critical to what may be the next front in the ongoing conflict between open source and commercial open source: APIs.

Those who have tracked popular open source projects such as PostgreSQL have likely heard a familiar observation amongst authors of the original project: that a database, for example, with Postgres API compatibility is not the same as a Postgres database. Databases that offer Postgres compatibility like AWS’ Aurora or Google’s AlloyDB, these fans argue, may not be fully compatible because of slight differences between the implementations, feature additions or omissions and more.

What cannot be argued is that the API for a large, successful and widely adopted software project is an enormously valuable asset. What might be argued is that it is possible, in certain cases, that the API is more valuable than the underlying code it represents. The underlying code for an API can be, and has been, reimplemented in clean room settings, while the API must be a fixed point for developers.
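
To make the distinction between an API and the code behind it concrete, here is a minimal sketch. Every name in it (`DocumentStore`, `OriginalStore`, `ReimplementedStore`, `register_user`) is hypothetical and invented for illustration; none corresponds to any real database's interface. It shows two independently written implementations sitting behind one fixed API, with client code that cannot tell them apart.

```python
from typing import Optional, Protocol


class DocumentStore(Protocol):
    """Hypothetical API surface that client code is written against."""

    def insert(self, key: str, doc: dict) -> None: ...
    def find(self, key: str) -> Optional[dict]: ...


class OriginalStore:
    """Stand-in for the original project's implementation."""

    def __init__(self) -> None:
        self._docs: dict[str, dict] = {}

    def insert(self, key: str, doc: dict) -> None:
        self._docs[key] = dict(doc)

    def find(self, key: str) -> Optional[dict]:
        return self._docs.get(key)


class ReimplementedStore:
    """Independent implementation: same methods, different internals."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, dict]] = []

    def insert(self, key: str, doc: dict) -> None:
        # Drop any existing row for this key, then append the new one.
        self._rows = [row for row in self._rows if row[0] != key]
        self._rows.append((key, dict(doc)))

    def find(self, key: str) -> Optional[dict]:
        for stored_key, doc in self._rows:
            if stored_key == key:
                return doc
        return None


def register_user(store: DocumentStore, name: str) -> Optional[dict]:
    """Client code depends only on the API, not on who implemented it."""
    store.insert(name, {"name": name})
    return store.find(name)


results = [register_user(store, "ada") for store in (OriginalStore(), ReimplementedStore())]
print(results[0] == results[1])
```

The two classes share no code, yet `register_user` runs unchanged against both, which is the sense in which the API, rather than the implementation, is the fixed point.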

With large projects that are maintained by multiple third parties such as Postgres, the potential friction from API reimplementations is minimal. By virtue of being a project worked on by many commercial vendors, there is no real exclusivity offered or claimed by the API.

The dynamics for single entity projects or open source projects developed primarily or solely by a single vendor, however, are quite another matter.

For many years now, open source projects and database projects specifically have developed a pattern or lifecycle from a licensing standpoint. Initial development is conducted under a typically permissive open source license, in which control is traded for usage and distribution growth. Once certain usage thresholds are met, and attract commensurate funding – venture or otherwise – permissive licenses are discarded in favor of licenses offering much stronger protections, up to and past the edge of what the definition of open source permits. These licensing “rug pulls” may have eased somewhat, in that commercial vendors appear to be pulling back from source available licenses and finding an equilibrium around the strongest copyleft license in the AGPL, but the justification is the same: exclusivity.

In short, whether it’s the AGPL or non-open source, source available alternatives, the end goal for relicensing is to try and capture the vast majority or entirety of the revenue associated with a given open source project rather than share it with other vendors, particularly large hyperscalers. Many source available licenses explicitly forbid other companies from monetizing the licensed code. The AGPL, meanwhile, does not forbid third parties from monetizing a given codebase, but it does require them to share any changes or fixes they make – a practice that many avoid as a rule. Thus a project can be technically open, but practically speaking only monetized by the original author of a given project.

But what about their APIs?

In January of 2019, AWS released a long suspected new database, DocumentDB. It was, as might be guessed, a document database, and one specifically that offered some MongoDB compatibility. MongoDB had, one quarter prior, relicensed its database from the AGPL to the much more expansive SSPL. This was ostensibly an effort to thwart competition from the likes of AWS, but the timing made it clear that AWS wanted no part of even the less protective AGPL and had instead done a clean room reimplementation of MongoDB’s API to offer a datastore theoretically compatible with Mongo, but built on their own stack not subject to the requirements of the AGPL – or the SSPL for that matter.

This all having taken place more than two years before the landmark Google v Oracle decision, however, AWS was very careful to state that its API compatibility was only up to the version last licensed as the AGPL. No one at the time had any real legal certainty on whether APIs were copyrightable and thus proprietary.

In the years since, as discussed, the industry does not have certainty, precisely, but it has made assumptions in the wake of the trial, one of them being that APIs are for all intents and purposes non-proprietary.

Which brings us to this news from late May, in which MongoDB announced that they had asked FerretDB to “stop engaging in unfair business practices.” Their claims are based on assertions that Ferret:

Misleads and deceives developers by falsely claiming that its product is a “replacement” for MongoDB “in every possible way” and
Has infringed upon MongoDB’s patents.

Two things stand out immediately. First, that Mongo’s claims ultimately reduce to trademark and patent infringement matters, and second that neither APIs nor copyright is mentioned even once. Setting aside the relative merits or lack thereof of these claims, which are best left to those with legal backgrounds, courts or both, the important question is whether this case is a one off or the shape of things to come.

Commercial open source projects have struggled to maximize their revenue exclusivity for years, primarily through the aforementioned series of relicensing efforts. Those efforts, however, are based on copyright as it applies to source code. If copyright doesn’t apply to APIs, or if the bar for fair use is low enough to be easily achievable from a legal standpoint, that may suggest a future in which competitive third parties “island hop” the source code and go straight for the APIs. Given the size and usage base of some of the commercial open source projects, the economic incentives to do so are substantial indeed. APIs are ultimately a door for developers, and if that door can open to your products as easily as the original author’s, that will likely be of interest regardless of what the license on the original source code might be.

The upside to the Google v Oracle ruling was clear, in that an industry in which every last programming interface was considered proprietary would be a tectonic, systemic shock. The downside, though, is that we now have to hope that we don’t see a resurgence in interest in “embrace, extend, extinguish” efforts from large third parties trying to co-opt open source projects and user bases.

Either way, it seems likely that the next wave of conflict won’t be over licenses pertaining to code, but the APIs they implement.

Disclosure: AWS, Google, MongoDB and Oracle are RedMonk customers. FerretDB is not currently a RedMonk customer.