Stories by Benjamin Cane on Medium

Your coding agent is missing one thing: architectural context

Benjamin Cane — Thu, 28 May 2026 00:00:14 GMT

Photo by Sven Mieke on Unsplash

Your coding agent is missing one thing: architectural context.

I was a big believer in Architecture Decision Records (ADRs) long before coding agents came along.

Documenting decisions gives engineers context:

Why is the system designed a certain way? What constraints existed at the time? What tradeoffs were made?

That context matters. It also matters for agents.

🤖 Agents Need Context Too

Unlike human engineers, agents don’t get context from hallway conversations, shadowing others, or tribal knowledge.

They only know what you capture. The best way to capture architectural context? Write it down as a decision record, and make it accessible to agents.

The only question is, what’s the best way to make decision records accessible?

🏗️ Option 1: MCP Server

If your ADRs live in a wiki or documentation system, you can expose them through an MCP server.

This works well when documentation is spread across teams or multiple systems that need to be aggregated.

You want a unified interface for agents. MCP is a good approach, but it comes with some infrastructure overhead.

🧱 Option 2: Keep ADRs in Git

I’ve long preferred storing ADRs in Git.

It provides versioning, review workflows, automated validation, and is where engineering work happens. Storing ADRs in Git, ideally alongside your code, is the fastest way to give agents usable context.

The challenge is that architecture often spans multiple services and repositories. So many teams centralize their architecture records in a single repository, which is not where your code lives.

🌉 Bridging the Gap

Most modern coding agents let you include additional directories or sources at runtime, either through slash commands or CLI options.

That means you can: open your codebase, include your architecture repository, and run the agent with context.

Just adding another directory gives your agent an understanding of system constraints, architecture decisions, technology choices, and surrounding systems. These are not things an agent can reliably infer from code alone.
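Before pointing an agent at an architecture repository, it can help to check what the agent will actually find there. Here is a minimal sketch, assuming ADRs are Markdown files whose first heading is the decision title. The `list_adrs` helper and the directory layout are hypothetical, not part of any agent's tooling:

```python
from pathlib import Path


def list_adrs(repo_root: str) -> list[tuple[str, str]]:
    """Return (relative path, title) for every Markdown file under repo_root."""
    root = Path(repo_root)
    found = []
    for path in sorted(root.rglob("*.md")):
        title = path.stem  # fall back to the file name if there is no heading
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.startswith("#"):
                title = line.lstrip("#").strip()
                break
        found.append((path.relative_to(root).as_posix(), title))
    return found
```

If the output is empty or the titles are unclear to you, they are unlikely to be clearer to an agent.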

💡 Final Thought: Why Context Matters

With architectural context, agents produce code that aligns with your system.

When engineers understand the system end to end, they make better decisions. The same applies to agents.

If you want better results, give better context.

Originally published at https://bencane.com on May 28, 2026.

Your coding agent is missing one thing: architectural context was originally published in ITNEXT on Medium, where people are continuing the conversation by highlighting and responding to this story.

Health-check the listener your gRPC traffic actually uses

Benjamin Cane — Thu, 21 May 2026 00:00:30 GMT

Photo by Joshua Chehov on Unsplash

One of the easiest ways to break a gRPC service in production is health-checking the wrong listener.

A common issue I see teams run into when adopting gRPC is leaving readiness checks pointed at their HTTP listener while production traffic actually flows through gRPC.

Everything looks fine until it suddenly doesn’t.

🤔 The Problem

Many gRPC services run two listeners: one for HTTP and one for gRPC.

The HTTP listener often exists for metrics, liveness checks, and management APIs. Teams moving to gRPC often reuse the HTTP health checks they set up for their REST-based services.

It’s generally a good idea to reuse what you already have, but in this case, it can be misleading.

⚠️ Health-Check What Serves Traffic

If customers connect through gRPC, your first readiness check should too.

Your HTTP listener can be perfectly healthy while the gRPC listener is misconfigured, hung, or otherwise failing.

Meanwhile, Kubernetes, load balancers, and dashboards might all show green. ✅

This happens more often than people think.

🩺 Better Ways to Monitor gRPC

There are better ways to monitor your gRPC service.

gRPC Health Probe ✅

Use a real gRPC health check request against the listener.

This validates the actual serving path and confirms the service can respond over gRPC.

A strong default option.

Build a Status gRPC Service 📋

Expose an internal status method in your gRPC API.

This gives you flexibility to check deeper dependencies, such as database readiness, downstream systems, internal state, and maintenance toggles.

It’s more work, but more control.
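The logic behind such a status method can be small. Here is a minimal sketch of the aggregation only, with the gRPC wiring left out. The `readiness` helper and the check names in the usage note are hypothetical:

```python
from typing import Callable


def readiness(checks: dict[str, Callable[[], bool]]) -> dict:
    """Run every dependency check and report an overall status."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a check that raises counts as failed
    return {"ready": all(results.values()), "checks": results}
```

A status method could call something like `readiness({"database": db_ping, "maintenance_off": not_in_maintenance})` and translate the result into its response, where `db_ping` and `not_in_maintenance` stand in for whatever checks your service needs.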

Use a Single Shared Listener ☝️

Because gRPC runs on top of HTTP/2, many languages and frameworks can serve HTTP and gRPC traffic on the same listener.

That means an HTTP health endpoint may be acceptable because it checks the same network path. It still does not fully validate gRPC behavior, but it is better than checking an entirely separate listener.

🧠 Final Thoughts

gRPC is awesome.

But making a service production-ready means revisiting configurations inherited from REST services:

Health checks
Load balancing behavior
Connection management
Contracts
Operational tooling

None of these changes are difficult. They’re just easy to miss.

Originally published at https://bencane.com on May 21, 2026.

Weighted load balancing has saved me more times than I can count

Benjamin Cane — Thu, 14 May 2026 00:00:51 GMT

Photo by Julian Hochgesang on Unsplash

Weighted load balancing has saved me more times than I can count.

Many engineers think of load balancers as simple traffic distributors.

Send requests across servers. Keep systems available. Move on.

But one of their most valuable capabilities is often overlooked: weighted load balancing.

🤨 What Is Weighted Load Balancing?

From enterprise hardware appliances to software load balancers like HAProxy and Envoy Proxy, nearly all modern load balancers support some form of weighted routing.

Start with a standard balancing algorithm, such as round-robin. Then apply weights so some targets receive more traffic than others.

For example, if two targets are weighted at 90 and 10, roughly 90% of traffic goes to one target and 10% to the other. If targets have equal weights, traffic is typically distributed evenly.
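To make the 90/10 example concrete, here is a minimal sketch of one way to apply weights on top of round-robin, often called smooth weighted round-robin. It is an illustration, not how any particular load balancer implements weighting, and the target names are hypothetical:

```python
def weighted_round_robin(weights: dict[str, int], requests: int) -> list[str]:
    """Pick a target for each request in proportion to its weight."""
    current = {target: 0 for target in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(requests):
        # Every target earns credit equal to its weight on each request.
        for target, weight in weights.items():
            current[target] += weight
        # The target with the most credit serves the request and pays it back.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks
```

Calling `weighted_round_robin({"old": 90, "new": 10}, 100)` sends 90 requests to `old` and 10 to `new`, spread through the sequence rather than bunched at the end. Shifting traffic is then a matter of changing the weights.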

Simple idea, critical feature.

🤔 Why It Matters

Weighted load balancing turns migrations from risky big-bang cutovers into small, adjustable dials.

Instead of flipping traffic all at once, you can gradually shift production traffic while observing behavior in real time.

That means a smaller blast radius, easier rollbacks, and safer production migrations.

🧰 What I Actually Use It For

I’ve rarely used weighted load balancing because one server had more capacity than another.

What I’ve used it for repeatedly is change management.

Ten years ago, to migrate from a legacy file transfer platform to a newer one, we used weighted load balancing to introduce the new platform gradually.

Six years ago, to control which transactions were routed to our old card payments platform versus the new platform, we introduced weighted load balancing in our global transaction router.

Last night, to run a canary deployment, we adjusted our service mesh using weighted routing via xDS.

Different eras, different platforms, same core concept.

🕰️ Standing the Test of Time

Weighted load balancing is not new. It has existed for a long time.

Foundational patterns often become the enablers for newer platform practices: canary releases, blue/green deployments, service mesh traffic shifting.

Many of those ideas rely on the same underlying capability: controlling traffic through percentages and weighting.

Good patterns tend to survive generations of technology.

🧠 Final Thoughts

Many software engineers will never build a load balancer. They will never configure one themselves.

But understanding what these systems can do is still an advantage.

Because migrations are often less about code and more about traffic control.

Writing good software isn’t enough. Knowing when and how to shift 1% of traffic can make or break a migration.

Originally published at https://bencane.com on May 14, 2026.

Weighted load balancing has saved me more times than I can count was originally published in ITNEXT on Medium.

YOLO Is a Terrible Strategy for Validating Production Changes

Benjamin Cane — Thu, 07 May 2026 00:00:17 GMT

Photo by bijesh regmi on Unsplash

YOLO is a terrible strategy for validating production changes.

How many times have you seen it?

Your platform is running smoothly. No alerts, no issues. Then suddenly, something breaks.

After digging in, you discover the cause: another system you depend on made a change, and that change broke your platform.

They didn’t notice it broke. You did, much too late…

How many times have you been the cause of another platform breaking?

🥶 Cold Reality

I wish the above scenario were rare, but it happens constantly across the technology industry.

It happens between internal teams, third-party integrations, and shared infrastructure teams.

These scenarios make you wonder, “How was that change validated?”

Maybe they tested it, and their validation had gaps. Maybe they did little validation at all. If any.

Either way, the result is the same: they validated their change with 100% of production traffic. Bad plan.

💡 Better Ways to Validate Changes

There are many ways teams can reduce production risk when rolling out changes, and the best teams combine the following approaches.

Canary Releases 🐤

I talk about canary deployments often.

Instead of moving 100% of traffic at once, move small percentages gradually and observe behavior closely.

That observation part matters. Look at error rates, latency changes (beyond normal platform warmup), resource spikes, and unexpected retries. All of these indicate customer impact.

Canary deployments are one of the best ways to reduce the blast radius of changes, identify problems quickly, and self-correct.
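The decision at each step of a canary can be expressed very simply. Here is a minimal sketch, assuming error rate is the only signal and that a 1% threshold and 10-point steps suit your platform. Both numbers and the `next_canary_weight` helper are hypothetical:

```python
def next_canary_weight(current: int, error_rate: float,
                       max_error_rate: float = 0.01, step: int = 10) -> int:
    """Return the next traffic percentage for the canary.

    Advances by `step` while the canary looks healthy and drops to 0
    (a rollback) as soon as the error rate exceeds the threshold.
    """
    if error_rate > max_error_rate:
        return 0
    return min(100, current + step)
```

A real rollout would look at latency, resource usage, and retries as well, but the shape stays the same: observe, then either advance or self-correct.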

Shadow Traffic 🪞

Traffic mirroring sends production traffic to a new version before routing live traffic there.

Responses are ignored, but you observe behavior and monitor the same signals you would with a canary release without sacrificing a customer request.

Synthetic Traffic 🤖

Synthetic traffic simulates user behavior continuously. It’s great for monitoring customer experience, but also a great way to validate new deployments.

Route synthetic traffic to upgraded instances first and verify behavior before moving real traffic. If it fails with synthetic traffic, it likely won’t survive real traffic.

Smoke Tests 😶‍🌫️

The classic approach. After deployment, run a small set of fast tests to confirm the platform is fundamentally working.

Smoke tests don’t need to be fancy; they can be shell scripts, API calls, read-only requests, a test file, or full end-to-end validation.

Their purpose is simple: to quickly catch obvious breakage.
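As an illustration of how small a smoke test can be, here is a minimal sketch that issues read-only requests against a freshly deployed service using only the standard library. The `smoke_test` helper, the base URL, and the paths are hypothetical:

```python
import urllib.error
import urllib.request


def smoke_test(base_url: str, paths: list[str], timeout: float = 3.0) -> dict[str, bool]:
    """Issue a read-only GET for each path; True means a 2xx response."""
    results = {}
    for path in paths:
        try:
            with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
                results[path] = 200 <= resp.status < 300
        except (urllib.error.URLError, OSError):
            # HTTP error statuses, refused connections, and timeouts all fail.
            results[path] = False
    return results
```

A deployment step could fail fast when `all(smoke_test(url, paths).values())` is false.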

🧠 Final Thoughts

Don’t think of the above methods as mutually exclusive choices. Combine them.

Some platforms I work on combine canary releases, shadow traffic, and synthetic traffic. Others use smoke tests plus canary releases.

The more layers of validation you have, the more likely you are to catch issues before your customers do. Because having your customers validate changes for you is a poor strategy.

Originally published at https://bencane.com on May 7, 2026.

YOLO Is a Terrible Strategy for Validating Production Changes was originally published in ITNEXT on Medium.

Deterministic routing is one of the most effective ways distributed systems reduce consistency…

Benjamin Cane — Thu, 30 Apr 2026 00:00:44 GMT

Deterministic routing is one of the most effective ways distributed systems reduce consistency problems at scale

Photo by Egor Myznik on Unsplash

Deterministic routing is one of the most effective ways distributed systems reduce consistency problems at scale.

It is a foundational technique used by many modern databases, caches, and large-scale platforms. Understand how it works and you can apply the same pattern in your own systems.

🤔 Understanding the Problem

At some point, every successful system hits the limits of a single database instance.

A single server can handle only so many connections, queries, and writes, and it has only so much storage, CPU, and memory. Even with the best hardware, performance eventually degrades. So systems scale horizontally.

Instead of sending all traffic to a single database server, requests are distributed across multiple nodes.

At the same time, resiliency matters. If one server fails and all data resides there, the outage can be severe.

So modern databases spread data across multiple nodes, availability zones, and regions.

Distributing load and data solves both capacity and resiliency problems. But it introduces another challenge.

How do you keep request behavior consistent when data is distributed across multiple systems?

⚠️ Why Replication Is Not Enough

Replication helps, but it does not solve every consistency problem.

Imagine a write lands on Server 1. Immediately after, a read request for the same data lands on Server 67. Will Server 67 have the latest version? Maybe, but often not.

Asynchronous Replication

With asynchronous replication, Server 1 will accept the write and replicate the data to other servers in the background. That means a follow-up read on any other node may return stale data.

Synchronous Replication

With synchronous replication, the write on Server 1 will wait for an acknowledgment from all replicas before returning a success. While this improves consistency guarantees, it increases latency.

The farther apart a replica is, the worse this gets. Local writes may be fast, but cross-region writes will be slow. Plus, is it really feasible to replicate data across every single node?

So the question becomes: How do you preserve consistency, without paying latency taxes?

🔀 Route Requests to the Data

A highly effective answer is deterministic routing.

Instead of moving data to where requests might land, move requests to where the data already exists.

If requests for the same key can go to the same node, you gain predictable ownership, reduced stale reads, lower coordination overhead, and easier horizontal scaling.

👨‍🏫 How Deterministic Routing Works

At a high level, the system needs a repeatable way to decide where requests should go.

A common approach is hashing.

A hash of user123 always goes to Node 7
A hash of user456 always goes to Node 42

As long as the same key produces the same result, requests can be consistently routed to the same owner. Many modern databases implement deterministic routing through techniques like consistent hashing, partition maps, and shard ranges.
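The simplest form of this is a stable hash taken modulo the number of nodes. Here is a minimal sketch, where the `route` helper and the node count are hypothetical:

```python
import hashlib


def route(key: str, node_count: int) -> int:
    """Map a key to a node index; the same key always yields the same index."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count
```

A stable digest is used here because Python's built-in `hash()` is salted per process for strings, which would send the same key to different nodes from different clients. Plain modulo also remaps most keys whenever the node count changes, which is the problem consistent hashing is designed to reduce.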

🗺️ Where Routing Logic Lives

Different systems solve routing in different places.

Client-side Routing

The client library knows the partition map and sends requests directly to the correct node. Used by many distributed caches and databases.

Proxy / Router Tier

A small router sits in front of nodes and forwards traffic appropriately. Useful when client behavior cannot be influenced.

Server-side Forwarding

Requests land anywhere, and the receiving node forwards internally to the owning node. This is simple for clients and avoids a proxy failure point, but it requires complex cluster discovery and health monitoring.

Each model has tradeoffs.

🧰 Routing Does Not Replace Replication

Deterministic routing is powerful, but not magic. What happens when the owning node is down? You still need replication.

Modern databases combine both: deterministic routing for performance and ownership, plus replication for durability and failover.
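One way to picture the combination is a preference list: the owning node first, followed by the nodes holding its replicas. Here is a minimal sketch, assuming replicas live on the next nodes in list order. The helper names, the replica count, and the placement rule are hypothetical and not taken from any specific database:

```python
import hashlib
from typing import Optional


def preference_list(key: str, nodes: list[str], replicas: int = 3) -> list[str]:
    """Owner first, then the next nodes in order as its replicas."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    count = min(replicas, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


def pick_node(key: str, nodes: list[str], healthy: set[str],
              replicas: int = 3) -> Optional[str]:
    """Route to the owner if healthy, otherwise to its first healthy replica."""
    for node in preference_list(key, nodes, replicas):
        if node in healthy:
            return node
    return None  # the owner and all of its replicas are down
```

Routing stays deterministic in the normal case, and a failed owner only shifts its keys to a node that already holds a copy of the data.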

🧠 Why This Matters Beyond Databases

Distributed databases use this approach, but it is not unique to them.

Deterministic routing can be used to solve: session ownership, user affinity, in-memory workflow coordination, work queue partitioning, and more.

I’ve used deterministic routing many times to solve load distribution and consistency problems.

At scale, the answer is not always more/better hardware. Consistency and availability problems are not always solved with replication alone.

Sometimes the best answer is simply to send the request to the right place.

Originally published at https://bencane.com on April 30, 2026.

Deterministic routing is one of the most effective ways distributed systems reduce consistency… was originally published in ITNEXT on Medium.

When you think of microservices, you probably think of centralized shared services

Benjamin Cane — Thu, 23 Apr 2026 00:00:27 GMT

When you think of microservices, you probably think of centralized shared services. But there’s another valid pattern that is rarely discussed

Photo by Frames For Your Heart on Unsplash

When you think of microservices, you probably think of centralized shared services. But there’s another valid pattern that is rarely discussed: running the same microservice inside multiple platforms.

🧩 How It Usually Works

Most microservice designs follow the same model:

Break systems into capabilities, teams, or functions
Deploy one shared service for each capability
Any platform that needs it calls that centralized service

That works well for many cases, but it’s not the only model.

🏗️ How We Got Here

Before microservices, many organizations used Service-Oriented Architecture (SOA).

Despite being labeled as antiquated, SOA and microservices are not that different. Both break down systems into capabilities that communicate with each other. The biggest difference is scope.

In SOA, a “Payments Service” might own:

Message parsing
Validation
Balance checks
Currency conversion
Settlement logic

Other SOA services would own “Users” or “Accounting”. Today, that payment service would be considered an entire platform, with each of those capabilities implemented as microservices within that domain.

Microservices are often the same idea as SOA, just at a more granular level.

🎯 Why Centralization Became the Default

One reason microservices gained traction was the need to avoid duplication. Capabilities were often rebuilt across multiple systems. For example, Currency Conversion is needed in Payments, Accounting, and many other platforms.

Duplication is not just wasteful; it creates logic drift, coordination overhead, and inconsistent outcomes across systems. Packaging that capability as a standalone service solved those problems: build once, reuse everywhere.

⚠️ The Downside of Centralization

In cell-based architectures, platforms are usually designed to be self-contained and failure-isolated. That means a mission-critical platform depending on a centralized service shared by other platforms can become a design smell. It introduces:

Cross-cell dependencies
Added latency
Shared failure domains
Complex failover scenarios

So teams, once again, solve these problems by rebuilding the same capability locally.

🔁 Another Option

Instead of rebuilding the capability each time, deploy the same microservice codebase inside multiple platforms. If both Payments and Accounting need a currency conversion service, deploy the same service within each platform.

It’s the same codebase and capability, but with local ownership and resilience. You get reuse without forced centralization.

🧪 Caveats from Experience

This pattern works when applied carefully.

1️⃣ Strong Ownership

A shared codebase still needs a clear owning team. Others can contribute, but someone must own quality, roadmap, and releases.

2️⃣ Pick the Right Capabilities

Not everything is a great fit. Something like currency conversion is well-scoped, relatively stateless, and doesn’t have unique business logic based on which platform is calling it. It’s a strong example.

But other services that have unique logic for each platform domain or require consistency across different platforms are less of a fit.

3️⃣ Operational Discipline

Using the same codebase doesn’t automatically solve all problems; you can still run into drift across platforms if each is running a different version. Changes in behavior still sometimes need coordination.

But with a single codebase, these issues are far easier to address.

💭 Final Thoughts

Microservices gave us reusable building blocks. Sometimes the best use of a microservice is not one centralized deployment. Sometimes it’s many local deployments of the same capability.

Just reuse the software while maintaining autonomy.

Originally published at https://bencane.com on April 23, 2026.

When you think of microservices, you probably think of centralized shared services was originally published in ITNEXT on Medium.

Are you using traffic mirroring in production? If not, try it out

Benjamin Cane — Thu, 16 Apr 2026 00:00:59 GMT

Photo by Rishabh Dharmani on Unsplash

Are you using traffic mirroring in production? If not, you might be missing one of the safest ways to test and observe production changes.

🚦 What is Traffic Mirroring?

Traffic mirroring in Istio or Envoy Proxy lets you send a copy of live traffic to a secondary target.

When enabled, traffic to /service routes to cluster1 as normal, and a mirrored copy is sent to cluster2.

The key: mirrored traffic is fire-and-forget. Responses are ignored and never impact the primary request.

🧪 Why It’s Powerful

1️⃣ Shadow Traffic for Safe Testing

The most common use case is shadow traffic.

When migrating platforms or deploying a new version of an application, you can send real traffic to the new system, observe behavior, and validate responses.

All without impacting users. No risky cutovers. You see exactly how the new system behaves under real load.

2️⃣ Out-of-Band Traffic Inspection

Another powerful use case is traffic inspection.

Inline inspection is risky. It adds latency, introduces new failure points, and becomes part of the critical path.

With traffic mirroring, you can inspect traffic, analyze requests, and detect anomalies.

All without impacting the primary path.

😶‍🌫️ Reality Check

It’s not perfect. There is some overhead.

Mirroring adds load to the sidecar, which may or may not be acceptable for your system. In my experience, it’s negligible, but it’s something you should measure in your own environment before deploying to production.

🧠 Final Thoughts

Traffic mirroring is one of the safest ways to validate migrations, test new systems, and observe real production behavior.

The hard part isn’t mirroring traffic. It’s running two production systems in parallel. That’s the real cost, and the real tradeoff.

But if you can afford that cost, traffic mirroring is an incredibly powerful tool.


Originally published at https://bencane.com on April 16, 2026.

Are you using traffic mirroring in production? If not, try it out was originally published in ITNEXT on Medium.

Agent Skills Are Becoming the Best Way to Capture Institutional Knowledge

Benjamin Cane — Thu, 09 Apr 2026 00:00:55 GMT

Photo by Rainhard Wiesinger on Unsplash

Use Agent Skills to capture institutional knowledge and make it usable by coding agents.

Every organization has institutional knowledge.

Internal frameworks
Preferred practices
Platform-specific capabilities

It exists everywhere. But it’s often undocumented… or buried in a wiki no one reads.

As coding agents take on more work, this problem gets worse.

If you ask an agent to build a new service, you want it to use your internal framework, follow your patterns, and respect your organizational constraints.

A human engineer would ask questions. An agent won’t, unless you give it that context.

📚 Agent Skills as Knowledge Distribution

Most people think about Agent Skills as actions:

Convert markdown to PDF
Review this pull request
Commit my changes

But the more interesting use case is guidance.

Skills aren’t just for doing things. They’re for shaping agent output.

Agents discover and use skills based on intent.

If a user asks: “Create a new Python service.”

The agent looks for relevant skills:

Language conventions (PEP 8, etc.)
Internal frameworks
Organizational standards

That’s where institutional knowledge belongs.

Instead of hoping engineers remember to tell the agent:

“We use Flask, not Django.”
“Stick to the standard library.”
“Follow this service layout.”

You capture that into a skill. The agent applies it automatically.

🧠 Why This Matters

Institutional knowledge only works if it’s:

Discoverable
Applied consistently

Agent Skills give you both.

They turn tribal knowledge into something agents can find, understand, and use.

⚠️ The Tradeoff (For Now)

Right now, this introduces duplication.

Most teams already have internal docs, style guides, and wikis.

And now you’re putting the same information into skills. Which feels like extra work.

But it poses an interesting question:

As agents become the primary interface… Will engineers read the wiki? Or ask the agent?

🧠 Final Thoughts

As agents take on more of the implementation work, where you store knowledge becomes more important. Making that knowledge accessible to agents becomes essential.

Agent Skills aren’t just automation tools.

They are becoming the interface for standards, practices, and institutional knowledge.

And teams that embrace that early will see more consistent output from both humans and agents.

Originally published at https://bencane.com on April 9, 2026.

Agent Skills Are Becoming the Best Way to Capture Institutional Knowledge was originally published in ITNEXT on Medium.

Saved Prompts Are Dead. Agent Skills Are the Future

Benjamin Cane — Thu, 02 Apr 2026 00:00:45 GMT

Photo by Onur Buz on Unsplash

Saved prompts are dead. Agent Skills are the next step.

If you’ve been around for a while, you probably have a file full of bash one-liners.

Small scripts or commands you saved because they solved a problem you didn’t want to automate properly.

When coding agents arrived, prompts became the new one-liners.

Useful prompts were saved, reused, and eventually turned into “prompt files”, then slash commands like /do-something.

But that model has already evolved.

⚙️ Agent Skills

Agent Skills are the next iteration.

At a basic level, a skill looks a lot like a saved prompt: a directory with a markdown file.

What makes it different is how it’s used.

Skills include metadata like name and description, allowing agents to discover them.

Instead of explicitly calling a prompt every time, the agent can determine when to use a skill based on intent.

This is referred to as progressive disclosure:

Agent loads skill metadata
Matches it to your task
Then loads and executes the full skill when needed
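The three steps above can be sketched in a few lines. This is only an illustration: real agents use the model itself to match a task to a skill, and the keyword overlap below is a crude stand-in. The `SKILL.md` file name, the `---` delimited metadata block, and all helper names are assumptions:

```python
from pathlib import Path
from typing import Optional


def read_metadata(skill_file: Path) -> dict:
    """Return only the `key: value` pairs between the leading --- lines."""
    meta = {}
    lines = skill_file.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return meta
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def select_skill(task: str, skills_dir: str) -> Optional[Path]:
    """Pick the skill whose description shares the most words with the task."""
    task_words = set(task.lower().split())
    best, best_score = None, 0
    for skill_file in sorted(Path(skills_dir).glob("*/SKILL.md")):
        description = read_metadata(skill_file).get("description", "")
        score = len(task_words & set(description.lower().split()))
        if score > best_score:
            best, best_score = skill_file, score
    return best


def load_skill(skill_file: Path) -> str:
    """Only after a match is the full skill body brought into context."""
    return skill_file.read_text(encoding="utf-8")
```

The point of the split is cost: metadata for every skill is cheap to keep around, while full skill bodies are loaded only when they are relevant.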

You can still call skills directly (/, $, @), but you don't always have to.

🧠 More Than Just Prompts

The real differentiator is that skills aren’t just prompts.

They can include reference documentation, templates, and scripts.

This means you’re no longer just telling the agent what to do.

You’re giving it tools and context to execute and validate tasks.

For more complex workflows, it’s often easier to write a script and teach the agent how to use it than to encode everything in a prompt.

⚠️ A Word of Caution

This power comes with risk.

Skills can include executable logic and tell agents to perform tasks.

That means a shared skill can contain malicious or unsafe behavior.

Treat them like any script you install:

Understand what they do
Know where they come from
Review before using (watch out for hidden text or obfuscated instructions)

🧠 Final Thoughts

Agent skills are a meaningful step forward.

They let you codify workflows, preferences, and repeatable agent tasks in a way that agents can discover.

They’re a strong productivity accelerator and a powerful way to capture institutional knowledge in a form agents can actually use.

(More on that in the next post.)

Originally published at https://bencane.com on April 2, 2026.

Saved Prompts Are Dead. Agent Skills Are the Future was originally published in ITNEXT on Medium.

Generating Code Faster Is Only Valuable If You Can Validate Every Change With Confidence

Benjamin Cane — Thu, 26 Mar 2026 00:00:28 GMT

Photo by Alex on Unsplash

Generating code faster is only valuable if you can validate every change with confidence.

Software engineering has never really been about writing code. Coding is often the easy part.

Testing is harder, and many teams struggle with it.

As tools make it easier to generate code quickly, that gap widens. If you can produce changes faster than you can validate them, you eventually create more code than you can safely operate.

Which raises the question: What does good testing actually look like?

🔍 What Good Looks Like

One of the biggest challenges I see is that teams struggle to understand what “good” testing means and never define it.

Pipelines are often built early in a project, when the team is small, and they rarely keep pace with the system and organization as they grow.

My starting principle is simple:

At pull request time, you should have strong confidence that the change will not break the service or platform being modified.
Within a day of merging, you should have strong confidence that the change hasn’t broken the full customer journey that the platform supports.

🔁 On Pull Request

For backend platforms, I like to see three levels of automated testing before merging.

Code Tests (Unit Tests)

This level is the foundation. Unit tests validate internal logic, error handling, and edge cases. Techniques such as fuzz testing and benchmarking also reveal issues early. As the test pyramid tells us, this is where the majority of testing and logic validation should take place.

Service-Level Functional Tests

Too many teams stop at unit tests for pull requests. Functional tests should also be run in CI for every pull request.

Services should be tested in isolation with functional tests. Dependencies can be mocked, but things like databases should ideally run for real (Dockerized).

This is where API contracts are validated and regressions can be identified without wondering whether the issue came from this change or another service.
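Here is a minimal sketch of that split between mocked and real dependencies. The `record_payment` function and the rates client are hypothetical, and an in-memory SQLite database stands in for the Dockerized database you would run in CI:

```python
import sqlite3
from unittest.mock import Mock


def record_payment(db, rates, amount, currency):
    """Hypothetical service logic: convert to USD and persist the payment."""
    usd = round(amount * rates.get_rate(currency, "USD"), 2)
    db.execute("INSERT INTO payments (amount_usd) VALUES (?)", (usd,))
    db.commit()
    return usd


def test_record_payment_persists_converted_amount():
    # Real SQL engine, standing in for a Dockerized database.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE payments (amount_usd REAL)")
    # Mocked downstream dependency with a fixed response.
    rates = Mock()
    rates.get_rate.return_value = 1.25

    assert record_payment(db, rates, 100, "EUR") == 125.0
    rates.get_rate.assert_called_once_with("EUR", "USD")
    assert db.execute("SELECT amount_usd FROM payments").fetchall() == [(125.0,)]
```

If this test fails, the cause is in this service: the downstream response is fixed, and the database behavior is real.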

Platform-Level Functional Tests

Testing a service alone isn’t enough. Changes can break upstream or downstream dependencies. Platform-level tests spin up the entire platform in CI and validate that services interact correctly.

These tests ensure the platform continues to work as a system.

For platforms with strict latency or resiliency requirements, I recommend introducing light stress tests at both the service and platform levels. These aren’t full performance tests, but they act as early indicators of performance regressions.

If these three layers pass, you should have high confidence in the change. But not complete confidence.

🌙 Nightly Testing

Some failures take time to appear.

Memory leaks, performance degradation, and cross-platform integration issues may not show up immediately.

That’s why I like to run a nightly build (or one every few hours).

The nightly environment runs end-to-end customer journey tests, performance tests, and chaos tests.

These are typically the same tests used during release validation, but running them continuously accelerates feedback. If something breaks, you learn about it early, before the pressure of a release.

🧠 Final Thoughts

There is no universal approach everyone can follow.

Different systems have different needs; mission-critical systems may focus heavily on correctness and resilience. Non-mission-critical systems may focus more on validating core functionality.

Your testing strategy depends heavily on architecture, dependencies, and operational constraints. But if your organization is increasing its ability to generate code quickly, your testing capabilities must evolve at the same pace.

AI-generated code becomes much easier to review when you already have high confidence in your testing.

Originally published at https://bencane.com on March 26, 2026.

Generating Code Faster Is Only Valuable If You Can Validate Every Change With Confidence was originally published in ITNEXT on Medium.