blogs

AI-Powered Testing Across the STLC: How SDETs Are Using Claude, Copilot, and LLMs in Every Phase

2026-05-27T22:00:00Z

How AI fits into every phase of the STLC: test planning, script generation, intelligent execution, and defect triage. Tools, human roles, and the golden rule of AI in QA.

The post AI-Powered Testing Across the STLC: How SDETs Are Using Claude, Copilot, and LLMs in Every Phase appeared first on Software Testing & Automation.

My rubber duck buddy — LLM use case - accelerate your thinking

2026-05-27T20:48:06Z

I’m a solo tester on a team that supports a vast number of products. One of my regular challenges is learning a product quickly and creating an effective test strategy around it.

To this day, I still use the one page test plan approach. I find it’s such an efficient way to organise your thinking and quickly capture what matters most.

Introduction — what the project is trying to achieve and why testing is involved
In scope — What I am going to be doing.
Out of scope — What I’m not going to test.
Risks & Assumptions
Environments & tools
Contacts & References (I have added this one as I have found it useful in my context)

For more information, see the MoTaverse article https://www.ministryoftesting.com/insights/the-one-page-test-plan

One of my core principles is to make my work visible to the team. Before that, I need to get my thinking in order. Recently I’ve been using AI tools like Copilot to help brainstorm ideas for my test strategies. I find that the challenge is often getting those thoughts out of your head and onto paper in a structured way. LLMs have helped me accelerate this process.

For example, say I’m testing a migration from one system to another. I can give the AI some context and ask about considerations, risks, assumptions, and anything else that could be overlooked.

From there I use my own experience and expertise to filter what’s relevant and pull the important points into my one page test plan.

I feel that experienced testers already have the answers somewhere in their heads. What helps is having a “rubber duck” to bounce ideas off, one that asks coaching questions or helps brainstorm possibilities.

I am concerned about those who are using AI to outsource their thinking. Here I am sharing an example of how it doesn’t replace your thinking, but helps accelerate YOUR own thinking.

Flaky Tests: The risk of dismissal without proof

2026-05-27T18:46:36Z

Patient is not how I’d describe myself. I was the kid who didn’t just look for the presents—I’d unwrap the gifts under the tree. I’d cut the tape, peek under the paper, carefully match up new tape. Then the one I was most excited about would be the last one I opened. I was able to wait until the end, knowing what it would be.

I started cheating at “surprises” early. I remember in school, playing a game where seven kids were chosen, and everyone else was meant to close their eyes and stick out their thumb. If yours was touched, when the time came, you’d have to guess who tapped you. I’d always cover my eyes with my arm and peek at the shoes running past.

I don’t like not knowing and having to guess. I like facts, data, information. I don’t like getting the answer wrong. I like being someone people see as a subject matter expert.

I have multiple ventures running right now. There’s the obvious: mom of two under five. A day job as a QA Director. And I have this newsletter. About 40k words of a novel that may never be anything. Submissions for flash fiction. A half-written memoir. Queries out for a picture book series. An app idea that splinters out into multiple possibilities to feed my creative thinking.

New things pop into my head. Interconnected, feeding off each other. But in my impatience, I want everything tomorrow. Instead of viewing them as building on each other—or even something that doesn’t have to be anything—I find myself trying to predict the future, abandoning ideas as failures, deciding it’s not how it should be.

Should is so dangerous.

A test is meant to pass or fail, given specific steps and an anticipated outcome. A flaky test sometimes passes, sometimes fails. No changes to the steps, the expectations, the underlying code or environment. Same exact test, but sometimes it’s green, sometimes it’s red.

It should work, but it doesn’t.

Flaky tests are unreliable. Untrustworthy. They give no useful signal. A critical bug could be at play in the code, but flaky tests get ignored because, hey, it passed all these times.
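
To make that definition concrete, here is a minimal Python sketch that reruns one unchanged test many times and tallies the outcomes. The `flaky_check` function and its 80% pass rate are hypothetical stand-ins for a real test, and the seed is only there so the sketch is repeatable.

```python
import random
from collections import Counter


def flaky_check(rng: random.Random) -> bool:
    """Hypothetical test body: passes about 80% of the time with no code change."""
    return rng.random() < 0.8


def rerun(test, runs: int, seed: int = 0) -> Counter:
    """Run the same test repeatedly and tally passes and failures."""
    rng = random.Random(seed)
    return Counter("pass" if test(rng) else "fail" for _ in range(runs))


def is_flaky(tally: Counter) -> bool:
    """Identical runs that produce both outcomes are the signature of a flaky test."""
    return tally["pass"] > 0 and tally["fail"] > 0


tally = rerun(flaky_check, runs=200)
print(tally, "flaky" if is_flaky(tally) else "stable")
```

A single green run says very little here; it is the tally across many runs that turns "it passed all these times" into actual data.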

At work, I’m in a leadership program where we’ve been discussing our CliftonStrengths. My top five are Achiever, Learner, Maximizer, Intellection, and Strategic. Three words repeated with the most frequency across my report: driven, intense, selective. I’m wired to succeed, to see patterns, to optimize. I’m genuinely excited learning new things in depth. I’m always calculating ROI on my investments.

I’m driven enough to have all of this in motion at once. I’m selective enough to know not all of it will make the cut. I’m intense enough to analyze my life through a quality program.

But I’m trying to make a call on the test being flaky before there’s enough data to evaluate. Achiever wants the output. Maximizer wants it to be worth it. Learner and Intellection got me into all of it simultaneously. And Strategic is stuck feeding on noise and calling it signal.

I’m not bad at committing. I’m running the wrong tool for where I am.

One of those gifts I deemed special after peeking under the tree one year was a knitting kit from my mom.

That year, I knit scarves for all my friends and boasted about how I could make one in a day. Bragging about being fast in slow fashion. But eventually I stopped, getting too impatient with the time it took, the mistakes I made.

Later in life, I tried crochet. It works up faster, the stitches have more height. The year I learned, I made a few original designs. Set up an Etsy shop and sold some.

Another venture I started on a whim. And part of me wants to point at it as a failure. Started and then what? It didn’t become a crochet pattern empire, so failure.

Except I get sales still. My first pattern was a swimsuit cover-up, and I’ve been getting notifications this month, as summer approaches, of sales and favorites. I’ve never done anything to grow it. And occasionally, my Achiever side gets to smile and say, see, you made that, and someone actually paid for it.

I make my own designs still. I don’t write them up. I can’t peek behind the paper and see if I should put my time into it. Or if I should just make things and enjoy them.

I can’t tell if I should write for me or for an audience. If I should build an app as a portfolio piece or to create a business. If I should keep moving or hold still.

I don’t have enough signal. Everything’s a flaky test — sometimes it works, sometimes it feels good, sometimes it is work, sometimes it doesn’t light me up. My strengths scream at me to keep moving, but I know better than to ignore flaky tests.

They mean something is wrong. There’s a risk when your data is unreliable, untrustworthy.

But what needs fixing — the test or the product? My definitions of success or my actual life?

I’m still investigating to find out.

🖥️ Quality Nonsense: Put the Internet Back in the Living Room

2026-05-27T16:00:00Z

I was born in 1980, hanging on the tail end of what is now called Gen-X.

I grew up with a computer in my house. My dad and brother were on it all the time. I liked playing a game where you decorated a Christmas tree, and when finished, triggered a Christmas song. I wasn’t artistic enough to use MS Paint. Beyond that, I didn’t have much use for computers until a friend of mine introduced me to the internet when we were 17. We, of course, spent hours in chat rooms talking to complete strangers. We were so naive. What a time to be alive.

If you haven’t noticed, nostalgia for the 90s is raging. From fashion to entertainment, many of us are looking back at a time when cell phones did not exist, and computers mostly stayed in the family living room. Nostalgia has a powerful pull. And, yeah, if I were going to cheerfully relish any time period, it would be the 90s. But nostalgia, by itself, isn’t useful. I try not to sit there for long (but I enjoy the music while I’m there!).

Something has been nagging me for a while. In the 90s, we had something that the majority of us would lose in the coming years, and that we haven’t been able to regain since the invention of smartphones and social media: distance. If I wanted to access the internet, I didn’t have to reach a few inches from my body. I, more than likely, walked to a different room in the house, sat at a desk, logged on to the internet, and had to search for notifications. The pings didn’t come to me. I came to them. Physical distance and time contributed to my experience on the internet. This is what I’m trying to add back into my life. And, I’m curious if this might be the key to help us regain a bit of sanity in a world overwhelmed 24/7/365.

Time

Ah, let’s talk about how long it took to get on the internet. Watching a few AOL clips online brought back some memories. Yes, there was the logging on part, but even turning on your computer took ages! Basically, to get online, you’d go to the living room, turn on the machine, get a snack, and then come back to log on to the internet. Oh, and if you were on the internet, you couldn’t use your home phone. So, if anyone needed to make a phone call, you were booted off. The restraints made every interaction feel precious, which I think is part of why we who remember feel so nostalgic for it today.

How could we mirror that today? Even though my phone is easily accessible, there are some self-imposed barriers I’ve added into my routine that have helped.

Finish important tasks before you reach for your phone in the morning. For me, that means I don’t get my phone until I have journaled, gotten dressed, and walked with my partner in the morning. I start the day with my own thoughts, and don’t get pulled by anything else except the house and the people I wake up with. After I get home and start breakfast, then I’ll get my phone, not before.

Download an app to manage or block your phone time. I use Minimalist Phone, which helps reduce screen time. Friends of mine enjoy using Brick to help manage being online. Integrating self-enforced feedback loops that help you understand and manage your screen time will give you opportunities to think about how you want to manage and use your time.

Download an app that blocks phone access while driving. Lifesaver is a good one for this. I confess that I don’t use this as much as I should. But, when enabled, it blocks all access to your phone apps except maps. Actually, I’m enabling it right now. Okay, fixed.

Distance

Recently, I heard someone say that our addiction to our phones is not the same as the kinds of addictions people experience with drugs. If you remove a substance from an addicted person, their withdrawal symptoms and their craving can last for months, even years, but take someone’s phone away, and they won’t crave their phone in the same way. They will likely experience nomophobia: the fear, anxiety, or distress associated with being without one's mobile phone or being unable to use it. We are so used to having our devices with us that we’re afraid to be without them. We are dependent on their presence. And again, the 90s can help us out here.

Recharge in different rooms. At night, put your phone in a different room from where you sleep. For me, that means the phone goes into my office before I go to bed. On my nightstand are books and anything else I want nearby when I settle in at night. It’s amazing how, after I’ve read for a while, put some lotion on my feet, and positioned myself, I have no desire to get up and look something up on my phone.

Access social media on your laptop. Yes, take social media off your phone. Feeling nervous already? I know I was. The FOMO (fear of missing out) hit hard at first. After a while, I didn’t care so much. I found other things to do because I can only feed my Hananezumi a certain number of times.

Still want to post? Use a buffer. And, yeah, I mean, like Buffer, an app that is a go-between for when you want to post on social media but don’t want to physically go to the site. So many of my social media posts never make it past the initial feed scroll because once I see what everyone else is posting, somehow what I came to post was not as good, helpful, witty, useful, or meaningful. Using Buffer, I don’t have to deal with any of those emotions. I think of something, I post to the intended platform, and log off. I’ll return a few days later to see if there were any reactions (there rarely are).

Unplug

I’m currently reading The Joy Diet by Martha Beck. One practice she offers that has helped me the most is getting quiet. Every day, I grab my coffee, my dog, a journal, and my phone (yes, my phone is helpful here). I set a timer for 20 minutes and sit in silence.

I watch the birds.

I listen to my thoughts.

I write helpful ones down.

I observe unhelpful thoughts.

I listen to the wind in the trees.

Time slows.

20 minutes feels like a long time.

20 minutes feels rejuvenating.

Yes, after a while, I get nervous and check my phone to make sure I haven’t missed the timer. And, literally, within seconds, the timer goes off. It’s taken time for me to relax and trust that the tools beside me will do their job. They do. After that 20 minutes of quiet, I take a few more minutes to jot down the thoughts I observed in the silence. I know myself better and can reflect on what I’m really feeling, without anything else muddying the waters.

This is more than a post about touching grass. This is a post meant to help us reconnect with ourselves. I don’t have good answers for you, let alone for myself. But I don’t think being online more will give us what we need. What we need is our humanity. The 90s were a messy time, and incredibly special. We can get stuck in nostalgia, or we can sift through our memories and pick out ways of living that benefit us.

And, if the dial-up internet connection sound comes back, I wouldn’t be sad about it.

Wide leg jeans forever!

Till next time…

Written with What are you pretending is okay? playing in the background

Disclaimer: The only bit of AI used in the writing of this article was my use of Grammarly. You don’t want to read what I write without it.

The Accidental Love

2026-05-27T05:39:20Z

A married man’s quiet struggle between loyalty and forbidden love unfolds through longing and unspoken emotions.

Continue reading on Medium »

PromptFoo vs OpenEval: Benchmarking LLM Test Oracles for QA Engineers in 2026

2026-05-27T04:48:18Z

A head-to-head comparison of PromptFoo and OpenEval for benchmarking LLM test oracles. Real metrics, code examples, CI/CD integration, and cost analysis for QA teams.

The post PromptFoo vs OpenEval: Benchmarking LLM Test Oracles for QA Engineers in 2026 appeared first on Software Testing & Automation.

LangChain Plus Playwright: Automating End-to-End Tests with LLM Agents in 2026

2026-05-27T04:47:46Z

Learn how to combine LangChain and Playwright into a production-grade AI test agent. Includes planner, generator, executor pipeline with TypeScript code, self-healing logic, and India salary data.

The post LangChain Plus Playwright: Automating End-to-End Tests with LLM Agents in 2026 appeared first on Software Testing & Automation.

Your Team Doesn’t Need More Developers. It Needs Better Feedback Loops.

2026-05-27T03:46:00Z

Most engineering organizations assume slow delivery is a hiring problem.

Continue reading on The Testing Hub »

AI Test Agents Explained: The Planner-Generator-Healer Architecture for QA

2026-05-27T03:44:02Z

Learn how AI test agents use the planner-generator-healer architecture to write, run, and heal Playwright tests autonomously. Full TypeScript tutorial.

The post AI Test Agents Explained: The Planner-Generator-Healer Architecture for QA appeared first on Software Testing & Automation.

CI/CD Pipeline for Playwright Tests: From GitHub Actions to Kubernetes in 2026

2026-05-27T03:40:09Z

When your Playwright suite hits 400 tests and 47-minute runtimes, GitHub Actions alone is not enough. Here is the exact architecture to scale your Playwright CI/CD pipeline into Kubernetes in 2026.

The post CI/CD Pipeline for Playwright Tests: From GitHub Actions to Kubernetes in 2026 appeared first on Software Testing & Automation.

Gatling vs k6 in 2026: Which Modern Performance Testing Tool Should You Learn?

2026-05-26T22:00:00Z

Gatling vs k6 compared for 2026. Scala DSL vs JavaScript, enterprise vs startup, protocol support, CI/CD integration, and a decision framework for your team.

The post Gatling vs k6 in 2026: Which Modern Performance Testing Tool Should You Learn? appeared first on Software Testing & Automation.

Quality Logging

2026-05-26T21:33:42Z

Over the past year I’ve moved company and experienced two very different approaches to logging, both of which I strongly dislike.

Value of logging

Some context on my opinions here. Throughout the majority of my career I have leant heavily on logs. This was most relevant when I spent 2 years in a role handling all support cases escalated to engineering. I really got to appreciate the value in being able to understand exactly WTF went wrong…

As well as leveraging them to try and understand the cause of issues, I’ve also found them extremely valuable in catching some of those “hidden” defects that can be a sign of nastiness. For example as a developer I implemented a connection method that wasn’t keeping connections alive correctly… but it was reconnecting. If I tested purely as a customer via the front end, it was fine. When I looked at the logs, I realised how nasty things were.

Subsequently as a tester I love to check the logs, even if everything looks OK. What gremlins are lurking in silence?

Another great test with logging is looking for information that shouldn’t be available. If I stick everything on debug level then can I read user passwords in plain text? As a user can I discover your implementation and tech stack through stack traces? This is especially relevant for software where the logs live on a person’s computer.
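
As a rough illustration of that kind of check, here is a minimal Python sketch that scans log text for things that shouldn’t be there. The two patterns and the sample log lines are assumptions made up for this example, not taken from any real product; you would tune them to whatever your own software must never write out.

```python
import re

# Hypothetical patterns: adjust to whatever your product must never log.
SENSITIVE_PATTERNS = {
    "password": re.compile(r"password\s*[=:]\s*\S+", re.IGNORECASE),
    # Matches the start of a Python traceback or a Java-style "at pkg.Class.method(" frame.
    "stack trace": re.compile(r"Traceback \(most recent call last\)|^\s+at [\w.$]+\("),
}


def find_leaks(log_text: str) -> list[tuple[int, str]]:
    """Return (line number, label) for every log line that matches a sensitive pattern."""
    findings = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        for label, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


sample = """\
2026-05-26 10:00:01 DEBUG login attempt user=alice password=hunter2
2026-05-26 10:00:02 INFO login ok user=alice
    at com.example.Auth.login(Auth.java:42)
"""

for number, label in find_leaks(sample):
    print(f"line {number}: possible {label} in logs")
```

A scan like this won’t replace reading the logs yourself, but it is a cheap way to flag the obvious leaks once everything is switched to debug level.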

Logging Woes

Sadly my experiences with logging have been less than ideal over the past 3 years.

In my previous role I was overwhelmed by logs, as there were hundreds of exceptions within an hour of moderate usage on top of thousands of other messages. Anything even slightly untoward at the point of implementation would throw an exception. In my current role I don’t have access to meaningful logs, if they even exist at all. I can perform an operation and it may or may not succeed, and I’ve no idea why.

Clearly logging everything that happens to a text file isn’t manageable, especially in a complex system. Even with good layering of your logging (e.g. fatal, error, warn, info, debug), I’ve been in scenarios where we’ve had debug level logging over 24 hours to try and catch a scenario but the verbosity was so much that we couldn’t cover the full timespan.

One thing that I’ve liked is when logging can be customised, for example debug level on a specific class. Even better, if you’re using a dedicated logging solution then pointing at that and filtering after the fact is fantastic.
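
For anyone who hasn’t seen customised logging in practice, here is a minimal sketch using Python’s standard `logging` module. It assumes logger names follow module names, and `app.connection` and `app.billing` are hypothetical names for this example. Python scopes loggers by name rather than by class, so this is an analogue of per-class debug levels and not the exact setup I have used.

```python
import logging


def configure_logging(debug_loggers=()):
    """Keep the whole application at WARNING, then opt named loggers into DEBUG."""
    logging.basicConfig(format="%(levelname)s %(name)s: %(message)s")
    logging.getLogger().setLevel(logging.WARNING)
    for name in debug_loggers:
        logging.getLogger(name).setLevel(logging.DEBUG)


# "app.connection" and "app.billing" are hypothetical module names.
configure_logging(debug_loggers=["app.connection"])

logging.getLogger("app.connection").debug("reconnecting after dropped socket")  # emitted
logging.getLogger("app.billing").debug("tax lookup details")  # suppressed
logging.getLogger("app.billing").warning("tax service slow to respond")  # emitted
```

The verbosity stays confined to the one area under investigation, so a long capture window is less likely to be swamped by everything else.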

Takeaway Point

Logging is important, but don’t overuse it. The most important thing when it comes to logging is to make sure you are making considered decisions.

If you’ve added extra log messages to help with (dev) testing, remove them immediately afterwards. If you’re working on a new user flow and haven’t added any logging, ask yourself how you will debug it. Is it clear who has done what? When things go wrong, can you piece together enough to understand why?

And importantly… whether you’re developing or testing a feature, when the user (inevitably?) does something completely unexpected down the line, will you be able to quickly understand what happened?

Note taking: AI is not going to save you from a poor test strategy

2026-05-26T19:12:31Z

One thing I keep thinking about is how AI is not going to save you from a poor test strategy. In fact, the consequences of an incomplete testing approach become exponentially more damaging as teams move faster with AI.

Gaps in your testing strategy will hit hard — security vulnerabilities, compliance failures, product regressions, customer dissatisfaction and more. We still see countless examples of software failing badly. AI hasn’t changed that. AI amplifies what you already have. Poor test strategy. Poor outcomes.

With AI, it feels like we want to do more complicated things and do them faster, which increases the risk of outsourcing our thinking and glossing over the fundamentals that matter.

What levels of testing are you going to do? The testing pyramid (unit, integration, end to end) is still a rough guide to get your thinking right.

Where are you going to test: API / between systems? Through logs and observability? In production? Somewhere else?

Who are your users: admin / non-admin / internal or external users?

Why are you testing: Security? Compliance? Legal risk? Customer trust? Reliability? Performance?

The people who have been in the trenches (outages, incidents, scaling problems, broken deployments) are still the people who add huge value to companies.

The technology will keep evolving. A solid testing strategy will not change that much.

How to Perform Response Verification in REST-Assured Java for API Testing: Part 2

2026-05-26T13:33:59Z

Master REST-Assured response verification in Java with Hamcrest Matchers, JSON assertions, API validations, and real-world examples.

Continue reading on Javarevisited »

8 Types of API Tests Mapped to the Right Architecture Layer: Where Each Lives and Why

2026-05-25T22:00:00Z

8 types of API tests mapped to architecture layers. Where each lives, when to run it, and what it catches. The systems-thinking approach SDET interviewers want to see.

The post 8 Types of API Tests Mapped to the Right Architecture Layer: Where Each Lives and Why appeared first on Software Testing & Automation.

No Workroom PlayTime 28 May

2026-05-25T21:43:07Z

Hi all – there's no Workroom PlayTime this Thursday. Unexpected family commitments mean that I no longer have the slot on the Thursday, there's no time to spare on the Friday, and I'm too late in the week to usefully pull it sooner.

We'll do Sitegeist as an Exploratory Interface, and will focus on using an LLM-connected browser extension to help us to explore.