Loop · Agentic QA · 2026

Do more with less in QA.

Reduce low-value testing. Apply AI where it actually compounds. Reposition QA around quality value instead of test execution.

AI-Native QE Readiness

Is your QA team set up for success in the age of AI?

Walk the full tiered checklist. Critical prerequisites first, then the eight readiness categories. Get a gap report your team can act on this week.

Who Loop is for

QA leaders being asked to do more with less.

If any of these sound like the conversation in your head this quarter, you're in the right place.

01
“My team is shrinking, but the regression suite isn't. We're drowning in flaky tests.”
Sarah, QA Director at

Series-B fintech · ~50 engineers

We name the 30% of your suite eating 80% of CI minutes. And the move that gets it back without breaking confidence.

See the audit brief
02
“Leadership keeps asking about AI. I don't yet have a defensible answer.”
Marcus, Head of Quality at

Healthcare SaaS · 120 engineers

We separate AI leverage from AI theater. And give you the boss-ready memo before you sign another vendor contract.

See the entry course
03
“QA feels less valuable every quarter. I need to reposition the function.”
Priya, QA Director at

E-commerce platform · 200+ engineers

We turn QA from test execution into the quality intelligence layer your CTO will forward to the board.

See the operating-model reset

Names + companies anonymized at the speakers' request.

Watch · Latest

From the channel

Subscribe on YouTube · @benfellows-dev
Set Up Policy as Code in 1 Hour (Control AI Code Fast)

Apr 28, 2026

If you want to start controlling AI-generated code today, this is the simplest way I’ve found to do it. In the previous videos, I talked about why agentic development breaks at scale and introduced the concept of policy as code as a way to fix it. In this video, I’m showing how to actually get started.

The idea is straightforward. Instead of relying only on prompts, rules, or memory to guide AI, you introduce a deterministic layer that scans your codebase and flags violations. Think of it as a much more comprehensive, fully customizable linting system that works alongside tools like Claude.

What surprised me is how easy it is to get a first version working. In this walkthrough, I show how you can go from zero to a basic policy-as-code setup in a very short amount of time. We generate a small set of rules, wire up a simple scanner, and immediately run it against a real codebase. Even with a basic setup, you’ll start catching issues and inconsistencies right away.

This is not the full system I use in production. At scale, this turns into hundreds or even thousands of rules, with more advanced concepts like evidence layers, caching, and reporting. But the goal of this video is to show that you don’t need any of that to begin.

If you’re using AI to write code and you’re starting to see drift, inconsistency, or quality issues over time, this is a practical way to start putting guardrails in place. What I’ve found is that as you add more rules, the amount of drift drops significantly and the system becomes more reliable without slowing development down.

If you haven’t watched the earlier videos in this series, I’d recommend starting with those for more context on why this approach exists and how it fits into a larger agentic workflow. If you try this yourself, I’d be interested to hear what kinds of rules you end up writing and what it catches in your codebase.
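To make the "rules plus a simple scanner" idea concrete, here is a minimal sketch of what a first policy-as-code pass could look like. This is not the setup from the video; the rule names, patterns, and messages are invented for illustration, and a real rule set would be far larger and tied to your own architecture.

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    pattern: str   # regex that flags a violation when it matches a line
    message: str

# A tiny, hypothetical rule set; real setups grow to hundreds of rules.
RULES = [
    Rule("no-print-debugging", r"^\s*print\(", "Use the logger, not print()."),
    Rule("no-wildcard-import", r"^from .+ import \*", "Wildcard imports hide dependencies."),
]

def scan(source: str, path: str = "<memory>") -> list[str]:
    """Return one violation string per (rule, line) match."""
    violations = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule in RULES:
            if re.search(rule.pattern, line):
                violations.append(f"{path}:{lineno} [{rule.name}] {rule.message}")
    return violations

code = "from os import *\nprint('debug')\n"
for v in scan(code, "example.py"):
    print(v)
```

Pointing `scan` at real files and failing CI when it returns anything nonempty is enough to get the deterministic layer the video describes, before any caching or reporting machinery.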

Watch on YouTube →
I Tried Building with Agentic Factories. They Failed. Here’s What Worked Instead.

Apr 27, 2026

I spent time building with “agentic factories” - multi-agent pipelines that promise fully autonomous workflows. On paper, they look like the future. In practice, they broke down in ways that matter: reliability, coordination, and real-world constraints. In this video, I break down where these systems failed, why they fail structurally, and what actually worked instead in production. If you're building with AI agents, this will save you time (and probably some pain).

Watch on YouTube →
How We Use Policy as Code to Control Claude and AI Agents

Apr 24, 2026

Claude and other AI agents are incredibly good at writing code. The problem is they don’t stay consistent over time. In the first few iterations, everything looks great. Output is fast, patterns are mostly correct, and it feels like you’ve unlocked a new level of development speed. But as the codebase grows, small inconsistencies start to compound. Patterns drift, structure degrades, and eventually the system becomes harder to maintain than it was before. That’s the problem this video is about.

In this walkthrough, I break down how we use a concept called policy as code to control AI-generated code in real systems. Instead of relying only on prompts, rules files, or memory, we introduce a deterministic layer that enforces how code is allowed to be written. Every time an agent makes changes, those changes are checked against a large set of rules. If something doesn’t match the expected patterns, it fails. The agent has to fix it before moving forward.

This ends up acting like a much more comprehensive version of linting, but tailored specifically to your architecture, your patterns, and your codebase. The result is that we’re able to keep the speed benefits of AI while dramatically reducing drift and long-term degradation.

This video focuses on how the system works in practice: what kinds of rules we write, how they’re structured, and how they integrate into an agentic workflow using tools like Claude. If you’re experimenting with AI coding and running into issues with inconsistency or quality over time, this is one approach that has worked well for us.

I’ll also be doing follow-up videos on how to implement this from scratch and how it fits into larger agentic pipeline systems. If you’ve tried something similar or have different approaches to controlling AI-generated code, I’d be interested to hear about it.
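The "checked against rules, must fix before moving forward" loop can be sketched in a few lines. This is an illustrative skeleton, not the production system: `generate` stands in for a model call and `scan` for the rule checker, both supplied by the caller.

```python
def enforce(generate, scan, max_attempts: int = 3) -> str:
    """Ask the agent for code, scan it, and feed violations back until clean.

    generate(feedback) -> code string; feedback is None on the first try,
    otherwise the list of violations from the previous attempt.
    scan(code) -> list of violation strings (empty means the code passes).
    """
    feedback = None
    for _ in range(max_attempts):
        code = generate(feedback)
        violations = scan(code)
        if not violations:
            return code  # passes policy; the agent may move forward
        feedback = violations
    raise RuntimeError(f"still failing policy after {max_attempts} attempts: {violations}")

# Demo with stubs: the first draft violates a rule, the second passes.
attempts = iter(["print('debug')", "logger.info('debug')"])

def fake_generate(feedback):
    # Stand-in for a model call; a real agent would use the feedback.
    return next(attempts)

def fake_scan(code):
    return ["no-print-debugging"] if "print(" in code else []

print(enforce(fake_generate, fake_scan))  # → logger.info('debug')
```

The important property is that the gate is deterministic: the same code always produces the same violations, regardless of what the agent remembers or was prompted with.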

Watch on YouTube →

Track record

What Loop's last year of engagements looks like in numbers.

30+

Engagements shipped

94%

On-time releases

−42%

Avg. regression CI minutes

0

Critical escapes (last 12 mo)

Numbers reflect engagements where Loop ran the operating-model reset or the transformation sprint. See the client roster for the full case set.

Resources

Templates, calculators, and guides we use with our clients.

Drop your email and we'll send the asset. No drip funnel, no sales calendar. One email, the file, and you're done.

Template · Editable doc + Notion template

90-Day QA Leverage Plan

Coming soon

The exact week-by-week plan QA leaders use to defend headcount and prove output in a single quarter.

Template · Sheets + Looker Studio

QA Metrics Dashboard

Coming soon

Six metrics your CTO actually cares about. Escape rate, regression drag, recovery time, leverage ratio, AI yield, ownership clarity.

Template · RACI worksheet

Quality Ownership Matrix

Coming soon

Stop QA-as-bottleneck. Map every test layer to a named owner so engineering can't push everything down to your team.

Diagnostic · Personalized PDF report

QA Leverage Scorecard

Coming soon

12 questions. Honest score. Tells you whether your team is a cost center, a guardrail, or a leverage multiplier.

Diagnostic · Quarterly + annual loss model

Flaky Test Cost Calculator

Coming soon

Plug in your CI minutes, retry rate, and team size. Get the dollar figure flaky tests are costing you this quarter.
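While the calculator itself is still coming, the shape of the model is simple enough to sketch. Every rate below is an invented placeholder for illustration; the real calculator's inputs and coefficients may differ.

```python
def flaky_test_cost(ci_minutes_per_month: float,
                    retry_rate: float,
                    team_size: int,
                    ci_cost_per_minute: float = 0.008,   # assumed compute rate ($/min)
                    loaded_hourly_rate: float = 95.0,    # assumed loaded engineer cost ($/hr)
                    wait_fraction: float = 0.25,         # share of retry time someone is blocked
                    triage_hours_per_month: float = 2.0  # assumed per-engineer flake triage
                    ) -> float:
    """Rough quarterly dollar cost of flaky tests, in three buckets:
    retried compute, engineers blocked on reruns, and flake triage time."""
    retry_minutes = ci_minutes_per_month * retry_rate * 3       # minutes per quarter
    compute = retry_minutes * ci_cost_per_minute
    blocked = retry_minutes * wait_fraction * (loaded_hourly_rate / 60)
    triage = team_size * triage_hours_per_month * 3 * loaded_hourly_rate
    return round(compute + blocked + triage, 2)

# Example: 40k CI minutes/month, 12% retries, team of 8.
print(flaky_test_cost(ci_minutes_per_month=40_000, retry_rate=0.12, team_size=8))
```

Even with conservative placeholder rates, the people cost of waiting and triage tends to dwarf the raw compute bill, which is the point the calculator makes.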

Guide · 32-page PDF, 25-minute read

The QA Director's Guide to Doing More With Less

Coming soon

How to keep release safety from collapsing when your team is shrinking and your scope is growing.

Three doors

Pick the one that matches where you are.
