Loop · Agentic QA · 2026

Do more with less in QA.

Reduce low-value testing. Apply AI where it actually compounds. Reposition QA around quality value instead of test execution.

AI-Native QE Readiness

Is your QA team set up for success in the age of AI?

Walk the full tiered checklist. Critical prerequisites first, then the eight readiness categories. Get a gap report your team can act on this week.

Who Loop is for

QA leaders being asked to do more with less.

If any of these sound like the conversation in your head this quarter, you're in the right place.

01
“My team is shrinking, but the regression suite isn't. We're drowning in flaky tests.”
Sarah, QA Director at a Series-B fintech · ~50 engineers

We name the 30% of your suite eating 80% of CI minutes. And the move that gets it back without breaking confidence.

See the audit brief
02
“Leadership keeps asking about AI. I don't yet have a defensible answer.”
Marcus, Head of Quality at a healthcare SaaS company · 120 engineers

We separate AI leverage from AI theater. And give you the boss-ready memo before you sign another vendor contract.

See the entry course
03
“QA feels less valuable every quarter. I need to reposition the function.”
Priya, QA Director at an e-commerce platform · 200+ engineers

We turn QA from test execution into the quality intelligence layer your CTO will forward to the board.

See the operating-model reset

Names + companies anonymized at the speakers' request.

Watch · Latest

From the channel

Subscribe on YouTube · @benfellows-dev
Agentic development wasn't working for my large codebase. Then I implemented anchor tags

May 26, 2026

Agentic development works great until your codebase gets big. As the repo grows, the AI starts missing context. Grep gets unreliable. Planning looks detailed but quietly skips important files. Validation becomes harder. And after enough agentic coding, you end up with random orphaned code scattered throughout the codebase.

I spent months trying to solve this problem, and the thing that finally made a major difference was adding anchor tags throughout my codebase. In this video, I walk through what anchor tags are, why they help, and how I use them to make large-scale agentic development more reliable.

The basic idea: anchor tags are metadata inside the codebase that give AI a deterministic, queryable system for understanding where things live, how features connect, and what needs to be included during planning and validation. Instead of asking AI to “go research the codebase,” we can point it toward a manifest, have it query relevant anchor surfaces, and then use normal grep/search on top of a much better starting point.

This has helped me:

- Improve planning accuracy in large codebases
- Reduce orphaned and leftover legacy code
- Validate refactors with more confidence
- Link related code across services
- Connect test coverage back to product surfaces
- Give AI a better map of the repo without pretending it understands everything

I also talk through how we pair anchor tags with policy-as-code rules, why the tag system needs to stay boring, and why this only works if the metadata is enforced consistently.

This is not a perfect system, and I’m not claiming anchor tags magically solve agentic development. But for large codebases, they’ve been one of the most useful changes I’ve made. If you’re using AI coding agents on a large repo and running into context, planning, or validation issues, this is worth trying.
Topics covered:

- Why agentic development breaks down in large codebases
- What anchor tags are
- How anchor tags create deterministic codebase context
- Why AI misses things even with large context windows
- Using manifests and custom queries for planning
- Validating deprecated features and refactors
- Reducing orphan code
- Pairing anchor tags with policy-as-code
- Mapping tests to code surfaces
- Practical rules for keeping anchor tags useful

If you want the presentation or have questions about implementing this in your own codebase, drop a comment or reach out. Like and subscribe if you want more videos on agentic development, AI coding workflows, QA, automation, and building software with large language models.
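The manifest-and-query idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `@anchor:` comment syntax, the function names `build_manifest` and `query`, and the manifest shape are all hypothetical and are not taken from the video.

```python
import re
from collections import defaultdict

# Hypothetical tag syntax (an assumption): a comment containing "@anchor: <surface-name>".
ANCHOR_RE = re.compile(r"@anchor:\s*([\w.-]+)")


def build_manifest(files):
    """Map each anchor name to the sorted (path, line number) pairs that carry it.

    `files` is a dict of {path: source text}, so the sketch runs without a real repo.
    """
    manifest = defaultdict(list)
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name in ANCHOR_RE.findall(line):
                manifest[name].append((path, lineno))
    return {name: sorted(hits) for name, hits in manifest.items()}


def query(manifest, surface):
    """Return every file tagged with `surface` as a sorted, repeatable list."""
    return sorted({path for path, _ in manifest.get(surface, [])})


# Made-up example files, purely for illustration.
files = {
    "billing/invoice.py": "# @anchor: billing.invoice\ndef create_invoice(): ...\n",
    "api/routes.py": "# @anchor: billing.invoice\n# @anchor: api.public\n",
    "tests/test_invoice.py": "# @anchor: billing.invoice\ndef test_create(): ...\n",
}
manifest = build_manifest(files)
print(query(manifest, "billing.invoice"))
```

Because the lookup is a plain scan over tagged comments, the same query returns the same file list on every run, which is the property a planning or validation step can rely on before falling back to ordinary grep.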

Watch on YouTube →
Inside a Real Agentic Pipeline (Step-by-Step Breakdown)

May 4, 2026

Agentic pipelines sound great in clean demos, but what do they actually look like in production? In this video, I break down one of the real AI development pipelines I use almost every day: how it starts from a prompt, creates its own branch and worktree, runs research, builds a plan, gets reviewed by a second agent, writes failing tests, implements until green, runs policy checks, and produces receipts at the end.

I also cover what’s worked, what’s been over-engineered, where deterministic checks matter, and why “just run more agents in parallel” is not always the right answer.

Sorry for the lower-energy video, I hadn’t eaten all day before recording this one 😅

Links:

- Newsletter: https://tinyideas.ai/#newsletters
- QA work at Loop: https://www.workwithloop.com/
- LinkedIn: https://www.linkedin.com/in/ben-f-44778426/
- X: https://x.com/FellowsBen

Watch on YouTube →
Are Agentic Pipelines Actually Worth It?

May 1, 2026

Are agentic pipelines actually worth the extra time, tokens, and complexity? My honest answer: it depends.

Agentic pipelines can improve accuracy, visibility, governance, and control, but they also add real cost. They often take longer to run, use more tokens, introduce more orchestration, and create another layer of abstraction around your development process. So the question is not “do pipelines work?” The better question is: did this pipeline earn its cost?

In this video, I walk through the framework I’m using to evaluate whether an agentic pipeline is actually worth running. That includes measuring the pipeline tax, tracking run receipts, comparing quality improvements, and using a ledger system to understand whether a pipeline is making the work better or just making it more complicated.

I also share an example of a pipeline that looked good on paper but probably wasn’t worth it in practice. That’s an important part of the lesson: not every task needs a pipeline. Sometimes a single Claude Code or Codex session, guided by a strong engineer, is enough.

The goal is to use pipelines surgically. Start simple. Measure what happens. Add complexity only when the pipeline is solving a real problem. And when a pipeline gets too large, use the data to make it smaller.

If you’re experimenting with agentic development, this video is about how to think about ROI, accuracy, governance, and cost before building complex AI workflows everywhere.
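One way to picture the "did this pipeline earn its cost?" question is a small ledger comparison like the one below. It is a sketch under assumptions, not the framework from the video: the `Receipt` fields, the single-session baseline, the minutes-per-defect decision rule, and all sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Receipt:
    """One run record. Every field here is a hypothetical illustration."""
    task: str
    minutes: float
    tokens: int
    defects_found_in_review: int


def pipeline_tax(pipeline: Receipt, baseline: Receipt) -> dict:
    """Extra cost of a pipeline run relative to a single-session baseline run."""
    return {
        "extra_minutes": pipeline.minutes - baseline.minutes,
        "extra_tokens": pipeline.tokens - baseline.tokens,
        "defects_avoided": baseline.defects_found_in_review
        - pipeline.defects_found_in_review,
    }


def earned_its_cost(tax: dict, max_minutes_per_defect: float) -> bool:
    """Assumed rule: the pipeline pays off only if it avoided defects cheaply enough."""
    if tax["defects_avoided"] <= 0:
        return False
    return tax["extra_minutes"] / tax["defects_avoided"] <= max_minutes_per_defect


# Made-up numbers, purely for illustration.
baseline = Receipt("refactor-auth", minutes=20, tokens=150_000, defects_found_in_review=3)
pipeline = Receipt("refactor-auth", minutes=50, tokens=600_000, defects_found_in_review=1)
tax = pipeline_tax(pipeline, baseline)
print(tax, earned_its_cost(tax, max_minutes_per_defect=20))
```

Appending one such comparison per task gives a simple ledger, and a pipeline whose tax keeps exceeding the threshold becomes a candidate for being made smaller.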

Watch on YouTube →

Track record

What Loop's last year of engagements looks like in numbers.

30+

Engagements shipped

94%

On-time releases

−42%

Avg. regression CI minutes

0

Critical escapes (last 12 mo)

Numbers reflect engagements where Loop ran the operating-model reset or the transformation sprint. See the client roster for the full case set.

Resources

Templates, calculators, and guides we use with our clients.

Drop your email and we'll send the asset. No drip funnel, no sales calendar. One email, the file, and you're done.

Template · Editable doc + Notion template

90-Day QA Leverage Plan

Coming soon

The exact week-by-week plan QA leaders use to defend headcount and prove output in a single quarter.

Template · Sheets + Looker Studio

QA Metrics Dashboard

Coming soon

Six metrics your CTO actually cares about: escape rate, regression drag, recovery time, leverage ratio, AI yield, and ownership clarity.

Template · RACI worksheet

Quality Ownership Matrix

Coming soon

Stop QA-as-bottleneck. Map every test layer to a named owner so engineering can't push everything down to your team.

Diagnostic · Personalized PDF report

QA Leverage Scorecard

Coming soon

12 questions. Honest score. Tells you whether your team is a cost center, a guardrail, or a leverage multiplier.

Diagnostic · Quarterly + annual loss model

Flaky Test Cost Calculator

Coming soon

Plug in your CI minutes, retry rate, and team size. Get the dollar figure flaky tests are costing you this quarter.
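As a rough illustration of the kind of arithmetic such a calculator might do, here is a minimal sketch. The formula is an assumption, not Loop's model: only CI minutes, retry rate, and team size come from the description above, while the cost per CI minute, triage hours, hourly rate, and their default values are hypothetical placeholders to replace with your own figures.

```python
def flaky_cost_per_quarter(ci_minutes, retry_rate, team_size,
                           cost_per_ci_minute=0.01,
                           triage_hours_per_engineer_per_week=1.0,
                           hourly_rate=75.0,
                           weeks=13):
    """Rough quarterly cost of flaky tests: wasted CI spend plus triage time.

    Assumed model: retried CI minutes are pure waste, and each engineer loses
    a fixed number of hours per week to rerunning and triaging flaky failures.
    """
    if not 0 <= retry_rate <= 1:
        raise ValueError("retry_rate must be between 0 and 1")
    wasted_ci = ci_minutes * retry_rate * cost_per_ci_minute
    triage = team_size * triage_hours_per_engineer_per_week * weeks * hourly_rate
    return round(wasted_ci + triage, 2)


# Placeholder inputs, purely for illustration.
print(flaky_cost_per_quarter(ci_minutes=500_000, retry_rate=0.08, team_size=6))
```

In this assumed model the triage term usually dominates the raw CI spend, which is why the people-time inputs matter more than the per-minute runner price.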

Guide · 32-page PDF, 25-minute read

The QA Director's Guide to Doing More With Less

Coming soon

How to keep release safety from collapsing when your team is shrinking and your scope is growing.

Three doors

Pick the one that matches where you are.

Template

90-Day QA Leverage Plan

Coming soon