Writing

The Revision Problem

You'd expect that coding tasks that pass design review on the first shot would be the best ones. My data says the opposite is true.

My agentic coding pipeline is standard SDLC: plan, design, code, deploy. I have gates between each stage to verify the artifacts are correct. While reviewing the task success/fail data, I noticed something odd: tasks that failed design review and went through a revision cycle had a lower failure rate at later gates. They were better.

This is also true at the plan level: when a plan gets rejected and revised, only 22.7% of those tasks fail design review later. Plans that passed on the first try? 43.8% fail at design.

The revision gate isn't just "fix it". The agentic reviewer provides feedback about why the artifact is wrong and suggests ways to improve it. That forces the agent to "think harder" about the entire problem, and the benefit of those tokens cascades through all the downstream stages.
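The gate-plus-revision loop can be sketched like this. This is a minimal illustration, not my actual pipeline code; `run_agent` and `review_artifact` are hypothetical stand-ins for whatever calls a real pipeline would make to the agent and the agentic reviewer.

```python
STAGES = ["plan", "design", "code", "deploy"]

def run_stage(stage, context, run_agent, review_artifact, max_revisions=3):
    """Produce an artifact for `stage`, looping reviewer feedback back
    into the agent until the gate passes or we give up."""
    feedback = None
    for attempt in range(1, max_revisions + 1):
        # The agent sees the reviewer's feedback (if any) on each retry,
        # which is what forces the "think harder" pass over the problem.
        artifact = run_agent(stage, context, feedback)
        ok, feedback = review_artifact(stage, artifact)
        if ok:
            # Also report whether a revision cycle happened; per the data
            # above, revised artifacts fail less often at later gates.
            return artifact, attempt > 1
    raise RuntimeError(f"{stage} failed review after {max_revisions} attempts")
```

The key design point is that the reviewer's feedback is an input to the next attempt, not just a pass/fail signal.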

Tasks that pass on the first go, without feedback, may carry latent issues that only surface at later gates.

Of course, the gates can only check that the artifact meets the specification. They can't verify the spec matches what I actually wanted. In my own workflow, I address this by planning interactively.

I typically start planning in Claude Code, as a conversation. We do background research to understand the current context, we explore different solutions, and eventually I tell it to make the plan. When it runs the plan through the gate, I'm still at the keyboard, and it can and does escalate questions to me.

This cycle ensures the plan that ultimately passes the gate matches the intent in my head. This is where the bulk of my engineering work goes now.

The most expensive thing a pipeline can do is let bad work through early. A rejected plan that gets revised produces better code than a plan that passed on the first try.

The framework behind this: Trust Topology →