Michael Rothrock

Verification Debt

Someone recently asked me how to stop Claude from degrading their codebase. Not by writing bugs; it writes great code. They meant the slow "rot" that sets in as it chooses different patterns over time.

Some of this is just the evolution of the model: Opus 4.8 may make different choices than Opus 4.5. But we, the humans, have to live with the code over time. We establish the patterns and choose the frameworks that meet the needs of the organization. The rot often sets in when the model takes shortcuts: an odd choice instead of the framework-fluent path, or glue code to make a feature work. The build passes, but the code starts to smell.

There's an emerging name for this: verification debt.

The instinct is to add more checks over the code. More reviewers, more lint rules, more tests. This is necessary, but not sufficient. The code lints clean and the tests are green, but it still has a smell that is rooted in design. And it's hard for an output check to see a design problem.

The answer is to test the right thing: gate the intermediate artifacts in addition to the final output. The pipeline generates a plan, a design, code, and tests, and each one is checked before it moves to the next stage. Earlier stages lean more on LLM reviewers: Is this plan consistent with the existing architecture? Does the design over-engineer the task? We augment this with deterministic tests where they fit, and humans where the stakes demand it.
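Here is a minimal sketch of what gating each stage could look like. It is an illustration, not a description of any real tool: `Stage`, `Result`, `run_pipeline`, and both gate functions are hypothetical names, and the gates are simple string checks standing in for the LLM reviewers, deterministic tests, or human sign-offs a real pipeline would use.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# A gate inspects one artifact and returns a reason string when it fails,
# or None when it passes. (Hypothetical interface for this sketch.)
Gate = Callable[[str], Optional[str]]


@dataclass
class Stage:
    name: str
    produce: Callable[[str], str]  # builds this artifact from the previous one
    gates: List[Gate]


@dataclass
class Result:
    ok: bool
    artifact: str
    failed_stage: Optional[str] = None
    reason: Optional[str] = None


def run_pipeline(task: str, stages: List[Stage]) -> Result:
    artifact = task
    for stage in stages:
        artifact = stage.produce(artifact)
        for gate in stage.gates:
            reason = gate(artifact)
            if reason is not None:
                # Stop at the earliest failing stage; later stages never run.
                return Result(False, artifact, stage.name, reason)
    return Result(True, artifact)


# Stand-ins for LLM reviewers (assumption: a real gate would call a model).
def plan_fits_architecture(plan: str) -> Optional[str]:
    if "raw SQL" in plan:
        return "plan bypasses the existing data-access layer"
    return None


def design_is_proportionate(design: str) -> Optional[str]:
    if "new framework" in design:
        return "design over-engineers the task"
    return None


def make_stages(plan_text: str) -> List[Stage]:
    # The producers are stubs; in practice each would be a model call.
    return [
        Stage("plan", lambda task: f"{task}: {plan_text}", [plan_fits_architecture]),
        Stage("design", lambda plan: f"design for ({plan})", [design_is_proportionate]),
        Stage("code", lambda design: f"code for ({design})", []),
        Stage("tests", lambda code: f"tests for ({code})", []),
    ]


bad = run_pipeline("export report", make_stages("query with raw SQL"))
print(bad.failed_stage, "-", bad.reason)

good = run_pipeline("export report", make_stages("reuse the repository class"))
print(good.ok)
```

In this sketch the first run stops at the plan stage, so no design, code, or tests are ever generated from the bad plan. The second run passes every gate and reaches the end.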

At the end of the pipeline, our code checks don't have to catch plan or design issues. Those are found and fixed at the cheapest point.

Verification debt isn't debt you pay down. It's a signal that your pipeline verifies in the wrong place. The reliability was never going to come from the code. It comes from the pipeline that produced it.

The framework behind this: Trust Topology →