Harness Engineering

June 3, 2026

I keep seeing "prompt engineering" used for two completely different things, and I think it's worth pulling them apart.

One is what most people mean by it: you sit at ChatGPT or Claude, type a prompt with context, examples, and constraints, and iterate until the output is what you wanted. That's a real skill. Large models have stored vast quantities of human knowledge in their weights, and an artful prompt shines a light on the parts that are useful for the task at hand.

The other is something else entirely. You're not writing a prompt; you're building a pipeline. There are stages, and each stage produces an intermediate artifact. Sometimes an agent produces it, sometimes a deterministic process does, sometimes a blend of both. The skill lies in separating concerns between the stages, checking the correctness of each artifact as it moves through, and reasoning about where the failure modes live.
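
To make that concrete, here is a minimal sketch of what such a pipeline might look like in Python. Everything in it is an assumption made for illustration: `Stage`, `StageError`, `run_pipeline`, and the two example stages are hypothetical names, and `fake_agent_extract` is a deterministic stand-in for a real model call so the example runs offline. The point is only the shape: each stage produces an artifact, and each artifact is checked before the next stage sees it.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


class StageError(Exception):
    """Raised when a stage emits an artifact that fails its own check."""


@dataclass
class Stage:
    name: str
    run: Callable[[str], str]     # produces the next intermediate artifact
    check: Callable[[str], bool]  # validates that artifact before it moves on


def run_pipeline(stages: List[Stage], artifact: str) -> str:
    """Pass an artifact through each stage, validating after every step."""
    for stage in stages:
        artifact = stage.run(artifact)
        if not stage.check(artifact):
            # The failure is pinned to the stage that produced it.
            raise StageError(f"stage {stage.name!r} produced an invalid artifact")
    return artifact


def fake_agent_extract(text: str) -> str:
    # Hypothetical stand-in for an agent; a real harness would call a model here.
    words = text.split()
    return json.dumps({"word_count": len(words), "first": words[0] if words else ""})


def has_word_count(artifact: str) -> bool:
    # Deterministic check on the agent's output: valid JSON with an integer count.
    try:
        data = json.loads(artifact)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and isinstance(data.get("word_count"), int)


def deterministic_render(artifact: str) -> str:
    # A purely deterministic stage that consumes the validated artifact.
    data = json.loads(artifact)
    return f"{data['word_count']} words, starting with {data['first']!r}"


stages = [
    Stage("extract", fake_agent_extract, has_word_count),
    Stage("render", deterministic_render, lambda s: len(s) > 0),
]

result = run_pipeline(stages, "harness engineering is systems design")
print(result)
```

Keeping the check next to the stage that produced the artifact is one possible design choice, not the only one. Its appeal is that a bad artifact stops at the boundary where it was created, so the failure mode has an address.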

The first is prompt engineering. The second is harness engineering.

Both involve sending text to a model. But the underlying skills are different. Prompt engineering is closer to writing and rhetoric. Harness engineering is closer to distributed systems design.

A lot of the online discourse conflates the two, with the predictable result of confusion and flame wars. Writing a great prompt is hard. Designing a harness that produces reliable output from unreliable components is also hard. They aren't the same kind of hard.

Different skills. Different tools. Probably different teams.

The framework behind this: Trust Topology →