Agentic AI

The model is a commodity. The harness is the product.

Here is a test that settles most arguments about AI strategy. Take a system you are proud of and swap the model underneath it for a different vendor's. If it is built well, that takes an afternoon and almost nothing changes. Now try to swap the harness: the tools, the retrieval, the evals, the guardrails, the control loop. That is not an afternoon. That is the system.

If the model can be replaced in an afternoon and the harness cannot, then the harness was the product all along. The model is a rented commodity. Whatever you pick, an open-weight release will match it for your use case within months, at a fraction of the price. The thing you actually own, the thing that carries your advantage, is everything wrapped around it.

What the harness actually is

"Harness" is the unglamorous word for the operating layer that turns a model into a system that does real work. It is where all the engineering lives:

[Figure: demo (looked great) → pilot (stalls here) → the gap (evals · guardrails · monitoring · humans) → production (it ships)]
Most pilots die in the gap: the operating layer around the model, not the model itself.
  • Tools and permissions. What the agent can touch, through which interfaces, with what limits. Tools are connected through frameworks and protocols such as OpenClaw, Hermes Agent, LangGraph and the Model Context Protocol, and each one is scoped to least privilege.
  • Retrieval and memory. Which of your data reaches the model, when, and how it is kept fresh. This is where most quality is actually won.
  • The control loop. Planning, sub-tasks, retries, budgets and stopping conditions. The difference between an agent that converges and one that spirals.
  • Evals. A golden set of your real cases with known answers, run on every change, so quality is a graph instead of a feeling.
  • Guardrails. Deterministic checks before irreversible actions, cost and rate ceilings, and the confidence routing that sends uncertain cases to people.
  • Observability. Traces, costs and outcomes on a dashboard, so you can see what the system did and what it spent.
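To make the control-loop item concrete, here is a minimal sketch of a loop with a step cap, a cost ceiling and a retry limit. Every name in it (`Budget`, `StepResult`, `run_loop`) is hypothetical and chosen for illustration; it assumes a step signals a transient failure by raising `RuntimeError`, which a real harness would replace with its own error types.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Budget:
    max_steps: int = 8      # stopping condition: hard cap on iterations
    max_cost: float = 1.00  # cost ceiling, in arbitrary units (assumption)
    max_retries: int = 2    # retries allowed for a single failing step


@dataclass
class StepResult:
    done: bool   # True when the step reports the task is complete
    cost: float  # what this step cost
    output: str


def run_loop(step: Callable[[int], StepResult], budget: Budget) -> tuple[str, Optional[str]]:
    """Run `step` until it finishes or a budget runs out.

    Returns (status, output); status is "done", "step_limit",
    "cost_limit" or "retry_limit". Output is None unless status is "done".
    """
    spent = 0.0
    for i in range(budget.max_steps):
        retries = 0
        while True:
            try:
                result = step(i)
                break
            except RuntimeError:
                # Treat RuntimeError as a transient failure (assumption).
                retries += 1
                if retries > budget.max_retries:
                    return "retry_limit", None
        spent += result.cost
        if result.done:
            return "done", result.output
        if spent >= budget.max_cost:
            return "cost_limit", None
    return "step_limit", None


# Usage: a stand-in step that finishes on its third call.
status, output = run_loop(
    lambda i: StepResult(done=(i == 2), cost=0.1, output="ok"),
    Budget(),
)
```

The point of the sketch is that every exit path is named. An agent that stops because it ran out of budget should be distinguishable, in traces and dashboards, from one that stopped because it finished.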

Notice what is not on that list: the model. The model is one swappable component inside a much larger machine, and it is the only component you did not build.

Why the harness is the durable asset

Models converge. They leapfrog each other every few weeks, the gap between the best closed model and the best open one keeps shrinking, and any advantage you get from "we use the smartest model" evaporates the moment your competitor signs up for the same one. Betting your strategy on which model is ahead this quarter is betting on a lead measured in weeks.

The harness does the opposite. It accumulates. Every failure case you turn into an eval, every workflow you encode, every guardrail you add after something went wrong, every correction a reviewer makes, all of it compounds into an asset that is specific to your business and cannot be downloaded. Your data, your workflows and your judgment are exactly what a general model cannot copy, and the harness is where they live. That is why we tell clients the honest version: we are not selling you access to a model you could rent yourself. We are building the system that makes the model useful on your problem, and keeps it useful when the model changes.

The swap dividend

A good harness is model-agnostic on purpose, and that is not a hedge. It is a dividend. When a stronger or cheaper model ships, a well-built system adopts it by changing a configuration line and re-running the eval suite. If quality holds and cost drops, you keep it; if not, you roll back before anyone notices. Model progress, which terrifies teams who bet on one vendor, becomes a stream of free upgrades for teams who bet on the harness.
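As a rough illustration of that swap, the sketch below gates a candidate model on the golden set and on price, and keeps the current model otherwise. The names (`ModelEntry`, `pass_rate`, `pick_model`), the exact-match scoring and the flat `cost_per_call` figure are all assumptions made for the example; a real eval suite would use richer scoring and real cost data.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple

GoldenSet = List[Tuple[str, str]]  # (input, expected answer) pairs


class ModelEntry(NamedTuple):
    call: Callable[[str], str]  # any function that maps a prompt to an answer
    cost_per_call: float        # illustrative flat price (assumption)


def pass_rate(call: Callable[[str], str], golden: GoldenSet) -> float:
    """Fraction of golden cases answered exactly right (exact match is an assumption)."""
    if not golden:
        raise ValueError("golden set is empty")
    hits = sum(1 for prompt, expected in golden if call(prompt).strip() == expected)
    return hits / len(golden)


def pick_model(
    current: str,
    candidate: str,
    registry: Dict[str, ModelEntry],
    golden: GoldenSet,
    tolerance: float = 0.0,
) -> str:
    """Return the model name to write into config.

    The candidate is adopted only if quality holds (within `tolerance`)
    and cost drops; otherwise the current model stays, which is the rollback.
    """
    old, new = registry[current], registry[candidate]
    quality_holds = pass_rate(new.call, golden) >= pass_rate(old.call, golden) - tolerance
    cost_drops = new.cost_per_call < old.cost_per_call
    return candidate if (quality_holds and cost_drops) else current


# Usage with stand-in models backed by lookup tables.
golden = [("2+2", "4"), ("capital of France", "Paris")]
answers = {"2+2": "4", "capital of France": "Paris"}
registry = {
    "incumbent": ModelEntry(lambda p: answers.get(p, ""), cost_per_call=1.0),
    "challenger": ModelEntry(lambda p: answers.get(p, ""), cost_per_call=0.2),
}
chosen = pick_model("incumbent", "challenger", registry, golden)
```

Because the rest of the harness only sees a name and a callable, the "configuration line" is the string that `pick_model` returns.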

Teams that hard-wired themselves to a single model experience the same releases as migrations: risky, manual, and always overdue. Same industry, opposite sign, decided entirely by where they put their engineering.

What this means when you build or buy

If a vendor's pitch is "we use the best model," they are selling you the one component that is commoditized, rented and about to be matched. Ask instead about the harness. What is in the eval suite and who owns it? How does the system route its uncertain cases to a human? What happens on the day a better model ships: a config change, or a project? Can you take the harness with you if you leave?
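A good answer to the routing question should be simple enough to fit on a page. The sketch below shows one possible shape, under assumptions: the action names, the `Proposal` type, the 0.8 threshold and the choice to send every irreversible action to a person are all hypothetical, not a description of any particular vendor's system.

```python
from dataclasses import dataclass

# Hypothetical names for actions that cannot be undone.
IRREVERSIBLE = {"send_payment", "delete_record"}


@dataclass
class Proposal:
    action: str        # what the agent wants to do
    confidence: float  # score in [0, 1]; how it is produced is out of scope here


def route(p: Proposal, threshold: float = 0.8) -> str:
    """Return "auto" to execute the action or "human" to queue it for review."""
    if not 0.0 <= p.confidence <= 1.0:
        return "human"  # malformed score: fail closed
    if p.action in IRREVERSIBLE:
        return "human"  # deterministic check, independent of confidence
    return "auto" if p.confidence >= threshold else "human"


# Usage: a confident, reversible action runs; a payment never runs unreviewed.
decisions = [
    route(Proposal("draft_reply", 0.93)),
    route(Proposal("draft_reply", 0.41)),
    route(Proposal("send_payment", 0.99)),
]
```

The useful property is that the irreversible-action check does not depend on the model at all, so it keeps working when the model underneath is swapped.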

The next few years of this industry will not be won by whoever briefly has the smartest model. Everyone will have a smart model. They will be won by whoever built the best harness around it. Rent the model. Own the harness.

Manish Kumar, Principal Engineer, AI Systems. Builds and hardens agentic and LLM systems at extendfuture: evals, guardrails, tool permissions and the monitoring that keeps them safe in production.
