Why general models keep beating specialized AI
There is a pattern in this industry that repeats often enough to plan around. A team builds a carefully specialized model for one domain: legal drafting, medical coding, support triage. It beats the general model, deservedly, and for a while it is genuinely ahead. Then the next general frontier model ships, trained on everything and nothing in particular, and matches or beats the specialist with no domain training at all. The lead that took two years to build is gone in one release.
This is not bad luck. It is the bitter lesson that Rich Sutton described for AI research in his 2019 essay of that name: general methods that leverage computation ultimately beat hand-crafted, domain-specific cleverness, and by a large margin. It is now playing out in the market for AI products, and it changes how you should buy.
Two very different things get called "specialized"
Before deciding, separate two things that wear the same label.
The first is a thin wrapper: a prompt, some fine-tuning and a nice interface on top of the exact frontier model everyone else can rent. The specialization is real today and gone the day the base model improves, because the base model is doing the work and the base model is not yours.
The second is a genuine system: proprietary data nobody else has, an encoded workflow, an eval suite tuned to the domain's failure cases, guardrails that satisfy a regulator, and human operators closing the last mile. Here the model is one swappable part, and the moat is everything around it.
Both get sold as "specialized AI for your industry." One is about to be commoditized by a general release. The other gets better every time the general models improve, because it rides them. Telling them apart is most of the buying decision.
Why the general model keeps winning
The economics are lopsided in a way that is easy to miss. The general frontier models are improved by large, heavily funded research teams, on budgets that run to billions of dollars, against a broad range of benchmarks at once. A vertical model is improved by one company, on one budget, against one domain. Every few months the general side ships another release and resets the comparison. You are not betting against this quarter's vertical model; you are betting against the entire frontier's compounding rate, and that is not a bet the specialist wins for long.
There is a technical reason too. A great deal of what looked like it needed domain training turns out to be reachable by a general model given the right tools, the right retrieved context and a good harness. Retrieval over your documents plus a frontier model can often match a domain fine-tune, at lower cost, and it picks up most of the next model's gains without retraining. Specialization bought with data plumbing beats specialization baked into weights, because plumbing is portable across models and weights are not.
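To make "plumbing is portable" concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Model` type, the toy keyword retriever, the sample documents and the two stand-in model functions are hypothetical, and a real system would call a vendor API and use a proper search index. The point is only that retrieval and prompt assembly never mention a specific model, so swapping models is a one-argument change.

```python
import re
from typing import Callable, List

# Hypothetical type: any function that maps a prompt to a completion.
Model = Callable[[str], str]


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Toy retriever: rank documents by the number of words shared with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, context: List[str]) -> str:
    """Assemble the retrieved context and the question into one prompt."""
    joined = "\n".join(f"- {item}" for item in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


def answer(query: str, documents: List[str], model: Model) -> str:
    """The plumbing: retrieve, build the prompt, call whichever model is passed in."""
    return model(build_prompt(query, retrieve(query, documents)))


# Stand-ins for two model generations (assumptions, not real APIs).
def model_v1(prompt: str) -> str:
    return f"v1 answered using {prompt.count('- ')} context items"


def model_v2(prompt: str) -> str:
    return f"v2 answered using {prompt.count('- ')} context items"


DOCS = [
    "Refund requests over 500 dollars need manager approval.",
    "Shipping takes five business days.",
    "Refund requests must be filed within 30 days.",
]

if __name__ == "__main__":
    question = "Who gives approval for refund requests?"
    print(answer(question, DOCS, model_v1))
    print(answer(question, DOCS, model_v2))  # same plumbing, newer model
```

In this sketch the documents and the retrieval logic are the part you own. A fine-tuned model would have the same knowledge baked into weights that must be retrained for each new base model.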
When specialized is the right buy
None of this says avoid vertical products. It says know what you are paying for. Buy the specialist when the advantage is something a general model cannot conjure:
- Proprietary data. A corpus, a labeled history, a feedback loop nobody else has. Models do not come with your data.
- An encoded workflow. The system-of-record integrations, the exceptions, the approvals, the operating model. The general model can reason; it cannot know how your business actually runs.
- Compliance scaffolding. Evals, audit trails and controls tuned to a regulated domain, which are expensive and slow to build and rarely the thing a horizontal tool bothers with.
- Human operations. Trained reviewers who handle the cases the model should not, and turn every correction into training data.
Every item on that list is a harness or a dataset or an operation. None of it is "we trained a special model." That is the tell.
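One way to see why an eval suite counts as harness and not model: it can be written so that it scores any model at all. The sketch below is a simplified illustration under stated assumptions. The `EvalCase` structure, the substring-based pass check, the three support-triage cases and both stand-in models are hypothetical; a real domain suite would be far larger and would grade answers more carefully.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type: any function that maps a prompt to a completion.
Model = Callable[[str], str]


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    must_contain: str  # substring a passing answer has to include


def pass_rate(model: Model, cases: List[EvalCase]) -> float:
    """Fraction of cases whose answer contains the expected substring."""
    if not cases:
        raise ValueError("eval suite is empty")
    passed = sum(
        1 for case in cases
        if case.must_contain.lower() in model(case.prompt).lower()
    )
    return passed / len(cases)


# Toy support-triage cases (illustrative only).
CASES = [
    EvalCase("Customer says the invoice total is wrong.", "billing"),
    EvalCase("App crashes on login.", "technical"),
    EvalCase("Customer wants to cancel the subscription.", "retention"),
]


# Stand-ins for an older and a newer model (assumptions, not real APIs).
def old_model(prompt: str) -> str:
    return "technical"


def new_model(prompt: str) -> str:
    text = prompt.lower()
    if "invoice" in text:
        return "billing"
    if "cancel" in text:
        return "retention"
    return "technical"


if __name__ == "__main__":
    print(f"old model: {pass_rate(old_model, CASES):.2f}")
    print(f"new model: {pass_rate(new_model, CASES):.2f}")
```

Because the cases live outside any model, the same suite can score a vendor's product and a plain general model side by side, and can be rerun unchanged when the next release ships.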
How to buy accordingly
When a product is pitched to you as specialized AI, ask one question in four forms. What happens to your product when the next general model ships: does it get better, or does it get eaten? Is the moat the model, or the data and workflow around it? Could a competent team assemble the same wrapper on a general model in a quarter? And if the value is real, is it the kind that compounds, or the kind that resets?
If the answers point to a thin wrapper, do not pay a premium for a lead the next release will erase; build the same thing on the general frontier and keep the savings. If they point to real data, a real workflow and a real operation, that is worth paying for, because that is the part no model can copy. The durable question was never which model is most specialized. It is who owns the system around it.