Anthropic acknowledged in late April what Claude Code users had been complaining about for weeks. The product had been silently degrading. Product-layer changes around reasoning, caching, and prompt behavior had drifted in ways the people running it did not catch in time.
That is the everyday condition of every system old enough to have its meaning live off-disk. Claude Code is young. The system around it is young. But both are now old enough to have a brownfield problem of their own.
If the people building the agent are vulnerable to that kind of drift, the people pointing the agent at twelve-year-old codebases are not safe.
Software people call the empty page greenfield. Nothing is in the ground yet. The model can invent freely because there are no old promises to keep.
Brownfield is the opposite. The system already exists. It earns revenue. Customers depend on it. Operations has worked around its scars. Finance trusts its reports. Someone once made a strange decision for a good reason, and the good reason has long outlived the person who remembers it.
Coding agents are strongest in greenfield work. Brownfield is where the tax arrives.
Somebody on your engineering team is going to test that on Monday morning. They want a win. The agent is fast. The demo will look good. They will point it at the oldest, ugliest system in the tree, because that is where the visible win lives.
Take a load forecasting model. Twenty thousand lines of Python and SQL. Also a notebook of overrides nobody opens, an Excel file emailed in 2019 that some obscure part of the process still references, and one adjustment that exists because a regulator asked for it after a storm.
The code is the visible half. The meaning lives in the rest.
That is what makes brownfield work hard for agents. Not just missing documentation. Missing reality.
A software system is a treaty. Between code and customers. Between finance and operations. Between the official process and the way work actually gets done. Between a vendor API that lies twice a year and the person who knows when to ignore it. Between the clean data model in the diagram and the old rows in the database that will never be clean.
The agent sees the text of the treaty. It may not understand the peace it preserves.
So it does what agents are good at. It makes the code cleaner. It normalizes the strange exception. It replaces a brittle path with something that looks more modern. Tests pass. The change is coherent. The merge request reads well.
Then February arrives.
The forecast is wrong in the territory where the regulator cared most. Or the invoice goes out without the manual adjustment Sarah used to catch. Nobody sees the break at merge time, because the break is not in the code. It is in the agreement the code was quietly honoring.
This is the brownfield tax.
The research is starting to catch up with the intuition. Carnegie Mellon researchers found that AI coding tools produced early speed gains that faded as quality problems accumulated. A separate large study of AI-authored commits found measurable quality issues across every assistant studied. The agent did not slow down. Review did.
That is the CTO problem. Not whether agents can change old systems. They can. The problem is whether the organization can still reconstruct why the system behaves the way it does after agents start changing it.
Most brownfield systems have three kinds of knowledge: what the system does, how work moves through it, and why the strange parts still exist. Agents are increasingly good at the first. They can often infer the second. They are weakest at the third, because the third usually lives outside the code repository.
The serious modernization work already follows this shape. IBM's mainframe modernization approach, for example, builds an application blueprint, explains COBOL, PL/I, assembler, and job-control logic in human language, traces relationships between programs and batch jobs, and validates that the transformed system still means what the old one meant. That is a different review surface: data movement, business meaning, runtime behavior, and evidence that the new thing still behaves like the old thing where it must.
Brownfield agent work outside the mainframe needs the same pattern. Before the agent changes the treaty, it has to show what treaty it thinks it is changing.
What behavior changed? What behavior was deliberately preserved? What source of truth proves the non-obvious rule?
Sometimes the source of truth is a test. Sometimes it is a customer contract, a regulator's email, a data warehouse query, or the person who has run the process for seven years. The point is to make the agent prove it has found the ground before it starts digging.
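One way to make those three questions concrete is to have the agent fill in a small record before review, and to refuse the change when a preserved rule cites no evidence. What follows is a minimal sketch, not a prescribed tool: the names `ChangeRecord` and `PreservedRule` are hypothetical, and the sample entries are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PreservedRule:
    """A non-obvious behavior the change deliberately keeps (hypothetical type)."""
    description: str
    # Where the rule comes from: a test name, a contract clause,
    # a regulator's email, a warehouse query, or a named owner.
    source_of_truth: str


@dataclass
class ChangeRecord:
    """What the agent claims about a change before anyone reviews it."""
    behavior_changed: List[str] = field(default_factory=list)
    behavior_preserved: List[PreservedRule] = field(default_factory=list)

    def missing_evidence(self) -> List[str]:
        """Return the preserved rules that cite no source of truth."""
        return [
            rule.description
            for rule in self.behavior_preserved
            if not rule.source_of_truth.strip()
        ]

    def ready_for_review(self) -> bool:
        """Ready only if something is declared and every preserved rule has a source."""
        declared = bool(self.behavior_changed or self.behavior_preserved)
        return declared and not self.missing_evidence()


# Illustrative entries only; none of these come from a real system.
record = ChangeRecord(
    behavior_changed=["Replaced hand-rolled date parsing in the forecast loader"],
    behavior_preserved=[
        PreservedRule("Post-storm regulatory adjustment", "regulator's email"),
        PreservedRule("2019 Excel override still applied", ""),
    ],
)

print(record.ready_for_review())
print(record.missing_evidence())
```

The record does not make the agent right. It makes the agent's assumptions visible, so a reviewer can see which rule has no ground under it before the merge rather than in February.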
Every CTO has heard some version of this before, usually as a plea for better documentation. They have rationally ignored it. The labor was expensive. The cost of skipping it was diffuse. Agents change both sides: they make the record cheaper to draft, and the missing evidence more expensive to tolerate.
Better models will help. They will not remove the tax.
The information that matters most was never in the training data, because it was never written down anywhere useful. A regulator's letter. Sarah's memory. A vendor's bad habit. A better model without that context is just more confident in front of an opaque system.
That is worse than slow.
Here is the contract. The agent changes the cost of writing code. It does not change what the CTO signs their name to. The agent owns nothing.
Pay the brownfield tax now, as the work happens. Or pay it later in incidents.