Anthropic Kept Every Promise It Could Afford

Jun 09, 2026

In April I wrote that “responsible AI company” was always a market position, not a moral one. I said that at the speed Anthropic was growing, the distinction between the two was never going to survive. That was a prediction dressed as a closing line, and I half meant it as rhetoric.

Six weeks later I have the receipt, and I am not happy about it.

The last binding commitment Anthropic ever made was already gone by the time I wrote that. Between that article and this one, the company raised at a valuation that put it ahead of OpenAI, filed to go public, and then published a warning that the technology might be getting too dangerous to keep building. In that order. I predicted the destination. I did not expect to watch it drive there this fast, narrating the whole way.

So instead of another closing line: the dates, the documents, the amounts.

The promise, and what replaced it

In September 2023, Anthropic published the first version of its Responsible Scaling Policy. The document made one commitment that actually bound the company: it would pause development if its models outran its ability to keep them safe. Everything else in the policy described process. That one line described a brake. At the time, the company was valued at around $4 billion.

In February 2026, Anthropic published version 3.0 of the same policy. The brake was gone. In its place: a set of “Frontier Safety Roadmaps,” which the company describes as goals it will publish and grade itself against. The single line that could have prevented a release was replaced with one that documents it.

Anthropic did not hide this, and its chief science officer explained the reasoning to Time. Jared Kaplan said the company no longer felt unilateral commitments made sense “if competitors are blazing ahead.” He argued the change was actually a renewed commitment to safety, on the logic that one company pausing while the rest of the industry sprints does not make the world safer. He is not wrong about the logic. That is the part worth sitting with. The most safety-focused company in the industry looked at its own founding promise and removed it. Not because anyone forced them. The race made the brake a liability, and the brake came out. That is what should worry you: the reasoning was sound, no one had to lie, and the safest commitment anyone in the field had made still did not survive once it became a disadvantage.

That is the whole thesis of my April article, except now it is in their changelog rather than in my opinion column.

The numbers that arrived on schedule

The same month it revised the policy, Anthropic raised $30 billion at a $380 billion valuation. That round closed on February 12.

On May 28, the company announced a further raise: $65 billion at a $965 billion post-money valuation. The number is worth reading twice. It put Anthropic ahead of OpenAI, which sat around $852 billion, making the research lab that branded itself on caution the most valuable AI company on earth.

Anthropic confirmed on June 1 that it had submitted a confidential draft registration to the SEC for an initial public offering. In April, an IPO was rumored for October. Now it is a filing. The company is careful to say the offering depends on market conditions and that nothing is set. The document is real regardless.

Run the dates next to each other and a sequence appears on its own. The binding safety commitment existed when the company was worth $4 billion. It survived to $183 billion in September 2025. It was removed at $380 billion. The IPO paperwork arrived at $965 billion. I am not claiming the valuation caused the policy change, or that anyone sat in a room and traded one for the other. I cannot see inside the company and neither can you. I can see the dates.

The same man, the same line, seven years apart

There is one person who connects the bookends of this story, and following him is more useful than guessing at anyone’s motives.

In February 2019, OpenAI announced it had built a language model called GPT-2 that it considered too dangerous to release in full. The company withheld the complete model and let it out in stages over the rest of the year. The strategy and the public case for it came out of OpenAI’s policy team, run by its policy director, Jack Clark. Dario Amodei led the research team that built it. I credited Amodei with the decision to hold it back when I wrote about this in April. That was sloppy. He built the model; the call not to ship it was the policy team’s, and Clark made the public case for it.

Clark co-founded Anthropic in 2021.

In April 2026, Anthropic announced Mythos, a model it said could find thousands of unpatched security holes across every major operating system and browser. Too dangerous for public release. The company put it behind a limited program for around forty companies instead. An unauthorized group reached it the same day it was announced, using a contractor’s access to a third-party vendor environment. Too dangerous for the public, open on day one to anyone who found the door.

Then, on June 4, three days after the IPO filing, Anthropic published a report titled “When AI builds itself.” Marina Favaro and Jack Clark wrote it. The report says AI is now accelerating AI development, that more than 80 percent of the code the company ships is written by its own model, and that the world should preserve the option to slow down before the technology runs ahead of our ability to govern it. The slowdown it proposes is conditional. Anthropic would pause, the report says, only if competitors at the frontier verifiably did the same.

So the policy director who explained why GPT-2 was too dangerous to release in 2019 co-wrote the argument for slowing down frontier AI in 2026, three days after his company filed to go public, while that company’s own dangerous model was already out the door. I do not think Clark is cynical. He keeps arriving at the same honest concern. It keeps hitting the same competitive wall. The view from the wall has just gotten more expensive.

The conditional is the part I keep turning over. A pledge to slow down only if every other frontier lab verifiably does the same is a pledge that never has to be kept, because one of them never will. You could call that caution. To me it reads as a company turning its own broken word into a fact about the industry rather than about itself: we would hold the line; the others won’t let us. I cannot prove that this is the intent. It is only how the sentence lands on me.

What I will not claim

The easy version says the warnings are marketing, the safety reports are press releases with a different cover, and every cautionary word is timed to move a valuation. I cannot prove any of that, because it is a claim about what people intended, and intentions are the one thing a timeline cannot show you.

I do not need that version. A binding commitment alive at $183 billion was gone by $380 billion. A model called too dangerous to release shipped to forty companies and then leaked. A call to slow down arrived three days after a call to go public. None of that requires me to read anyone’s mind. It only requires me to read the dates.

For the version that does assign motive, there is no shortage of takes. TechRadar’s coverage of the slowdown report ran under the line “they want to build a moat.” A reader on my last article pointed me to an analysis arguing the Mythos warning was framing to lift the pre-IPO valuation. Those readings exist in the world. I am telling you they exist. I am not the one who has to make them for you.

The receipt

In April I said the distinction between responsible AI and a market position was never going to survive the company’s growth rate. I wrote it as a flourish and hoped, a little, to be wrong, the way I was wrong about them once before.

The prediction came due six weeks after I made it. I would have preferred to be slow.

This is not really about Anthropic, and it was not in April either. It is about what happens to any commitment that turns into a disadvantage on a vertical growth curve. The commitment loses. It does not matter how sincere the people holding it are, and it does not matter whose name is on the door. Anthropic was the clearest example I had, not the villain in the story.

The one who loses is me. Not the company that removed the brake and watched its valuation climb. The engineer who built his workflow on that brake being there. I picked the vendor, recommended it by name, called them one of the two companies that got it right, and the thing I was vouching for turned out to be a line in a policy document that got deleted in February, when the company was worth $380 billion.

My mistake was never about Anthropic. It is that I still expect principles to survive in a place where the only thing that finally counts is money. I keep building on the assumption that someone in this industry means the careful thing they say, and the chronology above is what happens to that assumption every time. The disappointment I have been describing across two articles is not in them. It is in me, for needing it to be otherwise.