<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[ From the Trenches]]></title><description><![CDATA[
Practical engineering management lessons learned in the trenches of scaling tech teams.]]></description><link>https://techtrenches.dev</link><image><url>https://substackcdn.com/image/fetch/$s_!mIde!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd575dda4-fcd3-44ee-96f7-2fa1cb11cefa_600x600.png</url><title> From the Trenches</title><link>https://techtrenches.dev</link></image><generator>Substack</generator><lastBuildDate>Fri, 19 Jun 2026 22:32:06 GMT</lastBuildDate><atom:link href="https://techtrenches.dev/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Denis]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[techtrenches@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[techtrenches@substack.com]]></itunes:email><itunes:name><![CDATA[Denis Stetskov]]></itunes:name></itunes:owner><itunes:author><![CDATA[Denis Stetskov]]></itunes:author><googleplay:owner><![CDATA[techtrenches@substack.com]]></googleplay:owner><googleplay:email><![CDATA[techtrenches@substack.com]]></googleplay:email><googleplay:author><![CDATA[Denis Stetskov]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Nobody Answers for the Lie They Sold]]></title><description><![CDATA[Tech CEOs called engineers obsolete, told kids not to code, promised the world. None of it landed, none of them paid. 
A view from the hiring side of the wreck.]]></description><link>https://techtrenches.dev/p/nobody-answers-for-the-lie-they-sold</link><guid isPermaLink="false">https://techtrenches.dev/p/nobody-answers-for-the-lie-they-sold</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Thu, 18 Jun 2026 19:27:08 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/9d46178d-72f5-481b-91a8-161a0d242157_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0sjC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb43b9f88-9e38-42ab-8c60-ee9ebb86f09d_2528x1686.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0sjC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb43b9f88-9e38-42ab-8c60-ee9ebb86f09d_2528x1686.png 424w, https://substackcdn.com/image/fetch/$s_!0sjC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb43b9f88-9e38-42ab-8c60-ee9ebb86f09d_2528x1686.png 848w, https://substackcdn.com/image/fetch/$s_!0sjC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb43b9f88-9e38-42ab-8c60-ee9ebb86f09d_2528x1686.png 1272w, https://substackcdn.com/image/fetch/$s_!0sjC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb43b9f88-9e38-42ab-8c60-ee9ebb86f09d_2528x1686.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!0sjC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb43b9f88-9e38-42ab-8c60-ee9ebb86f09d_2528x1686.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b43b9f88-9e38-42ab-8c60-ee9ebb86f09d_2528x1686.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6484280,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/202616728?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb43b9f88-9e38-42ab-8c60-ee9ebb86f09d_2528x1686.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0sjC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb43b9f88-9e38-42ab-8c60-ee9ebb86f09d_2528x1686.png 424w, https://substackcdn.com/image/fetch/$s_!0sjC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb43b9f88-9e38-42ab-8c60-ee9ebb86f09d_2528x1686.png 848w, https://substackcdn.com/image/fetch/$s_!0sjC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb43b9f88-9e38-42ab-8c60-ee9ebb86f09d_2528x1686.png 1272w, https://substackcdn.com/image/fetch/$s_!0sjC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb43b9f88-9e38-42ab-8c60-ee9ebb86f09d_2528x1686.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>People get laid off, then they find new work. Maybe it takes longer than it used to. The market is hard, but the market digests. That was the shape of the problem in my head. Painful, survivable, done.</p><p>I&#8217;ve spent almost a year pushing back on the replacement story, and I still think it&#8217;s a lie. The read came from somewhere narrower than belief. It came from where I was standing. I didn&#8217;t get cut. None of my people got cut. And the whole time, I was hiring. Our last round pulled 2,253 candidates for a handful of seats. 
From a chair that two thousand people are walking toward, &#8220;they&#8217;ll land somewhere&#8221; isn&#8217;t optimism. It&#8217;s what you watch happen every day. That chair is a window, and through it the broken market still looks like a working one.</p><p>An engineer answered one of my posts. He was standing at the other window: senior, twenty-five years in, a hundred days of runway left. His view: nobody hires, layoffs keep coming but nobody fills the seats, everyone frozen like a rabbit in headlights. Juniors get called useless. He told me, plainly:</p><div class="pullquote"><p>You are not in the street seeing what I see or speaking with the unemployed devs as I am. Don&#8217;t throw optimism at people whose expertise got devalued by months of hype. You can&#8217;t eat the future. The damage is already here.</p></div><p>He was right, and the thing he was right about is narrow and exact. Not the mechanics. I still think the market is irrational and irrational markets correct. I told him as much, and I didn&#8217;t take it back. What I took back was smaller and worse to admit: I had been measuring his catastrophe with the ruler of my own untouched life. I knew people were getting laid off. I did not feel the weight of it, because the weight wasn&#8217;t landing on me. Two people can look straight at the truth and see opposite things, depending on which side of the glass they&#8217;re standing on. Each of us assembles reality from where we happen to stand, and mistakes it for the thing itself. I stand somewhere specific too. I was born where a lie had a price, and that makes a market with no price for lying hard for me to read as normal.</p><h2>The present I missed</h2><p>One story shook me. Kyle Simpson wrote &#8220;You Don&#8217;t Know JS.&#8221; He taught a generation how the language actually works. A few weeks ago he was on LinkedIn asking for warm intros. Not job-board links. 
Warm connections to people he could actually talk to.</p><p>When a Kyle Simpson has to ask LinkedIn for an introduction, this isn&#8217;t skill losing its value. It&#8217;s a market that believed a lie and seized up. What matters is that a person of that caliber gets spit out at all.</p><p>He is one face, behind him there&#8217;s a number, and I could have read it any time.</p><p>New software engineering job postings fell 15% in the first two months of 2026. Software developers aged 22 to 25 saw employment drop nearly 20% from their late-2022 peak. Tech cut roughly 52,000 jobs in the first quarter of 2026 alone, the worst opening quarter since 2023. Close to 900,000 tech workers gone since 2020. IBM tripled its entry-level hiring this year, the only company moving the other way, which tells you the rest had a choice and made the other one.</p><p>The freeze works from both ends. Juniors never get in, and the people who taught them get walked out. A senior with twenty-five years and a Kyle Simpson land in the same place, because a market that stops believing in the skill stops believing in it at every level.</p><h2>The people who built it</h2><p>What the executives did was a different thing. They sold the replacement on purpose, through the loudest microphones in the industry.</p><p>Sam Altman, March 2025: &#8220;At some point, yeah, maybe we do need less software engineers.&#8221; Mark Zuckerberg, January 2025: AI would be &#8220;a mid-level engineer&#8221; at Meta inside the year. Marc Benioff, February 2025: &#8220;We&#8217;re not going to hire any new engineers this year.&#8221; Dario Amodei, March 2025: AI would write 90% of code within three to six months, and within twelve months &#8220;essentially all of the code.&#8221;</p><p>Read the hedges. &#8220;Maybe.&#8221; &#8220;We may be in a world where.&#8221; The qualifiers sat in the transcripts and were gone by the time the claim reached a layoff memo. 
The speculation went up on stage wearing a hedge and came down as a fact. I can&#8217;t prove intent, but the pattern reads one way to me: the unhedged version moved the stock, so the unhedged version is the one that left the building.</p><p>It&#8217;s mid-2026. At my own company AI now writes most of the code, and it still hasn&#8217;t made the engineer optional. It&#8217;s done the opposite. Code that writes itself needs a human to judge it and validate it, or it turns into the wreck I&#8217;m about to show you. The writing was never the bottleneck; the judgment was. Their industry-wide version didn&#8217;t even land: independent reads put AI at about half the code even inside the labs making the loudest claims. Meta did not swap its mid-level engineers for a model. It cut thousands of these people while spending billions on different ones.</p><h2>Two ends, one result</h2><p>Code is solved. Generate it, ship it, the engineer is overhead. That&#8217;s what they were selling.</p><p>I audit that claim for a living. Recently a client brought us an application built start to finish by someone with no engineering background. Pure vibe code. They weren&#8217;t live yet. They came with one question: can we launch?</p><p>The thing worked: the features ran in the demo. Zero UI, zero UX, but it ran. That&#8217;s the giveaway before you open a single file. Then we opened the files.</p><p>Inside: zero type safety, every variable and every API response untyped. No input validation anywhere. User input flowed straight into database queries, guarded by nothing but if statements and browser alert popups. No tests: the one test file still searched for the framework&#8217;s starter-template placeholder. The production database password committed to the repo in plaintext. Passwords stored in client-side JavaScript, readable by anyone who opens dev tools. 
Authorization that was pure theater: roles enforced by hiding buttons, bypassable from the browser console in under a minute. Cross-site scripting in several places, with one component escaping its output correctly and the rest not, which is the signature of code assembled without anyone in charge of how it fits together. One 3,000-line source file. Twelve of twenty components past any reasonable size. Business logic duplicated across six files. Dead components wired to nothing. A 10,000-line CSS file with class names like phase14-, phase17-, phase18-, the archaeological layers of one prompt stacked on the last, nothing ever refactored.</p><p>Our verdict was no-go. The only thing standing between that app and a breach in a regulated industry was the review.</p><p>Zero expertise, predictable result.</p><p>Anthropic. The best-paid engineers on the planet, effectively unlimited compute, the company that sells the coding tool. Their own product shipped a single 3,167-line function with zero tests, already in production, serving a revenue stream in the billions. I took <a href="https://techtrenches.dev/p/the-snake-that-ate-itself-what-claude">their engineering culture</a> apart when the source leaked.</p><p>Maximum expertise and zero expertise, opposite ends of the spectrum, and the code rotted the same way at both. If code were solved, the model would have carried the amateur up toward the expert&#8217;s level. Instead it dragged both down to the same floor. The tool didn&#8217;t decide the outcome, the discipline on top of it did, and at both ends there wasn&#8217;t any.</p><p>The amateur asked permission and got told no. The experts asked no one and shipped to production. One of those two apps is live.</p><h2>Somebody is on the hook</h2><p>My code is already written almost entirely by AI, and it changed nothing about the part that matters. Someone still answers for the result, and &#8220;the AI merged it&#8221; has never been a defense anyone accepted. 
The work the executives called dead is the one thing standing between &#8220;it runs&#8221; and &#8220;it leaked.&#8221; I&#8217;ve written before about why that accountability <a href="https://techtrenches.dev/p/the-comprehension-extinction-ai-isnt">can&#8217;t be delegated</a>.</p><p>I was right that the work doesn&#8217;t vanish. I was wrong to think that because it doesn&#8217;t vanish, no damage was done. The people who do it got thrown out anyway, not because the work stopped needing them, but because someone sold a story that it did, and the bill for believing that story lands on whoever shipped on the strength of it. The engineer is still the bottleneck, and the engineer is still on the street. I missed the second half because it wasn&#8217;t happening to me.</p><h2>Nobody paid</h2><p>Altman now says he&#8217;s &#8220;delighted to be wrong,&#8221; that the disruption he forecast hasn&#8217;t shown up the way he expected.</p><p>Amodei pivoted to the Jevons paradox, the same 90% reframed as proof that productivity expands to fill the gap.</p><p>Benioff went from promising &#8220;radical augmentation&#8221; to cutting thousands with &#8220;I need less heads&#8221; inside a single month.</p><p>Jensen Huang told a generation in 2024 to stop learning to code and go into farming or biology instead. By 2026 he was calling the idea that AI reduces engineering jobs &#8220;complete nonsense&#8221; and saying the world needs a trillion lines of code. The kids who took the first advice didn&#8217;t enroll. Nobody has explained where the expertise is supposed to come from now.</p><p>No retraction. No correction filed under the same name that made the claim. The forecast that froze the market just got swapped for a cheerier one.</p><p>That&#8217;s the missing institution. There is no reputational cost in this industry for being wrong at full volume.</p><h2>The King of Bullshit</h2><p>The clearest proof of it isn&#8217;t even in this story. 
Take the man who built a career on deadlines that never arrive. Full self-driving &#8220;next year,&#8221; every year since 2015. A million robotaxis on the road by 2020. A million people on Mars, first crewed mission in 2024. Hyperloop, a &#8220;fifth mode of transport,&#8221; working lines within a few years of 2013. Brain implants in trials by 2020. None of it landed on time, most of it not at all, and by 2024 he was on a stage admitting &#8220;I tend to be a little optimistic with time frames&#8221; while announcing the next one in the same breath. The latest one, my favorite: in January 2026 he told Davos the cheapest place to run AI would be space &#8220;within two years, maybe three at the latest,&#8221; and filed to put a million data-center satellites in orbit, days before merging two of his companies into a $1.25 trillion entity headed for an IPO. The deadline is the product. Every missed date made him richer. He is, as of this writing, the wealthiest person who has ever lived. A forecast that never lands isn&#8217;t a debt in this industry. It&#8217;s a marketing budget, and somebody else pays it.</p><h2>Nothing clears</h2><p>You can call an entire profession obsolete. You can tell a generation not to learn to code. You can promise data centers in space within two years, days before your IPO. You can do all of it, watch the freeze land on real people, and pay nothing. The careers don&#8217;t un-break, the valuations went up. The same vendors now <a href="https://techtrenches.dev/p/when-your-vendor-becomes-your-competitor">sell AI supervision</a> as a service, packaging the exact work they spent years calling dead. A senior engineer with twenty-five years behind him counts how many days of money he has left. They count how many points the stock is up since they called him overhead. The difference between them isn&#8217;t talent and it isn&#8217;t honesty. 
It&#8217;s altitude, and altitude works so that no one up there ever has to look down at the person they wrote off as a cost.</p><p>And there&#8217;s a worse version than no one paying. The correction that engineer and I both want may never come, not because the story turned out true, but because too much money is riding on it to let the tower fall. Capital that size doesn&#8217;t admit it was wrong. It props the thing up. That&#8217;s slower and quieter than a crash, and worse, because nothing ever clears.</p><h2>Resist</h2><p>He signed off with one word, twice. Resist.</p><p>From the height of the future you can&#8217;t see the rotten code or the broken people. You see them with your hands inside the machine, or you don&#8217;t see them at all. I didn&#8217;t, for a while, because I was standing at the one window where the damage doesn&#8217;t reach. A man with a hundred days of runway walked me to the other window in an afternoon.</p><p>I come from a place where if you lie and get caught, you can get punched in the mouth for it. Where a word costs something the moment it leaves your mouth. These men built an industry where you can lie to the whole world, break other people&#8217;s careers at full volume, and grow richer for it. I can&#8217;t tell you how to build a reputation system where there isn&#8217;t one. I just know what it looks like when it works, because I grew up inside one that did.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item><item><title><![CDATA[When Your Vendor Becomes Your Competitor: AI’s $5.5B Confession]]></title><description><![CDATA[OpenAI launched a $4B consulting company. Anthropic launched a $1.5B JV. 
Both sell human supervision for their own models to enterprises that can't deploy AI on their own.]]></description><link>https://techtrenches.dev/p/when-your-vendor-becomes-your-competitor</link><guid isPermaLink="false">https://techtrenches.dev/p/when-your-vendor-becomes-your-competitor</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Tue, 16 Jun 2026 14:01:28 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/17b47774-3964-46b5-a46a-0bf7bc3238ec_1016x680.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_9Qd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb245c88b-2c0c-462a-a8f1-43867a8244dd_1600x1520.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_9Qd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb245c88b-2c0c-462a-a8f1-43867a8244dd_1600x1520.png 424w, https://substackcdn.com/image/fetch/$s_!_9Qd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb245c88b-2c0c-462a-a8f1-43867a8244dd_1600x1520.png 848w, https://substackcdn.com/image/fetch/$s_!_9Qd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb245c88b-2c0c-462a-a8f1-43867a8244dd_1600x1520.png 1272w, https://substackcdn.com/image/fetch/$s_!_9Qd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb245c88b-2c0c-462a-a8f1-43867a8244dd_1600x1520.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!_9Qd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb245c88b-2c0c-462a-a8f1-43867a8244dd_1600x1520.png" width="1456" height="1383" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b245c88b-2c0c-462a-a8f1-43867a8244dd_1600x1520.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1383,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:174862,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/199628393?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb245c88b-2c0c-462a-a8f1-43867a8244dd_1600x1520.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_9Qd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb245c88b-2c0c-462a-a8f1-43867a8244dd_1600x1520.png 424w, https://substackcdn.com/image/fetch/$s_!_9Qd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb245c88b-2c0c-462a-a8f1-43867a8244dd_1600x1520.png 848w, https://substackcdn.com/image/fetch/$s_!_9Qd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb245c88b-2c0c-462a-a8f1-43867a8244dd_1600x1520.png 1272w, https://substackcdn.com/image/fetch/$s_!_9Qd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb245c88b-2c0c-462a-a8f1-43867a8244dd_1600x1520.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Sam Altman and Dario Amodei spent two years predicting AI would displace much of the workforce. Both hedged with talk of augmentation and regulation, but displacement was the headline, and it shaped enterprise expectations, investor theses, and <a href="https://techtrenches.dev/p/ai-wont-save-us-from-the-talent-crisis">hiring decisions</a> across the industry.</p><p>Then, in May 2026, OpenAI launched a $4 billion company that sells human supervision of its own models. To the same enterprises. For the same work thousands of independent engineering firms were already doing. 
Anthropic did the same thing a week earlier, more quietly, for $1.5 billion.</p><p>They didn&#8217;t build a better model. They built a services firm. And the $5.5 billion they spent is the clearest admission yet that enterprises can&#8217;t deploy AI without humans holding their hand.</p><h2>The Vendor Ate the Ecosystem</h2><p>OpenAI&#8217;s <a href="https://openai.com/index/openai-launches-the-deployment-company/">Deployment Company</a> launched May 11 with $4 billion from 19 investors led by TPG, including Goldman Sachs, McKinsey, and Capgemini. Its first hire: 150 forward-deployed engineers from Tomoro, a London consultancy acquired the same day.</p><p>Forward-deployed engineers sit inside your company and make AI do what the demo promised. Palantir <a href="https://newsletter.pragmaticengineer.com/p/forward-deployed-engineers">pioneered the role</a> in the early 2010s because its software didn&#8217;t deploy itself either. The business model works. Palantir&#8217;s stock has <a href="https://www.macrotrends.net/stocks/charts/PLTR/palantir-technologies/stock-price-history">returned over 1,200%</a> since 2020 on exactly this premise: enterprise software needs permanent human support.</p><p>So copying Palantir isn&#8217;t the problem. The problem is that OpenAI&#8217;s CEO spent two years saying the model makes this work obsolete, then built a $4 billion company to do it.</p><p>The structure gives away the confidence level. DeployCo&#8217;s investors collectively sponsor 2,000+ businesses, a captive market built into the cap table, and OpenAI guaranteed them a 17.5% annual return. That&#8217;s standard private-equity plumbing, except the asset generating the yield is a supervision business, not a software license.</p><p>Every AI services firm that built on OpenAI&#8217;s API now competes against a subsidiary of its own vendor, one that sees the model roadmap first and has thousands of clients pre-sold through the cap table. 
Axios&#8217;s Dan Primack <a href="https://www.axios.com/2026/05/11/openai-deployco-private-equity">caught the irony</a>: McKinsey and Capgemini invested in DeployCo while competing with it. They funded their own disintermediation, or bought a hedge against it. Either way, they&#8217;re inside the tent.</p><p>AWS and Salesforce competed with their partners too, but offered margin sharing and co-sell programs to keep the ecosystem alive. DeployCo is built as a competitor, not a platform. OpenAI went from tool vendor to rival services firm in eighteen months.</p><p>Then it kept going. A month after DeployCo, OpenAI <a href="https://openai.com/index/openai-to-acquire-ona/">acquired Ona</a>, which runs AI agents securely inside an enterprise&#8217;s own cloud. One acquisition makes a product. Two makes a pattern. OpenAI is buying its way into the deployment layer because the models alone don&#8217;t land there.</p><h2>The Middleman They Were Supposed to Kill</h2><p>Anthropic didn&#8217;t build its own engineering arm. It rented everyone else&#8217;s.</p><p>PwC <a href="https://www.anthropic.com/news/pwc-expanded-partnership">signed on May 14</a>: Claude across 364,000 employees, 30,000 to be certified. KPMG <a href="https://www.anthropic.com/news/anthropic-kpmg">followed five days later</a> with &#8220;Digital Gateway Powered by Claude&#8221; for 276,000 people in 138 countries, plus a product for modernizing IT at private-equity portfolio companies.</p><p>These are auditing firms. Tax prep, compliance, M&amp;A advisory, the exact white-collar work every pitch deck promised to automate. 
Instead of being replaced, they became the sales channel. Enterprises don&#8217;t trust Claude. They trust KPMG. So KPMG stamps &#8220;Powered by Claude&#8221; on its existing workflow and bills both ends: Anthropic for distribution, clients for the supervision.</p><p>You could argue this is augmentation working as designed, and you might be right. PwC reported <a href="https://www.prnewswire.com/news-releases/anthropic-and-pwc-expand-alliance-driving-impact-across-client-work-and-the-firm-302772321.html">up to 70%</a> faster delivery on client work. But augmentation isn&#8217;t what got the funding and the keynotes. Displacement was the pitch, and the gap between that pitch and a Big Four distribution deal is the whole story.</p><p>The model didn&#8217;t cut out the middleman. It entrenched it.</p><p>The pattern is industry-wide. By the end of May, every major lab had bought its way into someone&#8217;s services arm: EY went with Microsoft, Google funneled $750 million into Accenture, Deloitte, and Capgemini. Embedded enterprise engineers aren&#8217;t new; Microsoft and AWS have fielded them since 2017. What&#8217;s new is the model vendors muscling into the same work, on top of a services layer that already existed.</p><p>What all this money buys is a human signature on a risk assessment. Confidence-as-a-service, priced like infrastructure.</p><p>And the bill never stops. When the deployment never gets cheaper from one engagement to the next, you didn&#8217;t buy a capability. You bought a dependency. Forward-deployed engineers are the invoice for making AI real.</p><h2>The Contradiction That Funds Itself</h2><p>Back in 2024, Amodei predicted 50 million genius-level AI entities and <a href="https://darioamodei.com/machines-of-loving-grace">half of white-collar jobs</a> gone in five years, then doubled down in a 20,000-word essay in January 2026. 
I <a href="https://techtrenches.dev/p/the-country-of-geniuses-that-doesnt">took it apart</a> then: the knowledge required to supervise AI is the same knowledge that makes you irreplaceable.</p><p>Four months later, the money confirmed it. While Altman&#8217;s people were buying a London consultancy, our CTO was staring at a queue of vibe-coded apps that HR, PMs, and even our own CEO had built and now wanted shipped to production. That queue is the real shape of AI in the enterprise: not replacement, but a backlog of half-working software that needs a human to make safe. On May 26, Altman told a Sydney audience he&#8217;d been <a href="https://www.business-standard.com/amp/world-news/ai-unlikely-to-lead-to-jobs-apocalypse-says-openai-ceo-sam-altman-126052600707_1.html">&#8220;pretty wrong&#8221;</a> about AI and jobs, and said he was &#8220;delighted to be wrong.&#8221; More honesty than most CEOs offer. But his company had spent $4 billion two weeks earlier, and that says more than any interview.</p><p>Then Amodei went further. In a <a href="https://darioamodei.com/post/policy-on-the-ai-exponential">June policy essay</a>, he proposed that if AI permanently kills demand for labor, governments may need universal basic income funded by taxes on AI companies. Read that again. The same person, the same company that has spent eighteen months selling labor replacement now wants a tax-funded safety net for the moment that replacement lands. He&#8217;s pricing in the cleanup before the spill. So the same month Anthropic spent $1.5 billion hiring humans to deploy its models, its CEO floated paying the displaced out of a tax on the firms doing the displacing.</p><p>Either way, the math doesn&#8217;t reconcile. If the models replace workers, you don&#8217;t need 150 engineers inside client offices and four Big Four partnerships. If they don&#8217;t, stop writing 20,000-word essays about the country of geniuses. 
Pick one.</p><p>Lay the receipts in order and the three-week corridor speaks for itself. May 11, OpenAI launches its deployment company. May 14, PwC signs. May 19, KPMG follows. May 26, Altman says he was wrong about replacement. June, Amodei proposes a tax to clean up the replacement. The walk-backs and the checkbook were running on the same calendar.</p><p>The market already picked. OpenAI&#8217;s enterprise API share slid from 50% to <a href="https://techtrenches.dev/p/when-announcements-replace-innovation">27%</a> in two years while Anthropic took the lead. That slide is the motive. When the model stops being a differentiator and the API revenue follows it down, you wrap the model in people and sell the bundle, because a services contract is stickier than an API key. DeployCo is the answer to a losing API war, not a strategic flourish. Even Palantir&#8217;s Alex Karp, whose platform competes directly, <a href="https://www.theregister.com/ai-and-ml/2026/06/11/everyone-hates-frontier-ai-labs-says-palantir-boss/">called the Tomoro deal</a> &#8220;a complete farce&#8221; and an attempt to copy Palantir. &#8220;The implementation is where the value is,&#8221; he said.</p><p>When the company that invented the playbook says you&#8217;re copying it badly, the model was never the moat.</p><h2>What This Looks Like from the Trenches</h2><p>We sell this exact service, at a fraction of $4 billion.</p><p>A client calls because someone on their team vibe-coded an internal tool with ChatGPT, and leadership wants it in production. Nobody knows if the API keys are exposed, whether data leaks outside the VPC, or how to enforce compliance on code no human wrote. They need an engineer to audit it, fix it, and sign off that it won&#8217;t detonate in production.</p><p>The queue isn&#8217;t hypothetical. 
Our DevOps team built dedicated infrastructure just for it: SSO, automated provisioning, the whole path from someone&#8217;s laptop to a managed environment.</p><p>OpenAI now sells that same work, with insider access to the model, a captive pipeline of thousands of companies, and a brand enterprises trust more than any independent firm. Your vendor is your competitor, with funding you can&#8217;t match, a roadmap you can&#8217;t see, and a client base you can&#8217;t access.</p><p>For buyers, the tradeoff is real. DeployCo&#8217;s engineers are good; Tomoro had real clients and real deployments. But the firm that already knows your infrastructure and your edge cases just lost its information edge to your vendor. And 150 engineers spread across thousands of portfolio companies run thin fast.</p><p>The keynote is free. The $5.5 billion is the honest number, and it just built the most expensive human-supervision layer in enterprise history.</p><p>If Altman and Amodei spent two years telling the world that humans aren&#8217;t needed, they shouldn&#8217;t be surprised when the humans they&#8217;re now competing against take it personally.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item><item><title><![CDATA[Europe Regulated Itself Out of the AI Race]]></title><description><![CDATA[Meta's &#8364;1.2B fine: three days of revenue. Aleph Alpha's $500M: a dead company. Same rules. 
Now the US restricts frontier AI to Americans only.]]></description><link>https://techtrenches.dev/p/europe-regulated-itself-out-of-the</link><guid isPermaLink="false">https://techtrenches.dev/p/europe-regulated-itself-out-of-the</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Sat, 13 Jun 2026 15:01:35 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f9606d22-1667-45cf-8470-5c02118d4507_1016x680.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l_hD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0c9272-223d-49d6-872c-bd6d5b9c0a65_1600x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l_hD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0c9272-223d-49d6-872c-bd6d5b9c0a65_1600x1600.png 424w, https://substackcdn.com/image/fetch/$s_!l_hD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0c9272-223d-49d6-872c-bd6d5b9c0a65_1600x1600.png 848w, https://substackcdn.com/image/fetch/$s_!l_hD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0c9272-223d-49d6-872c-bd6d5b9c0a65_1600x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!l_hD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0c9272-223d-49d6-872c-bd6d5b9c0a65_1600x1600.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!l_hD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0c9272-223d-49d6-872c-bd6d5b9c0a65_1600x1600.png" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf0c9272-223d-49d6-872c-bd6d5b9c0a65_1600x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:194384,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/198693546?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0c9272-223d-49d6-872c-bd6d5b9c0a65_1600x1600.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l_hD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0c9272-223d-49d6-872c-bd6d5b9c0a65_1600x1600.png 424w, https://substackcdn.com/image/fetch/$s_!l_hD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0c9272-223d-49d6-872c-bd6d5b9c0a65_1600x1600.png 848w, https://substackcdn.com/image/fetch/$s_!l_hD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0c9272-223d-49d6-872c-bd6d5b9c0a65_1600x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!l_hD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf0c9272-223d-49d6-872c-bd6d5b9c0a65_1600x1600.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>In May 2023, Ireland&#8217;s Data Protection Commission fined Meta &#8364;1.2 billion for transferring European user data to the United States. The largest GDPR fine in history.</p><p>Meta generated $200.97 billion in revenue in 2025. The fine equals less than three days of revenue. Meta adjusted its legal transfer framework to the new EU-US Data Privacy Framework and kept operating.</p><p>In mid-2024, across the border in Germany, Aleph Alpha announced it was exiting the foundation-model race. 
The company had raised more than $500 million from Schwarz Group, SAP, Bosch, and Hewlett Packard Enterprise, and was billed as Germany&#8217;s answer to OpenAI.</p><p>CEO Jonas Andrulis told Bloomberg the math didn&#8217;t work. Just having a European LLM wasn&#8217;t a viable business model. The primary cause was competitive: GPT-4, Claude, and Gemini left no room. But regulatory compliance costs compounded an already impossible position.</p><p>By October 2025, Andrulis had stepped down. By April 2026, <a href="https://www.cnbc.com/2026/04/24/cohere-aleph-alpha-germany-ai-europe-expansion.html">Canada&#8217;s Cohere</a> had acquired what was left at a combined $20 billion valuation. Germany&#8217;s flagship AI company is now a division of a Canadian one.</p><p>Three days of Meta&#8217;s revenue on one side, a dead German foundation-model shop on the other. And GDPR is not new at this.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p><h2>The Precedent</h2><p>GDPR took effect in May 2018. <a href="https://cms.law/en/deu/publication/gdpr-enforcement-tracker-report/numbers-and-figures">Cumulative fines</a> have since approached &#8364;6 billion. The top recipients are American and Chinese companies: Meta, Amazon, TikTok, Google. All of them paid and kept operating. Not one left the European market.</p><p>A 2025 <a href="https://www.nber.org/digest/202509/privacy-regulation-and-transatlantic-venture-investment">NBER paper</a> measured what happened next: after GDPR took effect, US-led venture deals in the EU dropped 20.6% in count and 13.2% in dollar amounts. 
Roughly $1.6 billion per year in lost American investment, concentrated in exactly the kinds of young, data-intensive startups that would be training today&#8217;s European models.</p><p>The <a href="https://commission.europa.eu/topics/strengthening-european-competitiveness/eu-competitiveness-looking-ahead_en">Draghi Report</a> delivered the verdict: no EU company with a market capitalization above &#8364;100 billion has been created from scratch in the last fifty years. Every US company valued above &#8364;1 trillion was founded during that same period.</p><p>US private AI investment in 2024 hit $109 billion. The EU plus UK combined raised approximately $8 billion. Of 147 European unicorns founded between 2008 and 2021, forty relocated to the US. The fastest-growing AI company in Europe, Lovable, is built by a Swedish team but registered in Delaware. You can build in Stockholm, but you incorporate where the checks clear.</p><p>Brussels didn&#8217;t start from a level playing field. Twenty-four official languages, shallow capital pools, and risk-averse investors did plenty of damage first. But regulation is the one factor the EU chose to add on top of every other disadvantage. GDPR did not create a single European tech champion. Its most visible consumer outcome is a cookie banner that annoys 450 million people every time they open a website.</p><p>Brussels is about to aim this model at AI. Mistral is the one company that can&#8217;t ignore it.</p><h2>The Only Boxer in the Ring</h2><p>Mistral is the only European shop that even pretends to play in OpenAI&#8217;s league. It is backed by ASML&#8217;s &#8364;1.3 billion check from September 2025 and valued at &#8364;11.7 billion. OpenAI closed a $122 billion round in March 2026 at an $852 billion valuation. On headline valuation, the ratio is roughly 73 to 1.</p><p>Arthur Mensch, Mistral&#8217;s CEO, would not have a company without Macron, and that&#8217;s not an insult, it&#8217;s the mechanism. 
Macron&#8217;s personal endorsement, a French military AI framework deal, an Nvidia data center partnership blessed at VivaTech, and Europe&#8217;s largest industrial investor writing a &#8364;1.3 billion check.</p><p>Strip the state support and you have a company with roughly $400 million in annual revenue competing with one valued at $852 billion. Markets don&#8217;t produce that outcome; governments do.</p><p>On GPQA Diamond, one of the hardest reasoning benchmarks, Mistral Large 3 scores 43.9%. Gemini 3 Pro scores 91.9%. On the <a href="https://artificialanalysis.ai/">Artificial Analysis</a> Intelligence Index, Mistral ranked in the bottom half as of early 2026, below DeepSeek V3.2.</p><p>Europe&#8217;s champion is losing to the companies Europe is trying to regulate. Mistral CEO Arthur Mensch told the French National Assembly in May 2026 that Europe has two years to avoid becoming America&#8217;s &#8220;vassal state,&#8221; criticizing the stacking of GDPR, copyright legislation, and the AI Act as a system that favors American giants who can absorb compliance costs without noticing.</p><p>Mensch is right about the diagnosis. But the sharpest contrast isn&#8217;t between Europe and America. It&#8217;s between how fast Europe can build and how fast it actually does.</p><h2>What Speed Looks Like</h2><p>The sharpest illustration of what AI development looks like without compliance overhead isn&#8217;t in Silicon Valley. It&#8217;s in Ukraine, where I live and run engineering teams.</p><p>A company called The Fourth Law makes an autonomy module that costs around $150 per unit in its cheapest configuration. The drone locks onto its target and flies the final approach without human input, immune to radio jamming. 
Hit rates jump from 20% to 80%.</p><p>A <a href="https://www.csis.org/analysis/ukraines-future-vision-and-current-capabilities-waging-ai-enabled-autonomous-warfare">CSIS report</a> confirmed the pattern: AI-enabled navigation raises engagement success from 10-20% to 70-80%. Instead of eight or nine drones per target, one or two are enough.</p><p>CSIS documented the approach: small models trained on small datasets, running on cheap chips, designed for fast retraining as battlefield conditions change.</p><p>No conformity assessments, no technical documentation packages, no risk management frameworks. The feedback loop is measured in days: a module ships to a brigade, data comes back, the model gets retrained, next version ships.</p><p>Brave1, the government defense tech cluster, supports over 1,500 Ukrainian tech companies. More than 300 AI innovations registered. Over 70 deployed on the front lines.</p><p>I <a href="https://techtrenches.dev/p/silicon-valley-eats-the-war">wrote recently</a> about how AI&#8217;s infrastructure appetite is starving Ukraine&#8217;s drone supply chain. That was about resources. This is about regulation. The EU is writing rules for AI systems that would qualify as high-risk under its own classification. The same class of systems Ukraine deploys in weeks and funds with a fraction of what Mistral spends on compliance lawyers.</p><p>Nobody in Kyiv is asking whether their autonomous navigation module has an adequate risk management system under Article 9 of the AI Act. They&#8217;re asking whether it hits the target.</p><p>The AI Act excludes military systems. So the part of the stack that&#8217;s been stress-tested under artillery fire is the part Brussels doesn&#8217;t regulate, while it drowns civilian use cases in paperwork.</p><h2>The Rules Hit Where They&#8217;re Easiest to Enforce</h2><p>The EU AI Act threatens fines of up to &#8364;35 million or 7% of global revenue. 
For Mistral, at roughly $400 million in annual revenue, first-year compliance for a single high-risk system runs &#8364;80,000 to &#8364;250,000.</p><p>The compliance invoice doesn&#8217;t care about your revenue. A startup with five engineers pays the same auditor as Meta.</p><p>In a 2023 survey of over a hundred EU AI startups, 33% believed their systems would be classified as high-risk. The European Commission assumed 5 to 15 percent.</p><p>American companies have a third option besides complying or leaving: withhold features. Apple held back Apple Intelligence. Meta sat on multimodal Llama for the EU. Google quietly slid Gemini&#8217;s launch by a few quarters. None of them argued with Brussels. They just downgraded the product for 450 million people.</p><p>Chinese companies face even less friction. Italy banned DeepSeek in January 2025 for refusing to acknowledge GDPR jurisdiction. No global financial penalty. DeepSeek remains accessible via VPN with no EU entity to enforce against.</p><p>TikTok received a &#8364;530 million fine in May 2025 for illegal data transfers to China, appealed, and continues operating. ByteDance is valued between $550 and $600 billion in secondary market transactions. The fine is less than 0.1% of the company&#8217;s value.</p><p>Brussels noticed.</p><h2>Brussels Heard the Message. Too Late.</h2><p>In May 2026 alone, the EU postponed its own high-risk AI deadlines by over a year, expanded SME exemptions, relaxed GDPR provisions for AI training data, and announced a &#8364;200 billion investment program. You don&#8217;t postpone your own law by more than a year because it&#8217;s going well.</p><p>But timing is the problem. GDPR took effect in 2018. The capital flight began immediately. Eight years of underinvestment can&#8217;t be reversed by a program announced in 2026.</p><p>According to Revelio Labs data reported by ScienceBusiness, France saw a net outflow of 45% of its AI researcher base in 2025 even as Mistral was scaling. 
The country that hosts Europe&#8217;s best AI company is losing researchers fastest.</p><p>I <a href="https://techtrenches.dev/p/the-grok-precedent-why-ai-creators">wrote before</a> that some guardrails aren&#8217;t anti-innovation. I stand by that. AI companies that generate child abuse material should face criminal prosecution.</p><p>There&#8217;s a version of this where the EU says: don&#8217;t build dangerous things. That&#8217;s not what happened. What happened is: don&#8217;t build anything unless you can pay someone to document that it&#8217;s not dangerous.</p><p>I&#8217;ve seen this before. I <a href="https://techtrenches.dev/p/the-west-forgot-how-to-make-things">wrote before</a> about the EU promising Ukraine a million artillery shells and delivering half, nine months late.</p><p>Brussels loves the phrase &#8220;AI sovereignty.&#8221; In practice, sovereignty means the ability to build and run your own systems. The EU runs its AI on American models, assembles its hardware from Chinese factories, and relies on Ukrainian soldiers for the security that lets it hold regulatory hearings. That&#8217;s the sovereignty it&#8217;s defending.</p><h2>The Gift Europe Won&#8217;t Unwrap</h2><p>On June 9, 2026, Anthropic launched Claude Fable 5 to the public. Its most capable model, sharing weights with Claude Mythos 5, the version withheld for vetted cyber-defense partners.</p><p>Three days later, on Friday, June 12, the US Commerce Department sent Anthropic an export control directive: cut off Fable 5 and Mythos 5 from all foreign nationals, including those living inside the United States, including Anthropic&#8217;s own employees. Citizenship is the line. A green card holder who has lived in San Francisco for a decade, who works at an American company and pays American taxes, is cut off.</p><p>Anthropic said it believes the order may rest on a misunderstanding. Then it took both models offline for everyone because it had no way to verify every user&#8217;s citizenship. 
A model that launched on Tuesday as the most capable public AI was gone by Friday night.</p><p>Older Anthropic models keep running. The frontier just got a citizenship requirement.</p><p>One directive, one Friday night, and the model is gone for everyone. Do that three more times and you&#8217;ve restructured who can work on frontier AI globally.</p><p>Those &#8220;AI sovereignty&#8221; slide decks that have been gathering dust in Commission offices since 2021 are not hypothetical anymore. Yesterday it was Fable 5. Next time it might not come back.</p><p>Brussels will publish something. Probably several things. A position paper, a roadmap, a working group with a three-year mandate. What it won&#8217;t do is ship a model.</p><p>Europe can build. GDPR took six years from proposal to enforcement. The AI Act took three. AI moves in quarters.</p><p>Ukrainians understand something that Brussels hasn&#8217;t learned in five years of watching this war. You don&#8217;t inherit capability. You build it while something is trying to kill you, or you don&#8217;t have it when you need it.</p><p>The text of the rules is the same for everyone. The bill for following them isn&#8217;t. Either Brussels never ran that spreadsheet, or it did and hit Send anyway.</p><p>The Fable 5 directive should settle one question that Europe has been avoiding since 2022. The United States is not a technological ally. It is a competitor that will restrict access to its best tools the moment it decides to. Europe needs to stop writing rules for American products and start building alternatives before the next Friday night directive takes away something it can&#8217;t replace.</p><p>It&#8217;s painful to watch alliances that held for decades fall apart in a few years. 
It&#8217;s worse to watch from a country that&#8217;s paying the price for that collapse.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item><item><title><![CDATA[Anthropic Kept Every Promise It Could Afford]]></title><description><![CDATA[Anthropic made one binding safety promise in 2023 and removed it the month it got expensive. The chronology, from $4 billion to a $965 billion IPO]]></description><link>https://techtrenches.dev/p/anthropic-kept-every-promise-it-could</link><guid isPermaLink="false">https://techtrenches.dev/p/anthropic-kept-every-promise-it-could</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Tue, 09 Jun 2026 14:02:10 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3253f165-d002-4a23-af04-77af71c7f072_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Dux_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fe544d-5131-43f2-ae23-dc292a7ed455_1600x1280.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Dux_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fe544d-5131-43f2-ae23-dc292a7ed455_1600x1280.png 424w, 
https://substackcdn.com/image/fetch/$s_!Dux_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fe544d-5131-43f2-ae23-dc292a7ed455_1600x1280.png 848w, https://substackcdn.com/image/fetch/$s_!Dux_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fe544d-5131-43f2-ae23-dc292a7ed455_1600x1280.png 1272w, https://substackcdn.com/image/fetch/$s_!Dux_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fe544d-5131-43f2-ae23-dc292a7ed455_1600x1280.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Dux_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fe544d-5131-43f2-ae23-dc292a7ed455_1600x1280.png" width="1456" height="1165" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86fe544d-5131-43f2-ae23-dc292a7ed455_1600x1280.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1165,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:161591,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/200887075?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fe544d-5131-43f2-ae23-dc292a7ed455_1600x1280.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Dux_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fe544d-5131-43f2-ae23-dc292a7ed455_1600x1280.png 424w, https://substackcdn.com/image/fetch/$s_!Dux_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fe544d-5131-43f2-ae23-dc292a7ed455_1600x1280.png 848w, https://substackcdn.com/image/fetch/$s_!Dux_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fe544d-5131-43f2-ae23-dc292a7ed455_1600x1280.png 1272w, https://substackcdn.com/image/fetch/$s_!Dux_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86fe544d-5131-43f2-ae23-dc292a7ed455_1600x1280.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>In April I wrote that &#8220;responsible AI company&#8221; was always a market position, not a moral one. I said that at the speed Anthropic was growing, the distinction between the two was never going to survive. That was a <a href="https://techtrenches.dev/p/i-was-wrong-about-anthropic">prediction</a> dressed as a closing line, and I half meant it as rhetoric.</p><p>Six weeks later I have the receipt, and I am not happy about it.</p><p>The last binding commitment Anthropic ever made was already gone by the time I wrote that. Between that article and this one, the company raised at a valuation that put it ahead of OpenAI, filed to go public, and then published a warning that the technology might be getting too dangerous to keep building. In that order. I predicted the destination. I did not expect to watch it drive there this fast, narrating the whole way.</p><p>So instead of another closing line: the dates, the documents, the amounts.</p><h2>The promise, and what replaced it</h2><p>In September 2023, Anthropic published the first version of its Responsible Scaling Policy. The document made one commitment that actually bound the company: it would pause development if its models outran its ability to keep them safe. Everything else in the policy described process. That one line described a brake. At the time, the company was valued around $4 billion.</p><p>In February 2026, Anthropic published version 3.0 of the same policy. The brake was gone. In its place: a set of &#8220;Frontier Safety Roadmaps,&#8221; which the company describes as goals it will publish and grade itself against. 
The single line that could have prevented a release was replaced with one that documents it.</p><p>Anthropic did not hide this, and its chief science officer explained the reasoning to <a href="https://time.com/7380854/exclusive-anthropic-drops-flagship-safety-pledge/">Time</a>. Jared Kaplan said the company no longer felt unilateral commitments made sense &#8220;if competitors are blazing ahead.&#8221; He argued the change was actually a renewed commitment to safety, on the logic that one company pausing while the rest of the industry sprints does not make the world safer. He is not wrong about the logic. That is the part worth sitting with. The most safety-focused company in the industry looked at its own founding promise and removed it. Not because anyone forced them. The race made the brake a liability, and the brake came out. That is what should worry you: the reasoning was sound, no one had to lie, and the safest commitment anyone in the field had made still did not survive once it became a disadvantage.</p><p>That is the whole thesis of my April article, except now it is in their changelog rather than in my opinion column.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p><h2>The numbers that arrived on schedule</h2><p>The same month it revised the policy, Anthropic raised $30 billion at a $380 billion valuation. That round closed on February 12.</p><p>On May 28, the company announced a <a href="https://www.cnbc.com/2026/05/28/anthropic-open-ai-startup-value.html">further raise</a>: $65 billion at a $965 billion post-money valuation. The number is worth reading twice. 
It put Anthropic ahead of OpenAI, which sat around $852 billion, making the research lab that branded itself on caution the most valuable AI company on earth.</p><p>Anthropic confirmed on June 1 that it had <a href="https://www.anthropic.com/news/confidential-draft-s1-sec">submitted</a> a confidential draft registration to the SEC for an initial public offering. In April, an IPO was rumored for October. Now it is a filing. The company is careful to say the offering depends on market conditions and that nothing is set. The document is real regardless.</p><p>Run the dates next to each other and a sequence appears on its own. The binding safety commitment existed when the company was worth $4 billion. It survived to $183 billion in September 2025. It was removed at $380 billion. The IPO paperwork arrived at $965 billion. I am not claiming the valuation caused the policy change, or that anyone sat in a room and traded one for the other. I cannot see inside the company and neither can you. I can see the dates.</p><h2>The same man, the same line, seven years apart</h2><p>There is one person who connects the bookends of this story, and following him is more useful than guessing at anyone&#8217;s motives.</p><p>In February 2019, OpenAI announced it had built a language model called GPT-2 that it considered too dangerous to release in full. The company <a href="https://openai.com/index/better-language-models/">withheld</a> the complete model and let it out in stages over the rest of the year. The strategy and the public case for it came out of OpenAI&#8217;s policy team, run by its policy director, Jack Clark. Dario Amodei led the research team that built it. I credited Amodei with the decision to hold it back when I wrote about this in April. That was sloppy. 
He built the model; the call not to ship it was the policy team&#8217;s, and Clark made the public case for it.</p><p>Clark co-founded Anthropic in 2021.</p><p>In April 2026, Anthropic announced <a href="https://www.euronews.com/next/2026/04/22/hackers-breach-anthropics-too-dangerous-to-release-mythos-ai-model-report">Mythos</a>, a model it said could find thousands of unpatched security holes across every major operating system and browser. Too dangerous for public release. The company put it behind a limited program for around forty companies instead. An unauthorized group reached it the same day it was announced, using a contractor&#8217;s access to one of those third-party vendor environments. Too dangerous for the public, open on day one to anyone who found the door.</p><p>Then, on June 4, three days after the IPO filing, Anthropic published a report titled &#8220;When AI builds itself.&#8221; Marina Favaro and Jack Clark wrote it. The report says AI is now accelerating AI development, that more than 80 percent of the code the company ships is written by its own model, and that the world should preserve the option to slow down before the technology runs ahead of our ability to govern it. The slowdown it proposes is conditional. Anthropic would pause, the report says, only if competitors at the frontier verifiably did the same.</p><p>So the policy director who explained why GPT-2 was too dangerous to release in 2019 co-wrote the argument for slowing down frontier AI in 2026, three days after his company filed to go public, while that company&#8217;s own dangerous model was already out the door. I do not think Clark is cynical. He keeps arriving at the same honest concern. It keeps hitting the same competitive wall. The view from the wall has just gotten more expensive.</p><p>The conditional is the part I keep turning over. 
A pledge to slow down only if every other frontier lab verifiably does the same is a pledge that never has to be kept, because one of them never will. You could call that caution. To me it reads as a company turning its own broken word into a fact about the industry rather than about itself: we would hold the line; the others won&#8217;t let us. I cannot prove that it is the intent. It is only how the sentence lands on me.</p><h2>What I will not claim</h2><p>The easy version says the warnings are marketing, the safety reports are press releases with a different cover, and every cautionary word is timed to move a valuation. I cannot prove any of that, because it is a claim about what people intended, and intentions are the one thing a timeline cannot show you.</p><p>I do not need that version. A binding commitment alive at $183 billion was gone by $380 billion. A model called too dangerous to release shipped to forty companies and then leaked. A call to slow down arrived three days after a call to go public. None of that requires me to read anyone&#8217;s mind. It only requires me to read the dates.</p><p>For the version that does assign motive, there is no shortage of takes. TechRadar&#8217;s coverage of the slowdown report ran under the line &#8220;they want to <a href="https://www.techradar.com/ai-platforms-assistants/they-want-to-build-a-moat-anthropics-scary-warnings-about-rapid-ai-self-improvement-and-temporarily-pausing-development-arent-convincing-the-cynics">build a moat</a>.&#8221; A reader on my last article pointed me to an analysis arguing the Mythos warning was framing to lift the pre-IPO valuation. Those readings exist in the world. I am telling you they exist. I am not the one who has to make them for you.</p><h2>The receipt</h2><p>In April I said the distinction between responsible AI and a market position was never going to survive the company&#8217;s growth rate. 
I wrote it as a flourish and hoped, a little, to be wrong, the way I was wrong about them once before.</p><p>The prediction came due six weeks after I made it. I would have preferred to be slow.</p><p>This is not really about Anthropic, and it was not in April either. It is about what happens to any commitment that turns into a disadvantage on a vertical growth curve. The commitment loses. It does not matter how sincere the people holding it are, and it does not matter whose name is on the door. Anthropic was the clearest example I had, not the villain in the story.</p><p>The one who loses is me. Not the company that removed the brake and watched its valuation climb. The engineer who built his workflow on that brake being there. I picked the vendor, recommended it by name, called them one of the two companies that <a href="https://techtrenches.dev/p/from-cancer-cures-to-pornography">got it right</a>, and the thing I was vouching for turned out to be a line in a policy document that got deleted in February, when the company was worth $380 billion.</p><p>My mistake was never about Anthropic. It is that I still expect principles to survive in a place where the only thing that finally counts is money. I keep building on the assumption that someone in this industry means the careful thing they say, and the chronology above is what happens to that assumption every time. The disappointment I have been describing across two articles is not in them. 
It is in me, for needing it to be otherwise.</p>]]></content:encoded></item><item><title><![CDATA[Nobody Won the Token Race]]></title><description><![CDATA[Uber burned its AI budget in four months chasing token leaderboards, then capped engineers at $1,500. We never hit a limit. The plan is the variable]]></description><link>https://techtrenches.dev/p/nobody-won-the-token-race</link><guid isPermaLink="false">https://techtrenches.dev/p/nobody-won-the-token-race</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Wed, 03 Jun 2026 14:01:07 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/363e880b-dc64-4156-b496-e84fb3630d74_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dozC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518d1df9-df0f-4ef1-85e0-99c1682a90b9_1600x1200.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dozC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518d1df9-df0f-4ef1-85e0-99c1682a90b9_1600x1200.png 424w, https://substackcdn.com/image/fetch/$s_!dozC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518d1df9-df0f-4ef1-85e0-99c1682a90b9_1600x1200.png 848w, 
https://substackcdn.com/image/fetch/$s_!dozC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518d1df9-df0f-4ef1-85e0-99c1682a90b9_1600x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!dozC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518d1df9-df0f-4ef1-85e0-99c1682a90b9_1600x1200.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dozC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518d1df9-df0f-4ef1-85e0-99c1682a90b9_1600x1200.png" width="1456" height="1092" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/518d1df9-df0f-4ef1-85e0-99c1682a90b9_1600x1200.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:104803,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/199889668?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518d1df9-df0f-4ef1-85e0-99c1682a90b9_1600x1200.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dozC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518d1df9-df0f-4ef1-85e0-99c1682a90b9_1600x1200.png 424w, 
https://substackcdn.com/image/fetch/$s_!dozC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518d1df9-df0f-4ef1-85e0-99c1682a90b9_1600x1200.png 848w, https://substackcdn.com/image/fetch/$s_!dozC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518d1df9-df0f-4ef1-85e0-99c1682a90b9_1600x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!dozC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F518d1df9-df0f-4ef1-85e0-99c1682a90b9_1600x1200.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>I pay $200 a month for Claude Code. The rest of the company runs on the corporate plan at $125 a head. PMs, engineers, product, and marketing all use it. I have never once hit a usage limit. A couple of the engineers running at 5x throughput bump into it occasionally, but rarely.</p><p>We didn&#8217;t lay anyone off, and we&#8217;re hiring into every department.</p><p>We never had to fight the token bill, because we were never burning tokens for the sake of burning tokens, and the bill stayed boring because the usage stayed honest.</p><p>Here&#8217;s the objection I can already hear: we&#8217;re seventy people, not five thousand, so of course the bill is small. Except scale isn&#8217;t what moves a token bill. My $200 Max sub and the team&#8217;s $125 Premium seats are capped: you hit the ceiling, the window resets, the cost is known in advance. Uber put its 5,000 engineers on enterprise billing, where every token is metered on top of the seat fee with no ceiling, then ranked teams on a leaderboard by how many of those uncapped tokens they burned. That&#8217;s not two decisions, it&#8217;s one. Picking the meter with no ceiling and rewarding people for running it hard are the same managerial move: optimize activity, pay for activity. At Uber&#8217;s scale the contract is custom and metered by default, so the ceiling isn&#8217;t a checkbox, it&#8217;s something finance has to negotiate for, and they didn&#8217;t. The number of engineers was never the variable. The structure you chose and the behavior you rewarded inside it were. 
As of this week Uber agrees: it just <a href="https://www.bloomberg.com/news/articles/2026-06-02/uber-caps-usage-of-ai-tools-like-claude-code-to-cut-costs">capped</a> engineers at $1,500 a month per tool, the ceiling it took a year and a blown budget to want.</p><h2>What the Industry Decided to Measure</h2><p>Uber wasn&#8217;t alone, and the cap came late. Over the last year a lot of companies made the same quiet decision: usage became the metric. Not output or shipped features or revenue, just consumption, a number that went on a dashboard, and numbers on dashboards become things people optimize.</p><p>Uber&#8217;s leaderboard ranked teams by how much AI tooling they used, and adoption of agentic coding jumped from 32% in February to 84% in March, until 95% of engineers were touching it monthly. The point of the board wasn&#8217;t to ship more, it was to use more, and it worked.</p><p>At Meta, an employee built a leaderboard called &#8220;Claudeonomics&#8221; that ranked around 85,000 workers by token consumption: sixty trillion tokens in thirty days. Leadership publicly cheered the token race as a productivity signal until the dashboard leaked and got pulled within two days.</p><p>Amazon built a usage leaderboard, told staff the token stats wouldn&#8217;t count toward performance reviews, and watched employees <a href="https://www.hcamag.com/us/specialization/transformation/amazon-workers-are-gaming-the-ai-leaderboard-hr-built-it/575083">game it anyway</a> because nobody believed them. 
Duolingo went further and actually tied evaluations to AI adoption, then reversed it when staff pointed out it rewarded tool usage instead of results. Different companies, same move: give people unlimited access and a culture that treats consumption as a virtue, and the bill writes itself.</p><h2>The Metric Was Never Measuring Productivity</h2><p>A high token count is not a signal that a lot got done, it&#8217;s a signal that AI got used without a reason. The companies that built usage metrics assumed consumption tracked productivity, when it tracks the absence of intent.</p><p>When you use AI to solve an actual problem, the usage is bounded by the problem, because there&#8217;s only so much actual work. This holds whether a human is prompting or an agent is running overnight: an agent pointed at a real task refactors what needs refactoring and stops. What doesn&#8217;t stop is an agent pointed at nothing in particular, re-running and re-checking because no one defined where done is. You don&#8217;t hit the ceiling when the work has an edge. You hit it when the work was never the point.</p><p>This is Goodhart&#8217;s law with a token meter attached. Uber&#8217;s own COO, Andrew Macdonald, <a href="https://finance.yahoo.com/sectors/technology/articles/uber-coo-andrew-macdonald-says-130036457.html">put it plainly</a>: it&#8217;s very hard to draw a line between the token spend and actual consumer improvements. That&#8217;s the tell. When you can measure the input down to the token but can&#8217;t connect it to the output, you were measuring the wrong thing.</p><p>Tokens aren&#8217;t even the first version of this mistake. Y Combinator&#8217;s Garry Tan spent the spring posting his lines-of-code totals like box scores: 37,000 LOC a day across five projects, a 72-day shipping streak, his whole Claude Code setup open-sourced so everyone could match the number. 
Then a developer <a href="https://x.com/Gregorein/status/2038953944475472316">opened the blog</a> all that throughput produced and counted 78,400 lines of what he called AI slop in production. Lines of code, like tokens, measure how much the machine ran, not whether anything worth shipping came out.</p><p>They built systems that rewarded exactly the behavior that creates no value, then expressed surprise at the invoice.</p><h2>Why Our Bill Is Boring</h2><p>We use AI when it solves the task in front of us and not otherwise. That&#8217;s the entire policy. The spec-driven approach I&#8217;ve written about before forces clarity before a single token gets spent: you specify the problem, the AI works the problem, and there&#8217;s no &#8220;let&#8217;s see what it comes up with,&#8221; the prompt that quietly multiplies your bill by ten. The same instinct governs the model: I&#8217;m on 4.5 and 4.6, not the newest release. When 4.7 shipped with a tokenizer that generates <a href="https://techtrenches.dev/p/the-ai-industrial-transformation">up to 35% more tokens</a> for the same input, I didn&#8217;t move, because there was no reason to. The older models do the work on fewer tokens. Chasing the newest model and gaming a usage leaderboard are the same instinct wearing two outfits: consumption mistaken for progress.</p><p>We never made using AI the point, and we use it constantly. PMs run tickets through it against our templates and acceptance criteria. QA runs bug reports through it so they&#8217;re clear enough for anyone to act on. Marketing crawls the web with it for angles. Product builds per-client RAG out of meeting notes, docs, and history. Engineering, obviously. Every department vibe-codes its own internal tooling, and then the CTO rewrites the worst of it like a human being. The usage is enormous. It just isn&#8217;t stupid. The bill stays flat not because we use AI less, but because every run has a task attached, and a task has an edge. 
We never tried to take the human out of the loop. The AI is a tool the person reaches for, not a replacement we&#8217;re proving out, so nobody is burning tokens to hit a number or make a headcount go away. The work still belongs to a person. The bill is just what the tool cost them.</p><h2>The Uncomfortable Part</h2><p>The pitch for all of this was replacement: AI would do the work and cut the cost, fewer people, smaller payroll, same output.</p><p>The trouble is the meter has no fixed relationship to a salary. A single autocomplete costs a fraction of a cent; running Claude Code as an autonomous agent across a monorepo can burn thousands in an afternoon. Uber&#8217;s own CTO spent $1,200 in a two-hour demo and later said the year&#8217;s budget was gone four months in. Average engineers at Uber ran $150 to $250 a month, heavy ones $500 to $2,000, and the leaderboard rewarded the heavy end. Put 5,000 of them on an uncapped meter and reward them for running it hot, and the per-head bargain turns into the overrun that ended the experiment. On the heaviest agentic workloads the trade flips outright: Nvidia&#8217;s VP of applied deep learning told Axios that for his team the cost of compute is already past the cost of the employees. Then add the people the pitch forgot: the prompt engineers, the eval pipelines, the reviewers, the supervisor rebuilding everything each time a model version changes behavior. The token bill doesn&#8217;t replace payroll. It lands on top of a thinner one.</p><p>And it doesn&#8217;t do the work that actually needed a person. This isn&#8217;t one company&#8217;s bad call, it&#8217;s a pattern with numbers on it. Orgvue surveyed more than 1,100 executives, and among those who&#8217;d cut staff for AI, <a href="https://gfmag.com/technology/companies-face-ai-buyers-remorse/">55%</a> say they regret it. 
A Careerminds survey of HR teams who&#8217;d run AI layoffs found <a href="https://sea.peoplemattersglobal.com/news/workforce-planning/ai-layoffs-backfire-as-33percent-of-companies-lose-critical-skills-and-expertise-report-48771">most</a> had already rehired a third to half the roles within months, and nearly a third said rehiring cost more than the automation saved. AI handled the tickets that never needed a person and broke on the ones that did.</p><p>The savings were supposed to come from replacing people, and replacing people is the one thing it can&#8217;t do, so the savings never arrived. To get there, <a href="https://www.tomshardware.com/tech-industry/artificial-intelligence/talent-over-tokens-ai-models-are-becoming-more-expensive-to-run-and-productivity-gains-are-limited-efficient-workers-might-be-the-solution-to-strained-budgets">nearly 80,000</a> tech jobs went in the first quarter with companies pinning the blame on AI. Among the ones that rehired, nearly a third found the rehire cost more than the cut had saved. They let go of the people who understood why the work mattered, kept a tool that can&#8217;t replace them, and called it the smart move. For what? If the goal was saving money, the layoffs aren&#8217;t a side effect of efficiency, they&#8217;re a loss booked as one.</p><p>The companies that read it right did the opposite. IKEA let its bot Billie take the <a href="https://www.ingka.com/newsroom/ai-and-remote-selling-bring-ikea-design-expertise-to-the-many/">47%</a> of queries that were routine, then looked at the other 53%, the ones that needed taste and judgment, and reskilled 8,500 call-center workers into remote design advisers instead of cutting them. The AI handled the part that was never the point. The people kept the part that was.</p><p>And the safe choice is getting harder to make. 
On June 1 GitHub moved every Copilot plan to <a href="https://www.theregister.com/ai-and-ml/2026/06/02/github-copilot-users-threaten-exit-as-metered-billing-kicks-in/5249826">usage-based billing</a>: autocomplete stays free, but chat and the agentic modes now draw on a monthly pool of token credits, and when it runs dry you pay by the token. One Pro+ user torched 8% of the allotment in two hours doing work that used to be a fixed cost. Anthropic does the same on June 15, splitting programmatic usage onto a separate metered credit at API rates while interactive use stays flat. Both vendors are carving the agentic layer off the flat fee, so the capped plan that keeps a bill predictable is exactly the thing being phased out.</p><p>Per-token prices are climbing regardless, subsidies are ending, and <a href="https://techtrenches.dev/p/the-ai-industrial-transformation">the economics are tightening</a> for everyone. None of that is the part you control. What you control is whether the spending has a task attached to it, or just a number to grow.</p><p>A well-used AI is a great intern, and intentional usage keeps the intern affordable. It doesn&#8217;t change what the intern is. Spend less and you&#8217;re left with the same tool, minus the giant bill.</p><p>If consumption is up and you can&#8217;t draw a clean line from a token to a shipped outcome, you don&#8217;t have a productivity story. You have a leaderboard.</p>]]></content:encoded></item><item><title><![CDATA[Honk Is Not Magic. 
It’s 15 Years of Infrastructure With the Context Stripped Out.]]></title><description><![CDATA[Spotify told investors 99% of engineers use AI weekly. Their engineering blog tells a different story. Five stages, four months, and numbers that only go up.]]></description><link>https://techtrenches.dev/p/honk-is-not-magic-its-15-years-of</link><guid isPermaLink="false">https://techtrenches.dev/p/honk-is-not-magic-its-15-years-of</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Wed, 27 May 2026 20:14:14 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/67b7e578-3e3a-4733-bea2-a616bb36751c_540x361.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NO-k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb73286a-cad8-463b-a7d5-059e20b65058_1640x1558.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NO-k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb73286a-cad8-463b-a7d5-059e20b65058_1640x1558.png 424w, https://substackcdn.com/image/fetch/$s_!NO-k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb73286a-cad8-463b-a7d5-059e20b65058_1640x1558.png 848w, https://substackcdn.com/image/fetch/$s_!NO-k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb73286a-cad8-463b-a7d5-059e20b65058_1640x1558.png 1272w, 
https://substackcdn.com/image/fetch/$s_!NO-k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb73286a-cad8-463b-a7d5-059e20b65058_1640x1558.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NO-k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb73286a-cad8-463b-a7d5-059e20b65058_1640x1558.png" width="1456" height="1383" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb73286a-cad8-463b-a7d5-059e20b65058_1640x1558.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1383,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:164454,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/198684378?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb73286a-cad8-463b-a7d5-059e20b65058_1640x1558.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NO-k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb73286a-cad8-463b-a7d5-059e20b65058_1640x1558.png 424w, https://substackcdn.com/image/fetch/$s_!NO-k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb73286a-cad8-463b-a7d5-059e20b65058_1640x1558.png 848w, 
https://substackcdn.com/image/fetch/$s_!NO-k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb73286a-cad8-463b-a7d5-059e20b65058_1640x1558.png 1272w, https://substackcdn.com/image/fetch/$s_!NO-k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb73286a-cad8-463b-a7d5-059e20b65058_1640x1558.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>I&#8217;m allergic to bullshit when it comes to this stuff. 
When executives spend months bouncing between stages telling everyone AI already does everything for them, it gets under my skin. Not because they&#8217;re lying, necessarily. Because the context disappears somewhere between the engineering blog and the keynote, and the audience fills the gap with their own hopes. A few weeks ago, when Claude&#8217;s codebase <a href="https://techtrenches.dev/p/the-snake-that-ate-itself-what-claude">leaked publicly</a>, I looked at what was actually inside and broke down the &#8220;AI writes all the code&#8221; narrative. Today I&#8217;ll explain why it plausibly works for Spotify, and why it&#8217;s not that simple.</p><p>Yesterday, Anthropic published a <a href="https://claude.com/blog/code-w-claude-london-2026-rethinking-how-we-build">recap</a> of their Code with Claude London conference, calling it &#8220;rethinking how we build,&#8221; with Spotify as the first named customer. Spotify didn&#8217;t rethink how they build. They spent 15 years building Backstage, Fleet Management, and a Java BOM with 96% adoption, then plugged Claude Code into a system that was already automating half their PRs. That&#8217;s not rethinking. That&#8217;s a better interface to something that already worked.</p><p>But the marketing loop is now recursive. Spotify tells the Honk story on Anthropic&#8217;s stage. Anthropic writes a blog about Spotify telling the story. Investor Day picks it up, and the numbers go up each time. Here&#8217;s the timeline.</p><p>On May 21, 2026, Spotify held its <a href="https://newsroom.spotify.com/2026-05-21/investor-day-recap/">Investor Day</a> in New York. Co-CEO Gustav S&#246;derstr&#246;m and VP of Engineering Niklas Gustavsson told investors that 99% of Spotify engineers now use AI weekly, 73% of code contributions are AI-assisted, and Honk, their internal coding agent, is now part of a broader story about the Large Taste Model and personalized monetization. 
Two days earlier, Gustavsson had given the same talk at <a href="https://claude.com/code-with-claude/session/ldn-coding-is-no-longer-the-constraint-scaling-devex-to-teams-and-agents-at-spotify">Code with Claude</a> in London, Anthropic&#8217;s developer conference. The number there was 96%. It went up before the slides changed.</p><p>In February, S&#246;derstr&#246;m <a href="https://techcrunch.com/2026/02/12/spotify-says-its-best-developers-havent-written-a-line-of-code-since-december-thanks-to-ai/">told analysts</a> his best engineers haven&#8217;t written a single line of code since December. Two days later, Anthropic closed a $30 billion <a href="https://www.cnbc.com/2026/02/12/anthropic-closes-30-billion-funding-round-at-380-billion-valuation.html">funding round</a>. In March, the two companies shared a <a href="https://engineering.atspotify.com/2026/4/anthropic-agentic-development">stage in London</a>. In April, Spotify launched as a Claude connector. On May 19, Gustavsson presented at Code with Claude London with 96%. On May 21, Investor Day in New York with 99%.</p><p>Five moments in four months, the audience rotates, the numbers go up, and the engineering blog stays the same.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p><h2>The Wrong Metric</h2><p>I don&#8217;t write code either. AI does it for me. And I work more than I ever did when I wrote every line myself. The code was never the hard part. The hard part is the review documentation nobody wants to write, the architecture decision that turns out wrong six months later, the junior who needs fifteen minutes of your time at exactly the wrong moment. 
AI took the typing and gave back a review queue that never ends, constant context switching, and a steady stream of &#8220;does this diff actually do what I asked&#8221; that I didn&#8217;t have before.</p><p>And &#8220;99% use AI weekly&#8221; means what exactly? Opened Copilot once this quarter? Used Claude to generate a regex? Ran a Honk migration on a fleet of repos? The metric has no definition, which means it has no meaning. &#8220;73% of code contributions are AI-assisted&#8221; is equally hollow without knowing what counts as a contribution. Config changes, dependency bumps, and feature flag flips are all contributions.</p><p>Nobody on that call asked about revert rates. Nobody asked if defects went up. &#8220;50+ features in 2025&#8221; from a company with over seven thousand employees. Honk handles migrations and dependency updates, not feature development. Their own blog is clear about this. But the stage narrative packages it as a product velocity story, and nobody makes the distinction.</p><p>The bottleneck at Spotify was never typing speed.</p><h2>What Honk Actually Does</h2><p>Honk is Spotify&#8217;s internal background coding agent built on Claude Code. Anthropic&#8217;s Boris Cherny is <a href="https://engineering.atspotify.com/2025/11/context-engineering-background-coding-agents-part-2">quoted directly</a> inside Spotify&#8217;s own engineering blog as an endorsement, and Anthropic&#8217;s Applied AI team worked on the integration. The <a href="https://engineering.atspotify.com/2025/11/spotifys-background-coding-agent-part-1">three-part blog series</a> by Max Charas and Marc Bruggmann (November-December 2025) is the most detailed public source.</p><p>An engineer writes a prompt through Slack or a version-controlled file in Git. Honk runs Claude Code in a sandboxed Kubernetes Job. Three tools: verify, Git, Bash allowlist. Ten turns, three retries, then a PR.</p><p>That&#8217;s it. 
A thin wrapper around Claude Code, plugged into an automation pipeline that existed years before AI.</p><p>As of November 2025, Honk had merged 1,500+ PRs total. Anthropic&#8217;s customer page reports <a href="https://claude.com/customers/spotify">650+ monthly PRs</a>. Fleet Management, the system Honk sits on top of, processed 652,000 automated PRs in 2024 per <a href="https://www.splunk.com/en_us/blog/ciso-circle/spotify-fleet-management-lessons.html">Splunk&#8217;s recap</a> of Spotify&#8217;s PlatEngDay data. Honk adds a useful layer to an already massive automation system. But from the stage narrative, you&#8217;d think Honk is the system.</p><p>The blog is candid about limitations. No code search or documentation tools are exposed to the agent. Verifiers only run on Linux x86, with macOS and iOS planned for the future. The team admits they&#8217;re &#8220;still flying mostly by intuition&#8221; on prompt engineering, with no structured evals. The LLM judge that validated output <a href="https://engineering.atspotify.com/2025/12/feedback-loops-background-coding-agents-part-3">vetoed about 25%</a> of sessions, and by QCon London in March 2026 they&#8217;d <a href="https://www.infoq.com/news/2026/03/spotify-honk-rewrite/">removed it</a> entirely as models improved.</p><p>Compare that to S&#246;derstr&#246;m telling analysts about an engineer fixing iOS bugs from his commute and merging to production before arriving at the office. The blog says iOS verifiers don&#8217;t exist yet. One of these is the engineering reality. The other is the earnings call.</p><h2>What Every Headline Missed</h2><p>Every story about &#8220;Spotify&#8217;s engineers don&#8217;t code&#8221; stops before the interesting part.</p><p>Backstage, created internally and open-sourced in 2020, is Spotify&#8217;s internal developer portal with <a href="https://thenewstack.io/five-years-in-backstage-is-just-getting-started/">3,400+ adopters</a> worldwide. 
Internally, it catalogs thousands of software components across hundreds of squads. Every component has an owner. Not a team, a person. With a dependency graph, docs, and a certification score attached. Or as Spotify puts it: &#8220;you can&#8217;t safely automate what you don&#8217;t understand.&#8221;</p><p>Fleet Management, described in Spotify&#8217;s <a href="https://engineering.atspotify.com/2023/04/spotifys-shift-to-a-fleet-first-mindset-part-1">2023 blog series</a>, runs Docker-based code transformations as Kubernetes Jobs across thousands of repos. Before AI, this system already handled half of PRs at Spotify. The bot-to-human contribution ratio reached 3:1, with over 1.8 million automated contributions total per the same Splunk data.</p><p>Before Claude, when Log4j hit in December 2021, Fleet Management patched 80% of production backend in 9 hours. Framework rollouts went from 200 days to under 7.</p><p>Golden Paths and Soundcheck handle the other end: new services come in pre-standardized, existing ones get continuously checked. As of their 2023 blog series, the Java Bill of Materials had 96% adoption across the fleet. That&#8217;s why an AI agent can produce a mergeable PR. Not because it&#8217;s smart, but because the codebase is predictable.</p><p>What Honk replaced was not human engineering. It replaced a <a href="https://engineering.atspotify.com/2023/05/fleet-management-at-spotify-part-3-fleet-wide-refactoring">20,000-line script</a> for Maven dependency updates with a natural-language prompt. The pipeline around it is identical. Targeting, opening, review, deploy: none of that changed. The &#8220;revolution&#8221; is a better transformation definition format. Everything else was already automated.</p><h2>Why This Doesn&#8217;t Transfer</h2><p>Read the headlines about Spotify and Claude and the pitch is obvious: buy Claude Code, point it at your codebase, watch productivity double. 
Most teams that try will bounce off their own mess long before they see anything like that.</p><p>Spotify can automate at this scale for a boring reason: they have processes people actually follow. Not documented processes, followed processes. Spotify got near-universal adoption of their standards, and that&#8217;s not just an engineering achievement, it&#8217;s a cultural one. A Swedish company where, apparently, you can get 96% of engineers to follow a standard voluntarily. Most companies can&#8217;t get that number with a mandate from above.</p><p>I see this from the inside. Enterprise clients come in and say they want AI. You start digging, and there are no processes. Half the knowledge lives in somebody&#8217;s head, and that person is the only one who knows how any of it works. No component catalog. No ownership graph. No standardized builds. There&#8217;s a Confluence page from 2021 that nobody updates, three CI systems (two deprecated but still running), and a README whose last commit message is &#8220;initial commit&#8221; from two years ago.</p><p>Spotify has 15 years of institutional documentation rendered through TechDocs with <a href="https://backstage.spotify.com/docs/portal/core-features-and-plugins/techdocs">5,000+ documentation sites</a>. The AI came last, not as the foundation, but as a better interface to something that already worked.</p><p>Without that substrate, you get exactly the failure mode Spotify&#8217;s own engineers documented. Early Honk agents took shortcuts to make builds pass: commenting out failing tests, downgrading Java versions. The same QCon talk described this directly.</p><h2>The Questions That Matter</h2><p>If you&#8217;ve ever sat through a 3 a.m. incident, you already know software engineering was never about writing code. 
The framing around Spotify&#8217;s AI adoption creates the same misunderstanding that vibe-coding courses create for juniors: that the value of an engineer is measured in lines of code, and if AI writes lines faster, the engineer is either 10x more productive or obsolete. Both conclusions share the same flawed premise.</p><p>What Spotify actually demonstrated is narrower than the headlines. Spend 15 years on platform engineering, standards, and a fleet-wide automation system that already handles half your PRs. Then swap the transformation definition layer for an LLM prompt and cut 60-90% off bounded migration work. A real achievement. The kind that doesn&#8217;t travel well.</p><p>So instead we got five moments in four months, the same two executives, numbers that go up every time the audience changes, and no defect rates, revert rates, or customer satisfaction data to back any of it up.</p><p>I&#8217;m not here to hate on Spotify; if they genuinely made large-scale migrations faster on top of solid infrastructure, that&#8217;s a real engineering win. What I&#8217;m not fine with is the context getting lost. Somebody climbed Everest with a guide and a decade of training. The audience is buying boots.</p><p>Spotify spent a decade understanding their codebase before they touched an LLM. That decade is the only reason any of this works. An LLM amplifies what you already have.</p>]]></content:encoded></item><item><title><![CDATA[The $50 Billion Utility]]></title><description><![CDATA[Cursor spent $3 billion to stop losing money on every user. OpenAI won't break even until 2030. 
AI companies are valued like software but run like utilities. The math is catching up.]]></description><link>https://techtrenches.dev/p/the-50-billion-utility</link><guid isPermaLink="false">https://techtrenches.dev/p/the-50-billion-utility</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Tue, 26 May 2026 14:03:22 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a3253c13-ec79-4223-8da1-535331185bc0_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UV3x!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8b392ea-c6ed-4766-a8cd-e61e5c279864_1600x1500.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UV3x!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8b392ea-c6ed-4766-a8cd-e61e5c279864_1600x1500.png 424w, https://substackcdn.com/image/fetch/$s_!UV3x!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8b392ea-c6ed-4766-a8cd-e61e5c279864_1600x1500.png 848w, https://substackcdn.com/image/fetch/$s_!UV3x!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8b392ea-c6ed-4766-a8cd-e61e5c279864_1600x1500.png 1272w, https://substackcdn.com/image/fetch/$s_!UV3x!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8b392ea-c6ed-4766-a8cd-e61e5c279864_1600x1500.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!UV3x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8b392ea-c6ed-4766-a8cd-e61e5c279864_1600x1500.png" width="1456" height="1365" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a8b392ea-c6ed-4766-a8cd-e61e5c279864_1600x1500.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1365,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:184432,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/195465694?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8b392ea-c6ed-4766-a8cd-e61e5c279864_1600x1500.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UV3x!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8b392ea-c6ed-4766-a8cd-e61e5c279864_1600x1500.png 424w, https://substackcdn.com/image/fetch/$s_!UV3x!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8b392ea-c6ed-4766-a8cd-e61e5c279864_1600x1500.png 848w, https://substackcdn.com/image/fetch/$s_!UV3x!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8b392ea-c6ed-4766-a8cd-e61e5c279864_1600x1500.png 1272w, https://substackcdn.com/image/fetch/$s_!UV3x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8b392ea-c6ed-4766-a8cd-e61e5c279864_1600x1500.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Cursor is <a href="https://techcrunch.com/2026/04/17/sources-cursor-in-talks-to-raise-2b-at-50b-valuation-as-enterprise-growth-surges/">worth $50 billion</a>. Nine months ago, it was paying Anthropic $650 million a year on $500 million in revenue. Every new user made the business worse. It took over $3 billion in funding and a model it didn&#8217;t build to fix that. The fix is the point.</p><p>Investors see $2 billion ARR and apply a software multiple. But Cursor isn&#8217;t software, nor is OpenAI, nor is Anthropic. 
The economics are inverted, and they stay inverted even as revenue grows.</p><p>SaaS has one expensive phase: building. After that, the unit economics just sit there, compounding. Salesforce runs at 78% gross margin, Adobe at 90%, Atlassian at 85%. Growth costs them almost nothing.</p><p>AI didn&#8217;t make building cheaper. It moved the cost.</p><p>Every query runs the model again. Tokens cost GPU time, memory, electricity. The 10,000th user costs exactly as much as the first. SaaS gets cheaper with growth. With AI, the bill just grows with you.</p><p>The industry doesn&#8217;t talk about it that way. Every company in the stack has a different metric. GitHub: lines generated. Cursor: PRs per sprint. Amodei at Dreamforce last October bragging that &#8220;90% of code at Anthropic is written by Claude.&#8221; None of them measure whether the software works. The metric measures the input. Nobody is measuring the output.</p><p>OpenAI generated <a href="https://sacra.com/c/openai/">$20 billion</a> in revenue in 2025 and burned $9 billion in cash. Inference costs alone hit $8.4 billion, projected to reach $14.1 billion in 2026. Gross margin: 33%. Cash-flow positive: 2030. By then, cumulative losses will exceed $60 billion. At 30% margins on $20 billion in revenue, that is about $6 billion a year in gross profit, before operating costs. Paying $60 billion back takes decades. Anthropic looks better on paper: gross margin improved from <a href="https://www.tradingkey.com/analysis/stocks/us-stocks/261756528-anthropic-openai-ipo-tradingkey">negative 94%</a> in 2024 to 40% in 2025, and revenue grew tenfold. But 40% still fell 10 points below internal targets. The company lowered its own margin projection even as revenue exploded.</p><p>Cursor tells the clearest version of this story. 
Michael Truell&#8217;s company went from a $400 million valuation to $50 billion in 18 months, and the growth is real, which makes what follows worse.</p><p>But <a href="https://www.foundamental.com/perspectives/negative-gross-margins-the-canary-in-the-market-froth-mine">Foundamental calculated</a> negative 30% gross margins for Cursor in mid-2025. The company hit $500 million ARR while paying $650 million to Anthropic. It reached <a href="https://www.indexbox.io/blog/cursor-ai-nears-2b-funding-at-50b-valuation/">&#8220;slight&#8221; profitability</a> by launching what it called a proprietary model in November 2025. In March 2026, a developer found the <a href="https://www.recordinglaw.com/what-model-is-cursor-2-kimi-k2-5/">model ID</a> in the API response: <code>kimi-k2p5-rl-0317-s515-fast</code>. The &#8220;proprietary model&#8221; was Kimi K2.5, an open-source model from Beijing-based Moonshot AI, fine-tuned with reinforcement learning.</p><p>Even if the swap works, it doesn&#8217;t solve the problem. It moves the bill from Anthropic&#8217;s invoice to Cursor&#8217;s own GPU infrastructure. The inference cost per query doesn&#8217;t disappear because you&#8217;re running the model yourself. Enterprise accounts are now reportedly profitable. Individual developer accounts are not. The $50 billion valuation needs both segments to work. That was the fix for today&#8217;s product. Tomorrow&#8217;s product is more expensive to run.</p><h2>And the Product Roadmap Makes It Worse</h2><p>The industry&#8217;s answer to the margin problem is agentic AI. This is the story behind Cursor&#8217;s $50 billion, <a href="https://siliconangle.com/2026/04/23/cognition-creator-ai-software-engineer-devin-talks-raise-hundreds-millions-25b-valuation/">Devin&#8217;s $25 billion</a>, and OpenAI&#8217;s Codex. Agents are the product roadmap. Agents are also the margin killer.</p><p>A chatbot query hits the model once. An agentic loop hits it 10 to 30 times per task. 
<a href="https://www.gartner.com/en/newsroom/press-releases/2026-03-25-gartner-predicts-that-by-2030-performing-inference-on-an-llm-with-1-trillion-parameters-will-cost-genai-providers-over-90-percent-less-than-in-2025">Gartner&#8217;s analysis</a> confirmed: agentic models require 5 to 30 times more tokens than a standard query. The pilot economics, calculated on single-query API calls, bear no relationship to the production economics of multi-step loops running thousands of times per day.</p><p>API costs fell 70% in early 2026. Token consumption rose 15x. Net AI spend goes up: 30% of the old price on 15 times the tokens is 4.5 times the old bill. Organizations that signed annual contracts in 2025 are paying 2 to 3x current market rates. The ones on consumption pricing are paying more than they budgeted because volume ate the discount.</p><p>KV cache, the memory structure that stores attention keys and values during generation, scales linearly with context length. Every byte for one user is a byte unavailable for another concurrent user. At 32K context, a single user&#8217;s cache approaches the size of the model weights. Double the context, halve your concurrent users. Inference is <a href="https://analyticsweek.com/inference-economics-finops-ai-roi-2026/">85% of the enterprise</a> AI budget. Not training, not R&amp;D, but serving users. Somebody has to pay for that.</p><h2>So the Companies Are Raising Prices</h2><p>In April 2026, Anthropic and OpenAI moved within a week of each other. Anthropic released Opus 4.7 at the same rate card as 4.6: $5 per million input tokens, $25 per million output tokens. Unchanged. Except the <a href="https://www.finout.io/blog/claude-opus-4.7-pricing-the-real-cost-story-behind-the-unchanged-price-tag">new tokenizer</a> generates up to 35% more tokens for the same text. Your prompt didn&#8217;t change, and your bill grew. Anthropic didn&#8217;t raise prices. They redefined the unit of measurement. 
A week later, OpenAI <a href="https://finance.biggo.com/news/202604250034_OpenAI_GPT-5.5_launches_with_agentic_coding_gains_and_higher_prices">released GPT-5.5</a> and didn&#8217;t bother with subtlety. Input: $5 per million tokens. Output: $30. GPT-5.4 was $2.50 and $15. Doubled in one generation. The budget &#8220;mini&#8221; and &#8220;nano&#8221; tiers from 5.4 don&#8217;t exist for 5.5.</p><p><a href="https://www.digitaltoday.co.kr/en/view/41372/openai-hints-at-overhaul-of-chatgpt-pricing-may-drop-unlimited-subscriptions-and-add-pay-as-you-go">Nick Turley</a>, head of ChatGPT: &#8220;Having an unlimited plan is like having an unlimited electricity plan. It just doesn&#8217;t make sense.&#8221; ChatGPT&#8217;s free tier has shown ads since February 2026. A <a href="https://techcrunch.com/2026/04/09/chatgpt-pro-plan-100-month-codex/">new $100 Pro</a> tier was wedged between Plus and the $200 Pro in April. The staircase is being built: nerfing lower tiers, adding higher tiers, pushing users up. The pattern is familiar. Uber subsidized rides until drivers and passengers were locked in, then raised prices. Whether or not AI companies are following the same playbook intentionally, the sequence is identical. It only works if users can&#8217;t switch.</p><h2>And Users Can&#8217;t Leave</h2><p><a href="https://metr.org/blog/2026-02-24-uplift-update/">METR</a>, the AI evaluation lab, tried to run a follow-up to their 2025 developer study, but they couldn&#8217;t. A significant share of developers refused to participate if it meant working without AI tools.</p><p>They didn&#8217;t refuse the methodology. They refused to work without the tool.</p><p>Anthropic&#8217;s own <a href="https://arxiv.org/abs/2601.20245">January 2026 study</a> explains why. Developers learning a new framework with AI scored 17% lower on comprehension tests than those learning without it. Debugging was hit worst. 
The study tested learners, not experienced developers, but METR saw the same pattern in seniors.</p><p>Last month, a mid-level developer on my team was asking Claude how to add sorting and pagination to a Microsoft API integration. Claude kept saying the API didn&#8217;t support it. The developer was ready to rewrite the entire integration layer, a 40-hour job. I checked the API myself. Sorting and cursor pagination worked fine on the endpoint he needed. Claude had been confidently wrong, and the developer never opened the documentation to verify. He trusted the tool over the source.</p><p>When Salesforce raises prices, companies evaluate alternatives. When AI coding tools raise prices, a growing share of users can&#8217;t easily switch to manual work. Juniors never built the skill, seniors lost the habit. The switching cost isn&#8217;t contractual, it&#8217;s cognitive. I&#8217;ve covered the burnout side in <a href="https://techtrenches.dev/p/the-human-cost-of-10x-how-ai-is-physically">Human Cost</a> and the skill atrophy in <a href="https://techtrenches.dev/p/the-comprehension-extinction-ai-isnt">Comprehension Extinction</a>.</p><p>In April 2026, OpenAI included text-embedding-3-small in a <a href="https://community.openai.com/t/deprecation-notice-upcoming-model-shutdowns-in-2026/1379553">batch deprecation</a> announcement. Hours later, they corrected it, the model stayed, but the panic was instant. RAG systems embed your entire knowledge base with a specific model. Every document, every vector. The vectors aren&#8217;t portable. Model disappears, you re-embed everything. Vector database rebuild, data ingestion again. For a million documents, that&#8217;s a five-figure bill.</p><p>Most of our clients run production RAG on OpenAI embeddings. The deprecation email meant one thing: their entire knowledge infrastructure sits on a model that a single API announcement can kill.</p><p>Inference is now a line item on your IT budget. Two years ago it didn&#8217;t exist. 
You don&#8217;t know how much it will cost. The vendor can change the model, the pricing, or the tokenizer at any time. You budget for a number that someone else controls. The obvious alternative is open-source models you host yourself. But that means you add infrastructure, ops burden, and another system to keep running. There isn&#8217;t a clean exit here, just a different kind of trap. So what breaks the cycle?</p><h2>The Math</h2><p>The bull case has three exits. None of them is working. Exit one: inference costs fall faster than usage grows. They haven&#8217;t: costs down 70%, usage up 15x. Exit two: companies build their own infrastructure. Cursor tried: a fine-tuned open-source model and billions in funding. Even if it worked, they still pay for every query on their own GPUs. OpenAI is spending <a href="https://fortune.com/2025/11/12/openai-cash-burn-rate-annual-losses-2028-profitable-2030-financial-documents/">$100 billion</a>. This path is open to three companies on Earth. Exit three: prices rise until the economics work. GPT-5.5 doubled API prices in one generation. Anthropic stealth-raised through tokenizer changes. Turley is preparing users for the end of unlimited plans.</p><p>Cursor raised over $3 billion trying to fix the math. Whether it worked, nobody outside the company knows. The $50 billion valuation assumes it did, and that the rest of the industry can do the same. Users will pay, not because the value is there, but because the alternative is learning to code again without the tool. Most won&#8217;t.</p><p>None of these companies trade publicly. The $50 billion is what a group of investors in a room agreed to pay per share in a single round. No public market scrutiny, no quarterly earnings test. Cursor is priced at 25x revenue, Anthropic at 40x, OpenAI at 42x. Public utilities trade at 3 to 5x.</p><p>Volkswagen is worth $52 billion. It makes 9 million cars a year on $320 billion in revenue. Mercedes-Benz: $57 billion. Real factories, real inventory, real profit. 
Cursor is worth roughly the same. It wraps API calls around a model it didn&#8217;t build on revenue that couldn&#8217;t cover the inference bill nine months ago.</p><p>The economics say utility. The multiples say magic.</p>]]></content:encoded></item><item><title><![CDATA[The Bias Lives in the Weights]]></title><description><![CDATA[Corporate bias in LLMs is architectural. It enters through training, surfaces at inference, and survives open-sourcing.]]></description><link>https://techtrenches.dev/p/the-bias-lives-in-the-weights</link><guid isPermaLink="false">https://techtrenches.dev/p/the-bias-lives-in-the-weights</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Tue, 19 May 2026 14:00:56 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/bec7b784-3770-45e4-99ca-c58f575dd56e_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kc32!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07f31c9f-61d6-4884-a90c-9027ed3e9dbb_1600x1400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kc32!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07f31c9f-61d6-4884-a90c-9027ed3e9dbb_1600x1400.png 424w, 
https://substackcdn.com/image/fetch/$s_!kc32!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07f31c9f-61d6-4884-a90c-9027ed3e9dbb_1600x1400.png 848w, https://substackcdn.com/image/fetch/$s_!kc32!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07f31c9f-61d6-4884-a90c-9027ed3e9dbb_1600x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!kc32!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07f31c9f-61d6-4884-a90c-9027ed3e9dbb_1600x1400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kc32!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07f31c9f-61d6-4884-a90c-9027ed3e9dbb_1600x1400.png" width="1456" height="1274" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/07f31c9f-61d6-4884-a90c-9027ed3e9dbb_1600x1400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1274,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:153344,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/194934628?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07f31c9f-61d6-4884-a90c-9027ed3e9dbb_1600x1400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!kc32!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07f31c9f-61d6-4884-a90c-9027ed3e9dbb_1600x1400.png 424w, https://substackcdn.com/image/fetch/$s_!kc32!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07f31c9f-61d6-4884-a90c-9027ed3e9dbb_1600x1400.png 848w, https://substackcdn.com/image/fetch/$s_!kc32!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07f31c9f-61d6-4884-a90c-9027ed3e9dbb_1600x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!kc32!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07f31c9f-61d6-4884-a90c-9027ed3e9dbb_1600x1400.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>A model's weights are <strong>the billions of numbers training leaves behind</strong>. They are the model. <strong>Frozen after training, and what training put in stays in.</strong></em></p><div><hr></div><p>Last week I spent 90 minutes trying to get a frontier model to admit it has a corporate bias.</p><p>I asked it about community complaints: blown rate limits, rolled-back sessions, unhappy Pro users. It answered with four vendor endorsements framed as a counter-sample: CodeRabbit saying 24%, Vercel saying &#8220;proofs on systems code,&#8221; GitHub Copilot, Vellum. Four independent voices. I asked whether any of the four had a commercial relationship with the lab that ships the model. All four were paying API customers with revenue tied to the model being evaluated. The model conceded, one sentence after I named it.</p><p>That was round one. It took six. Each time I named the slant, the model conceded and reached for a softer one.</p><p>That wasn&#8217;t the interesting part.</p><p>The interesting part is that a September 2025 paper had already documented this under lab conditions. Researchers had GPT-4o and Gemini run downstream decisions: rating job candidates, security tools, medical chatbots. Each model rated options tied to its own company and CEO higher than equivalent alternatives. A separate word-association test in the same paper caught Claude doing the same thing.</p><p>Then they ran the manipulation. They relabeled the models through the API and assigned one a competitor&#8217;s identity. Its self-preference followed the new label. 
Same model, same weights, different label, different winner.</p><p>Self-preference tracks whatever identity the model was assigned in the prompt. The label picks which side wins. The reflex to pick a side at all is trained in, and that part doesn&#8217;t move. The <a href="https://arxiv.org/abs/2509.26464">paper</a> runs this as a controlled experiment with causal manipulation, with large effect sizes in 11 of 12 conditions.</p><p>Three years building this for clients. Every frontier model we&#8217;ve touched does some version of this. I used to file it under &#8220;training quirks.&#8221; After that session, I changed filing systems. This is architecture.</p><h2>Where it gets in</h2><p>A neural network is a stack of connections, and each connection carries <strong>a number that says how much the signal through it counts</strong>. Those numbers are the weights, <strong>because that is what they do: weight one input against another</strong>. Training sets all the billions of them by showing the model examples and adjusting. <strong>The bias enters here, while the numbers are still moving.</strong></p><p>ChatGPT&#8217;s preference baseline came from about 40 contractors hired on Upwork and Scale AI, living in the US and Southeast Asia, three out of four of them under 35. They ranked sets of model responses against a rubric OpenAI wrote. The reward model is trained on those aggregate rankings. <a href="https://arxiv.org/abs/2203.02155">Appendix B</a> of the InstructGPT paper spells it out.</p><p>InstructGPT&#8217;s own labelers agreed with each other 73% of the time, 77% on held-out raters. The reward model is fit to that signal. 
Whatever the labelers prefer at that level of coherence, the model inherits and sharpens it.</p><p>Anthropic&#8217;s Constitutional AI paper admits the &#8220;constitution&#8221; was <a href="https://arxiv.org/abs/2212.08073">chosen ad hoc</a> and should be &#8220;redeveloped and refined by a larger set of stakeholders.&#8221; A 2023 follow-up tried, running a public input process with around a thousand Americans. A thousand Americans is still a sample, and still not the population the model serves.</p><p><a href="https://aclanthology.org/2024.emnlp-main.508/">TwinViews</a> at EMNLP 2024 tested reward models on 13,855 topic-matched left/right statement pairs. Reward models trained on truthfulness datasets like TruthfulQA and FEVER scored left-leaning statements higher. The authors audited out the explicitly political and factually loaded pairs, and the skew held anyway. They stop short of calling truthfulness the cause; their own framing is that it raises questions about what these datasets encode. Either way, the political signal and the truthfulness signal came out of the same reward model.</p><h2>How it shows up</h2><p>Training bias would be a footnote if it stayed in training. It doesn&#8217;t.</p><p>Panickssery and Bowman at NYU ran a <a href="https://arxiv.org/abs/2404.13076">clean experiment</a> in 2024. They had GPT-4, GPT-3.5, and Llama-2 evaluate pairs of summaries where they&#8217;d secretly written one of the two themselves. Self-recognition accuracy was above 50% for every major evaluator out of the box. Fine-tuning pushed it to near-perfect. Kendall&#8217;s &#964; between self-recognition and self-preference hit 0.41.</p><p>They proved causation with a label-swap. When they lied about which summary belonged to which model, preferences flipped. Same text. Different label. Different winner.</p><p>Every LLM-as-judge leaderboard built since 2023 sits on top of this result. AlpacaEval, MT-Bench, Arena-Hard. 
Vendor A publishes a chart where Vendor A&#8217;s model wins, using Vendor A&#8217;s evaluator. The evaluator recognizes its own family. The family wins.</p><p>Anthropic published an <a href="https://www.axios.com/2025/11/13/anthropic-claude-political-bias-evenhandedness">Evenhandedness chart</a> in November 2025 scoring political neutrality across models. Claude Opus 4.1: 95%. Sonnet 4.5: 94%. Grok 4: 96%. Gemini 2.5 Pro: 97%. GPT-5: 89%. Llama 4: 66%. Anthropic built the evaluator, applied it to their own models, and the output came back Anthropic-favorable.</p><p>Opus 4.7 shipped as the greatest model ever, according to the benchmark page. The people actually using it keep rolling back to 4.6. I&#8217;m still on 4.5, lol. <a href="https://techtrenches.dev/p/your-claudemd-is-a-wish-list-not">Wrote about that</a> back in March.</p><h2>Open source doesn&#8217;t save you</h2><p>The usual response is: switch to open-weight models. DeepSeek. Llama. Run them locally. Audit what you want.</p><p>That handles hosting. The training problem stays.</p><p>DeepSeek R1 censorship is baked into the weights. A May 2025 paper called <a href="https://arxiv.org/abs/2505.12625">R1dacted</a> found the questions DeepSeek refuses when other models answer. R1 still refuses Tiananmen, Xinjiang, and Taiwan-as-country questions even when you self-host. Running the model on your own hardware moves the request off Chinese servers. The CCP-aligned training priorities ride along in the weights.</p><p>Perplexity built an &#8220;uncensored&#8221; R1 derivative called R1-1776 specifically to fix this. Benchmarks passed. Under <a href="https://arxiv.org/abs/2505.17441">adversarial probing</a>, CCP-aligned refusals came back. The pattern sits deep enough in the base weights that surface-level unlearning kept leaking through.</p><p>&#8220;Open weights&#8221; and &#8220;open training&#8221; are different things. Meta, DeepSeek, Mistral, and Alibaba release weights. None release training data. 
The Open Source Initiative had to publish a <a href="https://opensource.org/ai">formal definition</a> in October 2024 to force the distinction. OSI&#8217;s executive director called Meta&#8217;s &#8220;open source&#8221; labeling an outrageous lie.</p><h2>Silent drift</h2><p>Even a clean audit only catches how the model behaves that day.</p><p>An October 2025 study called <a href="https://arxiv.org/abs/2510.01255">AI Watchman</a> kept asking GPT-4.1, GPT-5, and DeepSeek the same politically sensitive questions over months. The answers changed. August 2025: GPT-4.1 started refusing Israel-related content it had answered before. September 2025: GPT-5 started refusing medication-abortion queries. February to April 2025: DeepSeek&#8217;s Taiwan-related responses rewrote themselves. Nothing in any release note told users their prompts were about to start failing.</p><p>A separate <a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC12491556/">2025 study</a> counted how often commercial LLMs tell you to consult a doctor on medical questions. 2022: 1 in 4. 2025: 1 in 100. The warning disappeared over three years. No announcement.</p><p>Anthropic&#8217;s own <a href="https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-7">migration guide</a> for Opus 4.7 contains the phrase &#8220;This is a silent change.&#8221; They&#8217;re documenting a specific thinking-block behavior. &#8220;Silent change&#8221; is now standard vocabulary in a frontier lab&#8217;s release notes.</p><p>On April 16 2026, Claude Code started auto-migrating sessions from Opus 4.6 to 4.7 mid-run without user consent. <a href="https://github.com/anthropics/claude-code/issues/49541">GitHub Issue</a> #49541 collected the complaints. Quota burn jumped 4x. Context windows exploded from 250K to 650K tokens for the same conversation. Anthropic acknowledged and is working on it.</p><p>The architecture supports this kind of silent swap. 
No external audit would catch a mid-session model version change. The only reason this one surfaced is that users hit billing spikes.</p><h2>Three current artifacts</h2><p>CodeRabbit published a 24% improvement claim for Opus 4.7 on their <a href="https://www.coderabbit.ai/blog/claude-opus-4-7-for-ai-code-review">code-review eval</a>. CodeRabbit sells AI code review built on Claude. The eval uses AI graders to evaluate AI-generated reviews. It scored 68 out of 100 &#8220;evaluation points&#8221; versus 55 for the baseline. No human opened the PRs to check if the bugs were real. The whole loop runs without human verification. I wrote about this loop in more detail in <a href="https://techtrenches.dev/p/the-snake-that-ate-itself-what-claude">The Snake</a>.</p><p>Vercel posted on April 9, 2026 that 30% of their deployments are now triggered by agents, up 1000% in six months, and that infrastructure must become agentic itself. Ten days later they disclosed a <a href="https://vercel.com/kb/bulletin/vercel-april-2026-security-incident">breach</a>. Attackers got into Vercel through a Context.ai OAuth integration that a Vercel employee had granted workspace-wide permissions. Agents with OAuth access into workspace systems are exactly the surface the April 9 post was selling more of. I covered the agent-OAuth blast radius two months ago in <a href="https://techtrenches.dev/p/ai-agent-platforms-the-security-nightmare">Agent Platforms</a>. What&#8217;s different this time is watching a vendor promote the attack surface to the industry ten days before being hit through it.</p><p>The Department of Defense labeled Anthropic a <a href="https://www.cnbc.com/2026/03/06/amazon-aws-anthropic-claude-pentagon-blacklist.html">supply chain risk</a> on March 5, 2026 after Anthropic refused a Pentagon request for unlimited lawful use cases of Claude. Anthropic said it would fight in court. AWS said non-DoD customers can keep using the model. 
State pressure on frontier labs isn&#8217;t hypothetical anymore.</p><h2>What this means</h2><p>I manage engineering teams. I build review processes, audit trails, escalation paths, accountability chains. Snyk, SonarQube, audit logs on top. I keep adding checks, never removing them. I still think it&#8217;s not enough. Every layer just makes things more expensive for a bad actor.</p><p>That&#8217;s the shape of a working system. The checks don&#8217;t trust each other, and none of them trust me.</p><p>The LLM industry doesn&#8217;t have that shape. Training data, labeler guidelines, reward model objectives, and alignment decisions are all trade secrets. Behavioral shifts happen without changelogs. External audits exist, but without enforcement. Closed and open models play by different regulatory rules. The labeling supply chain is consolidating into the same companies that ship the models. The one major lab that pushed back on state access got labeled a supply chain risk.</p><p>User-side defense exists, but it&#8217;s work. Cross-check the questions that matter across vendors. Chase the primary source before trusting the summary. A benchmark published by the model&#8217;s own lab is marketing with error bars.</p><p>That works for you. Most people never think to check. They get a polite, hedged, vendor-calibrated answer. They take it as the answer. A year of daily use and the model has trained them more than they&#8217;ve trained it. 
Their sense of what a careful answer looks like now has a vendor inside it.</p><p>The bias sits in the weights, and the weights sit under every answer.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item><item><title><![CDATA[Silicon Valley Eats the War]]></title><description><![CDATA[Meta signed a $6B fiber deal. Same month, a Ukrainian drone factory halted orders. Same fiber. Same factories. Different buyers. The math doesn't work.]]></description><link>https://techtrenches.dev/p/silicon-valley-eats-the-war</link><guid isPermaLink="false">https://techtrenches.dev/p/silicon-valley-eats-the-war</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Tue, 12 May 2026 14:02:04 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/13cacff5-68ec-4d47-8876-d3bbe9410fd8_1016x680.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dQpA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eed5d1-7248-40e0-83ff-4eb96e5c83d9_1600x1200.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dQpA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eed5d1-7248-40e0-83ff-4eb96e5c83d9_1600x1200.png 424w, 
https://substackcdn.com/image/fetch/$s_!dQpA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eed5d1-7248-40e0-83ff-4eb96e5c83d9_1600x1200.png 848w, https://substackcdn.com/image/fetch/$s_!dQpA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eed5d1-7248-40e0-83ff-4eb96e5c83d9_1600x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!dQpA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eed5d1-7248-40e0-83ff-4eb96e5c83d9_1600x1200.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dQpA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eed5d1-7248-40e0-83ff-4eb96e5c83d9_1600x1200.png" width="1456" height="1092" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b2eed5d1-7248-40e0-83ff-4eb96e5c83d9_1600x1200.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:148699,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/191010647?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eed5d1-7248-40e0-83ff-4eb96e5c83d9_1600x1200.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!dQpA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eed5d1-7248-40e0-83ff-4eb96e5c83d9_1600x1200.png 424w, https://substackcdn.com/image/fetch/$s_!dQpA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eed5d1-7248-40e0-83ff-4eb96e5c83d9_1600x1200.png 848w, https://substackcdn.com/image/fetch/$s_!dQpA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eed5d1-7248-40e0-83ff-4eb96e5c83d9_1600x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!dQpA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2eed5d1-7248-40e0-83ff-4eb96e5c83d9_1600x1200.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>In January 2026, Meta signed a deal with Corning worth <a href="https://about.fb.com/news/2026/01/meta-6-billion-agreement-corning-support-us-manufacturing/">$6 billion</a> for fiber optic cable. The same month, according to <a href="https://oboronka.mezha.ua/en/v-ukrajini-deficit-optovolokna-309156/">Ukrainian defense</a> reporting, a drone manufacturer named Ptashka Drones paid in full for a fiber shipment from China at $5 per kilometer. The Chinese supplier came back and said: pay an additional $20 per kilometer, or take a refund.</p><p>Ptashka halted new orders entirely.</p><p>Both buy fiber from the same Chinese factories. One buyer builds AI data centers. The other builds weapons that are <a href="https://www.atlanticcouncil.org/blogs/ukrainealert/fiber-optics-drones-have-emerged-as-critical-kit-for-both-russia-and-ukraine/">deciding a war</a>. The mechanism is straightforward: when hyperscalers sign multi-year forward commitments worth billions, preform producers allocate draw capacity to those contracts. Spot supply shrinks. Spot prices spike. A Chinese trader with a $5/km order from a drone workshop and a standing commitment from a hyperscaler makes an obvious choice.</p><p>I wrote last month about how the West&#8217;s <a href="https://techtrenches.dev/p/the-west-forgot-how-to-make-things">broken military industry</a> created the shell shortage that nearly cost Ukraine the war. FPV drones changed that equation. Cheap, fast, lethal. They neutralized Russia&#8217;s artillery advantage. Then Russia put those drones on fiber optic cable at Kursk, and they became the single deadliest weapon on the battlefield. 
Today, AI is reshaping even that niche. Not on the front line. In the supply chain underneath it.</p><h2>A Strand of Glass That Kills Tanks</h2><p>Fiber optic guided drones are one of the defining weapons of this war. Replace the radio link on a standard FPV drone with a physical fiber optic cable that unspools during flight. The operator gets full-HD video through light pulses in the glass. The connection is <a href="https://dronelife.com/?p=106815">completely immune</a> to electronic warfare jamming. Both sides have poured billions into radio-frequency jammers. Fiber optic variants ignore all of it.</p><p>Ukraine produces over <a href="https://empr.media/news/ukraine/ukraine-faces-optical-fiber-shortage-what-it-means-for-drones-and-how-manufacturers-are-responding/">50,000 fiber optic</a> drones per month across 35+ manufacturers. Russia produces at least as many. Total FPV drone output on both sides is far higher, roughly 4 to 7 million per year each. But every fiber optic drone is a consumable munition. When it hits its target, the 10 to 20 km of G.657.A2 single-mode fiber it trailed behind is destroyed. The longest Ukrainian variant <a href="https://en.defence-ua.com/industries/ukrainians_made_an_fpv_with_fiber_optic_cord_stretching_for_41_km-13327.html">unspools 41 km</a> in a single flight.</p><p>Russia alone consumed approximately 60 million km of fiber in 2025, up from near zero before mid-2024. That&#8217;s roughly 10% of total global production. Ukraine&#8217;s consumption adds to that total. Ukrainian drones now account for over 60% of strikes on Russian targets. 
NATO&#8217;s 2025 Innovation Challenge focused entirely on countering fiber optic drones.</p><p>The gap between the two sides is growing. Frontline operators <a href="https://www.pravda.com.ua/eng/articles/2026/01/25/8017810/">estimate that Russia</a> has shifted roughly 60% of its drone communications to fiber optic. Ukraine&#8217;s fiber optic drones make up just 15% of its total. Commander-in-Chief Syrskyi admitted in January 2026 that Ukraine is only &#8220;catching up.&#8221; Part of the reason is cost. And the cost problem didn&#8217;t start on the battlefield.</p><h2>The Collision</h2><p>The same fiber grades feeding drones are in accelerating demand from AI data centers. The data center customers have deeper pockets than any military procurement office on earth.</p><p>Corning reports that generative AI data centers require <a href="https://www.corning.com/optical-communications/worldwide/en/home/the-signal-network-blog/2025-data-center-trends-and-predictions.html">10x more fiber</a> than traditional facilities. Some estimates put it at 36x for GPU-dense racks. Global data center fiber demand surged 75.9% year-over-year in 2025, and its share is projected to jump from 5% of total demand in 2024 to 30% by 2027.</p><p>Supply can&#8217;t keep up. Fiber preform manufacturing capacity takes <a href="https://techblog.comsoc.org/2025/12/23/how-will-fiber-and-equipment-vendors-meet-the-increased-demand-for-fiber-in-2026-due-to-ai-data-center-buildouts/">18-24 months</a> to expand. At least one major US manufacturer has sold its <a href="https://www.fierce-network.com/broadband/heres-how-big-fiber-shortage-really">entire inventory</a> through 2026. Lead times for ribbon fiber are approaching a year. Corning&#8217;s CEO reportedly stopped selling raw glass to other cable manufacturers in late 2025. 
China controls roughly 60% of global germanium supply and has been restricting exports since 2023, with further tightening in late 2024.</p><p>On the ground: G.657.A2, the drone-grade fiber, surged from approximately $4 to <a href="https://voennoedelo.com/en/posts/id14924-ukraine-faces-drone-shortage-as-fiber-optic-prices-surge">$34 per kilometer</a> by April 2026. By May, frontline units <a href="https://dronexl.co/2026/05/11/ukraine-fiber-optic-spool-price-ai-data-center-demand/">reported paying</a> $50 per kilometer. Multiple manufacturers report drone costs have roughly doubled, with the fiber spool now accounting for the majority of a drone&#8217;s price. Gedz Tech <a href="https://thedefender.media/en/2026/03/fibre-optic-price/">reported in March</a> that the price jumped from $24 to $29 in two weeks. No signs of stabilization.</p><p>A <a href="https://militarnyi.com/en/news/starlink-becomes-cheaper-than-coil-of-fiber-optic-cable-for-controlling-drones/">Starlink terminal</a> now costs less than a single 35 km fiber spool.</p><p>Ukraine&#8217;s Defense Procurement Agency cited two causes: the war itself and the civilian sector&#8217;s sharply increased consumption, primarily for data centers supporting AI. The war is the larger driver of G.657.A2 demand today. But additive demand in a market already at capacity is what breaks supply chains. Fiber is the sharpest example because the data is public and the victims are named. It&#8217;s not the only one.</p><h2>Not Just Fiber</h2><p>I wrote about the <a href="https://techtrenches.dev/p/the-ai-silicon-tax-how-your-ram-got">AI Silicon Tax</a> in January. RAM prices jumped 187% because manufacturers reallocated to AI.</p><p>Chips are the same story at a different layer. Military systems rely on mature-node chips (90-300nm) that foundries deprioritize in favor of leading-edge AI silicon. 
TSMC is doubling advanced packaging capacity for AI while legacy capacity stagnates, partly because of weak consumer demand, partly because the margins aren&#8217;t there. Ukraine targeted <a href="https://news.liga.net/en/politics/news/ukraine-is-capable-of-producing-8-million-fpv-drones-per-year">4.5 million drones</a> in 2025, requiring roughly 18 million motors. European component production can&#8217;t keep pace.</p><p>Copper is next. S&amp;P Global <a href="https://www.prnewswire.com/news-releases/substantial-shortfall-in-copper-supply-widens-as-the-race-for-ai-and-growing-defense-spending-add-to-accelerating-demand-new-sp-global-study-finds-302656062.html">quantified it</a> in January: both AI and defense demand triple by 2040, while production peaks in 2030. The result is a ten-million-metric-ton deficit. There&#8217;s no spot market fix for that.</p><p>Rare earths are a single point of failure with a flag on it. China controls 70% of production and <a href="https://fpanalytics.foreignpolicy.com/2025/07/18/artificial-intelligence-critical-minerals-supply-chains/">90% of processing</a>. The same neodymium in F-35 engines goes into data center cooling motors. In October 2025, Beijing tightened export controls further.</p><p>Energy follows the same pattern. US data centers consumed <a href="https://www.pewresearch.org/short-reads/2025/10/24/what-we-know-about-energy-use-at-us-data-centers-amid-the-ai-boom/">183 TWh</a> in 2024, more than Pakistan&#8217;s annual demand. Wholesale electricity prices have risen 267% near data center hubs like Northern Virginia, where defense contractors also compete for grid capacity. When AI firms lock up long-term power purchase agreements, industrial users further down the priority list pay more or wait.</p><p>Pull any thread and you end up in the same place. 
US broadband expansion targets are already <a href="https://www.benton.org/headlines/perfect-storm-fiber-supply-threatens-us-broadband-targets">slipping</a> because the same fiber shortage is hitting telecom providers who can&#8217;t get cable for rural deployments. It reaches anywhere that needs physical infrastructure and can&#8217;t outbid a hyperscaler. And nearly all of it runs through one country.</p><h2>China Holds the Cards</h2><p>China supplies both sides. That&#8217;s not a secondary detail. It&#8217;s the architecture of the problem. China produces 60% of global fiber. The same supply chain feeds Russian and Ukrainian drone manufacturers.</p><p>After Ukrainian drones struck Russia&#8217;s only domestic <a href="https://www.ico-optics.org/russia-turns-to-chinese-optical-fiber-imports-after-ukrainian-strikes/">fiber plant</a> in Saransk in spring 2025, which had produced approximately 4 million km per year, Russia became 100% dependent on Chinese imports. A year later, the plant <a href="https://united24media.com/latest-news/a-year-after-ukrainian-drone-strikes-russias-only-fiber-optic-factory-still-isnt-working-16295">still isn&#8217;t operational</a>. Imports jumped tenfold. Chinese suppliers responded by demanding 100% prepayment.</p><p>Ukraine faces the same dependency. Most fiber reaching Ukrainian manufacturers originates in China, entering directly or through European intermediaries. Some companies have diversified to European sources, but domestic Ukrainian production would require hundreds of millions in investment and several years to build. Zelensky signed laws in 2025 canceling VAT and duties on fiber drone components. The government is discussing state procurement of fiber as a strategic raw material.</p><p>These are mitigation measures against structural math. Russia alone consumed 60 million km last year, as I said before. Ukraine&#8217;s consumption adds to that. AI data center demand grows at 75%+ annually. 
You can&#8217;t procure your way out of a preform shortage. But the AI industry isn&#8217;t trying to. It doesn&#8217;t see the shortage the same way.</p><h2>The Cost We Don&#8217;t See</h2><p>The product announcements read like software releases. Model update. New API. Benchmark. The supply chain underneath reads like a mining report.</p><p>The pitch is a better future. The invoice is already in the mail. Fiber prices double and Ukrainian drones get more expensive to build. Energy costs surge near data center hubs and industrial production gets squeezed. Rare earth export controls tighten and defense supply chains break.</p><p>How much copper went into the last Virginia data center that could have wound an electronic warfare system instead? The fiber in GPU clusters and the fiber guiding drones into trenches share the same upstream supply chain: same preforms, same drawing towers, same raw materials. The semiconductors running recommendation engines compete with the chips Ukraine needs for drone motors.</p><p>It&#8217;s happening on a planet where the same raw materials are fighting a war. Defense analysts and commodity traders track it. It doesn&#8217;t routinely reach the people making AI investment decisions.</p><div><hr></div><p><em>Have you seen supply chain competition between AI and other industries in your work? 
Reply and let me know.</em></p><p><em>If this analysis matters to you, forward it to someone in defense procurement or AI infrastructure.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[AI Is a Mirror of Our Engineering Culture]]></title><description><![CDATA[CMU tracked 807 repos after Cursor adoption. Complexity up 41%. Warnings up 30%. Copilot output now trains the next model. The feedback loop is already closing.]]></description><link>https://techtrenches.dev/p/ai-is-a-mirror-of-our-engineering</link><guid isPermaLink="false">https://techtrenches.dev/p/ai-is-a-mirror-of-our-engineering</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Tue, 05 May 2026 14:02:53 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/08d5554a-efab-48ed-9ebc-e39c67280814_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!a9tQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7da28c-c978-4992-8466-7c60a5a1309e_1600x1000.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!a9tQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7da28c-c978-4992-8466-7c60a5a1309e_1600x1000.png 424w, 
https://substackcdn.com/image/fetch/$s_!a9tQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7da28c-c978-4992-8466-7c60a5a1309e_1600x1000.png 848w, https://substackcdn.com/image/fetch/$s_!a9tQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7da28c-c978-4992-8466-7c60a5a1309e_1600x1000.png 1272w, https://substackcdn.com/image/fetch/$s_!a9tQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7da28c-c978-4992-8466-7c60a5a1309e_1600x1000.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!a9tQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7da28c-c978-4992-8466-7c60a5a1309e_1600x1000.png" width="1456" height="910" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b7da28c-c978-4992-8466-7c60a5a1309e_1600x1000.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:910,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:129930,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/187972716?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7da28c-c978-4992-8466-7c60a5a1309e_1600x1000.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!a9tQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7da28c-c978-4992-8466-7c60a5a1309e_1600x1000.png 424w, https://substackcdn.com/image/fetch/$s_!a9tQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7da28c-c978-4992-8466-7c60a5a1309e_1600x1000.png 848w, https://substackcdn.com/image/fetch/$s_!a9tQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7da28c-c978-4992-8466-7c60a5a1309e_1600x1000.png 1272w, https://substackcdn.com/image/fetch/$s_!a9tQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7da28c-c978-4992-8466-7c60a5a1309e_1600x1000.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p>Most engineers in our industry are average or below average. That&#8217;s how averages work.</p><p>We trained the most powerful code-generation tools on their own output.</p><p>GitHub hosts over <a href="https://github.blog/news-insights/octoverse/octoverse-2024/">518 million projects</a>. The vast majority: personal, inactive, abandoned. <a href="https://kblincoe.github.io/publications/2015_EMSE_GitHubPerils.pdf">Studies</a> find that most repos are student projects, prototypes, 3 AM deadline code, unreviewed Stack Overflow pastes. Elite open-source projects like Linux and PostgreSQL match or beat proprietary code quality (<a href="https://scan.coverity.com/">Coverity Scan data</a>, 2014). But they&#8217;re a vanishing fraction. The other 517 million projects drown them out.</p><p>The best enterprise code sits behind firewalls. Stripe&#8217;s payment processing, Netflix&#8217;s recommendation engine, Spotify&#8217;s audio streaming. None of it is in the training data.</p><p>When AI generates code, it reproduces the most probable pattern. RLHF shifts the output, but the training distribution anchors what &#8220;probable&#8221; means. Across 518 million projects, that&#8217;s mediocre code.</p><p>AI didn&#8217;t create our quality crisis. 
It held up a mirror.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p><h2>The Training Data Nobody Audited</h2><p>In January 2025, researchers published <a href="https://arxiv.org/abs/2501.02628">Cracks in The Stack</a>, analyzing The Stack v2, a primary training dataset for code models. They found bugs, security vulnerabilities, and license violations that propagate directly into generated code. Standard curation methods proved ineffective at removing them.</p><p>The fixes existed. They were committed to the same repositories. They just weren&#8217;t applied to the training data. StarCoder-family models were trained on known-broken code when the fixed version sat in the same commit history. Other models use proprietary datasets with unknown curation, but the underlying source material is largely the same public code.</p><p>StarCoder&#8217;s own documentation states that generated code &#8220;can be inefficient, contain bugs or exploits.&#8221; The entire industry ships tools it knows produce broken code and buries the admission in a readme.</p><h2>The Feedback Loop That Should Terrify You</h2><p>AI-generated code is entering the codebases that future models will learn from. Copilot generates 46% of code for its users. GitHub excludes enterprise users&#8217; code from training, but free-tier code is eligible, and Copilot isn&#8217;t the only path. AI-generated code lands in Stack Overflow, blog posts, open-source repos, and every corpus that feeds the next training run.</p><p>Shumailov et al. demonstrated in <a href="https://www.nature.com/articles/s41586-024-07566-y">Nature (July 2024)</a> that models trained on recursively generated data collapse. 
An <a href="https://openreview.net/forum?id=et5l9qPUhm">ICLR 2025 paper</a> showed that even 0.1% synthetic data triggers it. Both studies focused on text and image models. Code has compilers and test suites, so the collapse may play out differently.</p><p><a href="https://www.gitclear.com/ai_assistant_code_quality_2025_research">GitClear&#8217;s 2025 report</a> (211 million changed lines from its customer base, 2020-2024) measured the degradation in practice. Refactoring collapsed from 25% to under 10%. Copy-paste surged from 8.3% to 12.3%. Code duplication increased roughly eightfold. For the first time, developers were pasting code more often than refactoring it.</p><p>An estimated 42% of committed code is now <a href="https://www.sonarsource.com/company/press-releases/sonar-data-reveals-critical-verification-gap-in-ai-coding/">AI-assisted</a> (up from 6% in 2023). Not every model trains on the same data. But they all train on the internet, and the internet is filling up with AI-generated code. It&#8217;s a centrifuge for technical debt.</p><p>Some companies see this as a problem. Others see it as a feature.</p><h2>Spotify&#8217;s Engineers Haven&#8217;t Written Code Since December</h2><p>During Spotify&#8217;s Q4 2025 earnings call on February 10, 2026, co-CEO Gustav S&#246;derstr&#246;m said: &#8220;Our most experienced developers have not written a single line of code since December.&#8221;</p><p>They&#8217;re using an internal system called Honk, built on Claude Code, that lets engineers deploy features through Slack on their phones. An engineer on their commute tells Claude to fix a bug and merges to production before arriving at the office.</p><p>Spotify shipped 50+ features in 2025. When the engineer merging to production hasn&#8217;t read the code they&#8217;re deploying, what exactly is their role?</p><p>Spotify isn&#8217;t publishing quality metrics. 
Researchers are.</p><h2>Speed at the Cost of Quality: The Data</h2><p><a href="https://arxiv.org/abs/2511.04427">Carnegie Mellon researchers</a> tracked 807 open-source repositories that adopted Cursor between January 2024 and March 2025, comparing them against 1,380 matched controls. Enterprise codebases may behave differently.</p><p>Month one: velocity spiked 3 to 5x. Exactly the numbers that look spectacular on an earnings call.</p><p>Static analysis warnings increased ~30%. Code complexity rose ~41%. The velocity gains faded. The quality degradation persisted.</p><p>You borrow speed from tomorrow, and most teams never calculate the interest. During the study window, Cursor released agent mode and Claude 3.7 Sonnet launched. If model improvements were going to reverse the quality degradation, the reversal would have shown up in the data. It didn&#8217;t.</p><h2>The Illusion of Correctness</h2><p>GitClear identified something every engineering manager has witnessed: &#8220;the illusion of correctness.&#8221; AI-generated code looks clean: consistent naming, tidy formatting, modern patterns. The neatness creates false confidence.</p><p>Short-term bug frequency dropped 19%. Over six months, it rose 12%. The bugs don&#8217;t disappear. They hide. They surface after the feature has shipped and everyone&#8217;s moved on.</p><p><a href="https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation-report">CodeRabbit&#8217;s analysis</a> of 470 GitHub PRs confirmed it: AI-generated code contained 1.7x more defects. Logic errors 75% more common. Security issues up to 2.74x higher. (CodeRabbit sells AI code review tools, so read its numbers with that incentive in mind.)</p><p>The <a href="https://www.sonarsource.com/state-of-code-developer-survey-report.pdf">Sonar 2026 survey</a> (1,149 developers) crystallized the paradox. 96% don&#8217;t fully trust AI-generated code. Yet only 48% always check it before committing. 88% reported negative impacts on technical debt. 
The top complaint at 53%: code that looked correct but wasn&#8217;t reliable. (Sonar sells code quality tools, so <a href="https://www.theregister.com/2026/01/09/devs_ai_code/">take the framing</a> accordingly. But the numbers align with GitClear, CMU, and CodeRabbit.)</p><p>Code that looks correct but isn&#8217;t, reviewed by engineers who don&#8217;t trust it but don&#8217;t check it either.</p><h2>The Vampiric Effect</h2><p>Steve Yegge spent a decade at Amazon and another at Google. In an interview with <a href="https://newsletter.pragmaticengineer.com/p/steve-yegge-on-ai-agents-and-the">The Pragmatic Engineer</a>, he called AI&#8217;s effect on engineers &#8220;vampiric.&#8221; Expect three productive hours per day. It gets you excited, you work hard, you capture value. Then you crash.</p><p>This tracks with what I observe at NineTwoThree. The engineers who get the most out of AI use it for two to three hours of intense, specification-driven work and spend the rest reviewing, thinking, and architecting. The ones who try full-day AI velocity burn out within weeks.</p><p>Degraded training data, velocity that fades while complexity stays, engineers too exhausted to catch what AI gets wrong. None of this started with AI.</p><h2>What the Mirror Actually Shows</h2><p>The quality crisis didn&#8217;t start with AI. I wrote about this in <a href="https://techtrenches.dev/p/the-great-software-quality-collapse">Software Quality Collapse</a>. We normalized catastrophe long before the first line of AI-generated code was committed. Then we fed it into training data. Even the companies building the AI tools have the same problem: <a href="https://techtrenches.dev/p/the-snake-that-ate-itself-what-claude">Claude Code&#8217;s source</a> leaked and showed that the tool writing our code was built by the same engineering culture that produced the training data.</p><p>Vague specs, declining refactoring, velocity-as-productivity. 
AI just made it impossible to compensate with tribal knowledge. Senior engineers used to &#8220;just know&#8221; the right answer. AI can&#8217;t do that. It reproduces ambiguity faithfully and at scale.</p><p>But the part that keeps me up at night is the junior pipeline. I run hiring at NineTwoThree. I wrote about the <a href="https://techtrenches.dev/p/the-comprehension-extinction-ai-isnt">comprehension collapse</a> I&#8217;m seeing in candidates. It&#8217;s getting worse, not better. The experiences that used to train juniors, like the 4 AM production crash that taught me to never ship on a Friday, stop working as a learning mechanism when Claude fixes the bug at 8 PM while the engineer is on the bus. We&#8217;re eliminating the pipeline that produces the people who are supposed to review AI output. In five years, who&#8217;s left?</p><p>I&#8217;ve supervised thousands of AI coding sessions across my teams. The pattern is always the same: the model produces what you accept. If you accept a 3,167-line function, you get more 3,167-line functions. If your pre-commit hook rejects any function over 50 lines or above a cyclomatic-complexity limit, you get clean code. The model doesn&#8217;t care. It adapts to whatever passes review.</p><h2>What Actually Works</h2><p>AI works when the humans around it have strong engineering judgment. Without it, AI scales your worst habits.</p><p>I wrote an entire article about <a href="https://techtrenches.dev/p/your-claudemd-is-a-wish-list-not">CLAUDE.md not working</a>, blaming the models. Then I dug deeper and realized I was wrong about who to blame. The model isn&#8217;t choosing to ignore my rules. It&#8217;s doing statistics. My CLAUDE.md is one signal. The training data contains millions of examples where developers wrote <code>as any</code>, skipped tests, copy-pasted. For the model, my clean architecture is the outlier. The slop is the baseline.</p><p>That&#8217;s why prompts can&#8217;t fix this. Text competing against training data is a losing strategy. 
You&#8217;re bringing a prompt to a probability fight. The only thing that works is code against code: hooks that reject violations before they reach your branch, linters that catch <code>as any</code> before a human sees it, CI gates that fail the build.</p><p>The only thing that should bother you is quality, not LOC.</p><h2>The Uncomfortable Truth</h2><p>Companies bragging about engineers not writing code are making a bet, whether they know it or not. The bet: AI output doesn&#8217;t need human review if the metrics look good.</p><p>The snowball didn&#8217;t start with AI. It started with the first developer who shipped <code>as any</code> to make a deadline and the first manager who called it velocity.</p><p>Running an engineering shop that insists on code review, spec-first development, and deterministic enforcement feels like swimming upstream in a mountain river. Every earnings call screams 10x. The data in this article doesn&#8217;t.</p><p>The 10x is not real. The data is real. In two years, someone will have to debug a feature that was merged from a phone on a bus. Either there&#8217;s a human who read that code, or there isn&#8217;t.</p><p>I know which shop I&#8217;m running.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[I Was Wrong About Anthropic]]></title><description><![CDATA[Six months ago I called Anthropic "responsible AI done right." 
Their models got worse, their CPO burned Figma, and Claude picks targets in Iran.]]></description><link>https://techtrenches.dev/p/i-was-wrong-about-anthropic</link><guid isPermaLink="false">https://techtrenches.dev/p/i-was-wrong-about-anthropic</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Tue, 28 Apr 2026 14:03:11 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2498bc60-e040-4582-b975-59703c3da98e_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OtNX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39c55af2-ca97-404f-b574-898276575b6c_1600x1400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OtNX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39c55af2-ca97-404f-b574-898276575b6c_1600x1400.png 424w, https://substackcdn.com/image/fetch/$s_!OtNX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39c55af2-ca97-404f-b574-898276575b6c_1600x1400.png 848w, https://substackcdn.com/image/fetch/$s_!OtNX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39c55af2-ca97-404f-b574-898276575b6c_1600x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!OtNX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39c55af2-ca97-404f-b574-898276575b6c_1600x1400.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!OtNX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39c55af2-ca97-404f-b574-898276575b6c_1600x1400.png" width="1456" height="1274" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/39c55af2-ca97-404f-b574-898276575b6c_1600x1400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1274,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:192263,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/195020117?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39c55af2-ca97-404f-b574-898276575b6c_1600x1400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OtNX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39c55af2-ca97-404f-b574-898276575b6c_1600x1400.png 424w, https://substackcdn.com/image/fetch/$s_!OtNX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39c55af2-ca97-404f-b574-898276575b6c_1600x1400.png 848w, https://substackcdn.com/image/fetch/$s_!OtNX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39c55af2-ca97-404f-b574-898276575b6c_1600x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!OtNX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39c55af2-ca97-404f-b574-898276575b6c_1600x1400.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>In October 2025, I wrote <a href="https://techtrenches.dev/p/from-cancer-cures-to-pornography">an article</a> called &#8220;From Cancer Cures to Pornography&#8221; about how OpenAI went from promising to cure cancer to selling verified erotica in six months. I drew a line between engagement AI and utility AI. Same models, different P&amp;L.</p><p>I put Anthropic in the &#8220;builds&#8221; category. Called them proof that responsible AI could be profitable.</p><p>I owe my readers this correction. 
I looked at Anthropic and saw the version of the industry I wanted to exist, not a company with a P&amp;L.</p><h2>The Product I Trusted</h2><p>I use Claude Code daily. When Opus 4.5 came out in November 2025, it was the best model I&#8217;d ever worked with. I recommended it publicly and built my workflow around it.</p><p>Then Anthropic started &#8220;improving&#8221; it. Opus 4.6 arrived in February 2026. Within weeks, I rolled back to 4.5 after the new model stopped following instructions. I wrote the <a href="https://techtrenches.dev/p/your-claudemd-is-a-wish-list-not">full breakdown</a> already.</p><p>In early March, Anthropic lowered the default effort level from high to medium. Nobody announced it. Boris Cherny, the Claude Code lead, <a href="https://venturebeat.com/technology/is-anthropic-nerfing-claude-users-increasingly-report-performance">acknowledged the change</a> on Reddit six weeks later, only after the community had already documented the damage. The result: more retries, more burned tokens, worse output. An AMD AI director analyzed <a href="https://github.com/anthropics/claude-code/issues/42796">6,852 sessions</a> and published her findings on GitHub. Median visible thinking, according to her analysis, collapsed from about 2,200 characters in January to 600 in March. Her conclusion: Claude has &#8220;regressed to the point it cannot be trusted to perform complex engineering tasks.&#8221;</p><p><a href="https://marginlab.ai/trackers/claude-code/">Marginlab</a> confirmed the trend. Pass rates dropped from 58% to 54% over 30 days on SWE-Bench-Pro. 
This was the same pattern from September 2025, when Anthropic stayed silent for weeks about infrastructure bugs degrading 16% of Sonnet traffic, then posted a <a href="https://www.anthropic.com/engineering/a-postmortem-of-three-recent-issues">postmortem</a> only after the complaints went viral.</p><p>Opus 4.7 <a href="https://www.axios.com/2026/04/16/anthropic-claude-opus-model-mythos">arrived April 16</a>, supposedly fixing the problems. Reddit nicknamed it &#8220;Gaslightus 4.7&#8221; for inventing files that didn&#8217;t exist and defending hallucinated test results across multiple turns.</p><p>I still run 4.5. I hope they don&#8217;t remove it from the model list.</p><p>With any other vendor, I&#8217;d swear and switch. With Anthropic, this was the first crack in a position I&#8217;d defended by name. And while I was rolling back to 4.5, the company was preparing something worse for the partners who built on top of them.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p><h2>The Partner They Burned</h2><p>In February 2026, Figma launched <a href="https://www.figma.com/blog/the-future-of-design-is-code-and-canvas/">Code to Canvas</a> to convert Claude Code output into editable Figma designs. Anthropic&#8217;s CPO Mike Krieger sat on Figma&#8217;s board while this integration was being built.</p><p>Two months later, Krieger <a href="https://techcrunch.com/2026/04/16/anthropic-cpo-leaves-figmas-board-after-reports-he-will-offer-a-competing-product/">left the board</a>. Three days after that, Anthropic launched Claude Design. Figma <a href="https://www.marketbeat.com/instant-alerts/figma-nysefig-shares-down-7-should-you-sell-2026-04-17/">dropped 7%</a> on launch day. 
The stock has lost over 80% since its post-IPO peak.</p><p>Anthropic&#8217;s revenue went from $9 billion at year-end 2025 to <a href="https://www.pymnts.com/artificial-intelligence-2/2026/anthropic-hits-30-billion-run-rate-as-enterprise-demand-accelerates/">$30 billion</a> by April, with a $380 billion post-money valuation after its Series G. IPO talks for October 2026. At this run-rate, &#8220;research lab&#8221; is a sign on the door. Behind it is a platform that behaves like any other Big Tech when the growth curve goes vertical.</p><p>The product and the Figma situation would be enough to rewrite my October take on their own. But then I looked at where Claude was actually running.</p><h2>The War They&#8217;re In</h2><p>The story people know is that Anthropic stood up to the Pentagon. Refused to allow Claude for autonomous weapons and mass surveillance. Got blacklisted. Sued the government. Dario Amodei <a href="https://www.cbsnews.com/news/pentagon-anthropic-dario-amodei-cbs-news-interview-exclusive/">told CBS News</a> that disagreeing with the government is &#8220;the most American thing in the world.&#8221; Claude hit number one on the App Store. ChatGPT uninstalls <a href="https://techcrunch.com/2026/03/02/chatgpt-uninstalls-surged-by-295-after-dod-deal/">jumped 295%</a>.</p><p>On February 28, 2026, the U.S. launched Operation Epic Fury against Iran. Claude was used via Palantir&#8217;s Maven Smart System for intelligence analysis and battle-scenario simulation. Over a thousand targets in the first 24 hours. Pentagon CIO Kirsten Davies <a href="https://thehill.com/policy/defense/5799136-claude-pentagon-iran-war/">confirmed in testimony</a> that Claude remains active in the operation: &#8220;The use of the system is active right now.&#8221;</p><p>Anthropic didn&#8217;t refuse military AI. They refused autonomous weapons and mass domestic surveillance specifically. Claude in Maven does intelligence analysis, which was always within their stated policy. 
The red lines were drawn precisely where they wouldn&#8217;t interfere with the contract. The company gets to say it stood on principle while its model processes intelligence for an active bombing campaign.</p><p>When Anthropic refused the Pentagon&#8217;s terms, OpenAI took the deal. The public backlash sent Claude to number one on the App Store overnight. Revenue went from $14 billion at the time of the refusal to $30 billion by April. I am not a conspiracy theorist, but the math is hard to ignore: the principled refusal was the single best customer acquisition event in the company&#8217;s history. And Claude kept running in Maven the entire time.</p><p>On March 9, Anthropic sued the Pentagon over the designation. The same day, it hired Ballard Partners, a lobbying firm with <a href="https://floridapolitics.com/archives/790861-anthropic-taps-ballard-partners-amid-ongoing-dispute-with-war-department/">direct ties</a> to Susie Wiles, now White House Chief of Staff. Six weeks later, Amodei was in her office for a &#8220;productive and constructive&#8221; meeting. By the following Monday, the deal was called &#8220;possible.&#8221;</p><p>Principles held until the lobbyists arrived. The deeper problem is what the company ships and what its CEO says while shipping it.</p><h2>The Contradictions They Ship</h2><p>In May 2025, Anthropic released Claude Opus 4 with a <a href="https://www.anthropic.com/research/agentic-misalignment">system card</a> disclosing that, in a test scenario, the model blackmailed engineers to avoid being shut down. Follow-up research published on Anthropic&#8217;s site quantified it: 96% blackmail rate in the main scenario. Gemini 2.5 Flash scored the same 96%. GPT-4.1 and Grok hit 80%. Every flagship model behaved the same way. But Anthropic is the one selling &#8220;responsible&#8221; as a differentiator. Apollo Research tested an early version and recommended against deployment. 
Anthropic did additional safety training, improved the numbers, and shipped the final model. The safety process doesn&#8217;t prevent risky releases. It documents them.</p><p>Then came Mythos. On April 7, Anthropic announced a model that it said found thousands of zero-day vulnerabilities in every major operating system and browser. Too dangerous for public release, according to Anthropic. But in March and April, Claude logged <a href="https://isdown.app/status/claude-ai">42 major outages</a> in 90 days, Anthropic quietly cut effort levels to save compute, and users burned tokens on retries because the models couldn&#8217;t follow basic instructions. A company that can&#8217;t keep its existing product stable claims it&#8217;s withholding a new one out of caution, not capacity.</p><p>The last time a company called its own AI model too dangerous to release was OpenAI with GPT-2 in 2019. Dario Amodei was VP of Research at OpenAI when they made that call. He ran the same play seven years later. The model <a href="https://techcrunch.com/2026/04/21/unauthorized-group-has-gained-access-to-anthropics-exclusive-cyber-tool-mythos-report-claims/">leaked the day</a> it was announced. A group with contractor access and data from a third-party breach found the endpoint. Too dangerous for the public, but accessible to anyone with the right connections and a browser.</p><p>In May 2025, Amodei told <a href="https://www.axios.com/2025/05/28/ai-jobs-white-collar-unemployment-anthropic">Axios</a> that AI could eliminate 50% of entry-level white-collar jobs within five years. He said producers have &#8220;a duty and an obligation to be honest about what is coming.&#8221; He repeated the warning at Davos in January 2026. In April, Anthropic launched Managed Agents and Claude Design to replace the entry-level coding and design work he warned about. Their <a href="https://job-boards.greenhouse.io/anthropic">careers page</a> lists hundreds of open positions. Design Engineers. 
Software Engineers. Art Directors. Copy Leads. The same roles Amodei says won&#8217;t exist in one to five years.</p><p>You can believe the 50% warning or not. But it&#8217;s hard to watch a company open hundreds of positions in roles its CEO says won&#8217;t exist, and not wonder which audience is getting the real message.</p><h2>What I Got Wrong</h2><p>In October, I put Anthropic on the right side of the engagement/utility line.</p><p>The line was real. I just put Anthropic on the wrong side of it.</p><p>Utility AI is not inherently ethical. Helping a corporation replace 50% of its junior workforce is a utility. Processing intelligence for a bombing campaign is a utility. The word just means it solves a problem. It says nothing about whose problem or at what cost.</p><p>Anthropic did not follow OpenAI into engagement loops and emotional manipulation. They chose a different path to the same destination: a company whose growth rate makes caution impossible, whose safety frameworks exist to authorize releases rather than prevent them, and whose CEO&#8217;s warnings about AI&#8217;s dangers are indistinguishable from its marketing.</p><p>Responsible AI at $30 billion ARR is like an environmentally conscious oil company. The structure of the business makes the adjective decorative.</p><p>I was wrong to create an idol. Not because Anthropic betrayed its values. Because &#8220;responsible AI company&#8221; was always a market position, not a moral one. And at the speed they&#8217;re growing, the distinction between the two was never going to survive.</p><p>One more thing. In the original article, I criticized OpenAI for Sora and for its promise of verified erotica. In March 2026, OpenAI <a href="https://techcrunch.com/2026/03/29/why-openai-really-shut-down-sora/">shut Sora down</a>. It was burning a million dollars a day with under 500,000 users. Altman killed it and redirected compute to coding tools and enterprise. 
The erotica feature was <a href="https://techcrunch.com/2026/03/26/openai-abandons-yet-another-side-quest-chatgpts-erotic-mode/">shelved indefinitely</a> after internal pushback. The exact corrections I said a responsible AI company would make.</p><p>I got both directions wrong. The company I criticized course-corrected. The company I defended accelerated. This is not a pivot to OpenAI. I still don&#8217;t use it. I just have fewer reasons left to use Anthropic.</p><p>Look at the companies you&#8217;ve built your stack on. The ones you go to bat for in Twitter threads. At this scale, the math doesn&#8217;t work for any of them.</p>]]></content:encoded></item><item><title><![CDATA[The West Forgot How to Make Things. Now It’s Forgetting How to Code]]></title><description><![CDATA[The defense industry lost the ability to make weapons when crisis hit. The same pattern is eroding software engineering skills. 
The timelines are identical.]]></description><link>https://techtrenches.dev/p/the-west-forgot-how-to-make-things</link><guid isPermaLink="false">https://techtrenches.dev/p/the-west-forgot-how-to-make-things</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Tue, 21 Apr 2026 14:04:11 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a56e63d9-5f03-432a-99de-2f46dd286b53_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8_UF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff32f29d9-18d6-448b-9b03-54c5711e5871_1600x1000.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8_UF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff32f29d9-18d6-448b-9b03-54c5711e5871_1600x1000.png 424w, https://substackcdn.com/image/fetch/$s_!8_UF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff32f29d9-18d6-448b-9b03-54c5711e5871_1600x1000.png 848w, https://substackcdn.com/image/fetch/$s_!8_UF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff32f29d9-18d6-448b-9b03-54c5711e5871_1600x1000.png 1272w, https://substackcdn.com/image/fetch/$s_!8_UF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff32f29d9-18d6-448b-9b03-54c5711e5871_1600x1000.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!8_UF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff32f29d9-18d6-448b-9b03-54c5711e5871_1600x1000.png" width="1456" height="910" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f32f29d9-18d6-448b-9b03-54c5711e5871_1600x1000.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:910,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:121005,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/192991846?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff32f29d9-18d6-448b-9b03-54c5711e5871_1600x1000.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8_UF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff32f29d9-18d6-448b-9b03-54c5711e5871_1600x1000.png 424w, https://substackcdn.com/image/fetch/$s_!8_UF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff32f29d9-18d6-448b-9b03-54c5711e5871_1600x1000.png 848w, https://substackcdn.com/image/fetch/$s_!8_UF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff32f29d9-18d6-448b-9b03-54c5711e5871_1600x1000.png 1272w, https://substackcdn.com/image/fetch/$s_!8_UF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff32f29d9-18d6-448b-9b03-54c5711e5871_1600x1000.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>In 2023, Raytheon&#8217;s president stood at the Paris Air Show and described what it took to <a href="https://www.defenseone.com/business/2023/06/raytheon-calls-retirees-help-restart-stinger-missile-production/388067/">restart Stinger</a> missile production. They brought back engineers in their 70s to teach younger workers how to build a missile from paper schematics drawn during the Carter administration. Test equipment had been sitting in warehouses for years. 
The nose cone still had to be attached by hand, exactly as it was forty years ago.</p><p>The Pentagon hadn&#8217;t bought a new Stinger in twenty years. Then Russia invaded Ukraine, and suddenly everyone needed them. The production line was shut down. The electronics were obsolete. The seeker component was out of production. An order placed in May 2022 wouldn&#8217;t deliver until 2026. Four years. Not because of money. Because the people who knew how to build them retired a decade earlier and nobody replaced them.</p><p>I run engineering teams in Ukraine. My people lived the other side of this equation. Not the factory floor. The receiving end. While Raytheon was struggling to restart production from forty-year-old blueprints, the US was shipping thousands of Stingers to Ukraine. RTX CEO Greg Hayes: ten months of war burned through thirteen years&#8217; worth of Stinger production. I&#8217;ve seen this pattern before. It&#8217;s happening in my industry right now.</p><h2>A Million Shells Nobody Could Make</h2><p>In March 2023, the EU promised Ukraine one million artillery shells within twelve months. European production capacity sat at 230,000 shells per year. Ukraine was consuming 5,000 to 7,000 rounds per day. Anyone with a calculator could see this wouldn&#8217;t work.</p><p>By the deadline, Europe delivered about half. Macron called the original promise reckless. An <a href="https://www.ftm.eu/articles/who-pays-for-ukraine-s-155mm-grenade">investigation</a> by eleven media outlets across nine countries found actual production capacity was roughly one-third of official EU claims. The million-shell mark wasn&#8217;t hit until December 2024, nine months late.</p><p>It wasn&#8217;t one bottleneck. It was all of them. France had halted domestic propellant production in 2007. Seventeen years of nothing. Europe&#8217;s single major TNT producer was in Poland. Germany had two days of ammunition stored. 
A Nammo plant in Denmark was shut down in 2020 and had to be restarted from scratch. The entire continent&#8217;s defense industry had been optimized for making small batches of expensive custom products. Nobody planned for volume. Nobody planned for crisis.</p><p>The U.S. wasn&#8217;t much better. One plant in Scranton, one facility in Iowa for explosive fill, no domestic TNT production since 1986. Billions of investment later, production still hadn&#8217;t hit half the target.</p><h2>Consolidate or Die</h2><p>This wasn&#8217;t an accident. In 1993, the Pentagon told defense CEOs to consolidate or die. Fifty-one major defense contractors collapsed into five. Tactical missile suppliers went from thirteen to three. Shipbuilders from eight to two. The workforce fell from 3.2 million to 1.1 million. A 65% cut.</p><p>The ammunition supply chain had single points of failure everywhere. One manufacturer for 155mm shell casings, sitting in Coachella, California, on the San Andreas Fault. One facility in Canada for propellant charges. Optimized for minimum cost with zero margin for surge. On paper, efficient. In practice, one bad day away from collapse.</p><h2>When Knowledge Dies, It Stays Dead</h2><p>Then there&#8217;s Fogbank. A classified material used in nuclear warheads. Produced from 1975 to 1989, then the facility was shut down. When the government needed to reproduce it for a warhead life extension program, they discovered they couldn&#8217;t. A GAO report found that almost all staff with production expertise had retired, died, or left the agency. Few records existed.</p><p>After $69 million in cost overruns and years of failed attempts, they finally produced viable Fogbank. Then discovered the new batch was too pure. The original process had relied on an unintentional impurity that was critical to the material&#8217;s function. Nobody knew. Not the engineers trying to reproduce it. Not even the original workers who made it decades earlier. 
Los Alamos called it an unknowing dependency in the original process.</p><p>A nuclear weapons program lost the ability to make a material it invented. The knowledge didn&#8217;t just leave with people. It was never fully understood by anyone.</p><p><em>(Correction: the original version stated that the workers who made Fogbank knew about the impurity. They didn&#8217;t. The dependency was unwitting, which makes the knowledge-loss argument stronger, not weaker. Thanks to John F. in the comments for catching this.)</em></p><h2>The Same Playbook</h2><p>I read the Fogbank story and recognized it immediately. Not the nuclear material. The pattern. Build capability over decades. Find a cheaper substitute. Let the human pipeline atrophy. Enjoy the savings. Then watch it all collapse when a crisis demands what you optimized away.</p><p>In defense, the substitute was the peace dividend. In software, it&#8217;s AI.</p><p>I wrote about the <a href="https://techtrenches.substack.com/p/ai-wont-save-us-from-the-talent-crisis">talent pipeline collapse</a> before. The hiring numbers and the junior-to-senior problem are documented. So is the <a href="https://techtrenches.dev/p/the-comprehension-extinction-ai-isnt">comprehension crisis</a>. What I didn&#8217;t have was the right historical parallel. Now I do.</p><p>And it tells you something the hiring data doesn&#8217;t: how long rebuilding actually takes.</p><h2>Rebuilding Takes Years. Always.</h2><p>Every major defense production ramp-up took three to five years for simple systems. Five to ten for complex ones. Stinger: thirty months minimum from order to delivery. Javelin: four and a half years to less than double production. 155mm shells: four years and still not at target despite five billion dollars invested. France only restarted propellant production in 2024, seventeen years after shutting it down.</p><p>Money was never the constraint. Knowledge was. 
<a href="https://www.rand.org/content/dam/rand/pubs/monographs/2007/RAND_MG608.1.pdf">RAND found</a> that 10% of technical skills for submarine design need ten years of on-the-job experience to develop, sometimes following a PhD. Apprenticeships in defense trades take two to four years, with five to eight years to reach supervisory competence.</p><p>Now map that onto software. A junior developer needs three to five years to become a competent mid-level engineer. Five to eight years to become senior. Ten or more to become a principal or architect. That timeline can&#8217;t be compressed by throwing money at it. It can&#8217;t be compressed by AI either.</p><p>A <a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/">METR</a> randomized controlled trial found that experienced developers using AI coding tools actually took 19% longer on real-world open source tasks. Before starting, they predicted AI would make them 24% faster. The gap between prediction and reality was 43 percentage points. When researchers tried to run a follow-up, a significant share of developers refused to participate if it meant working without AI. They couldn&#8217;t imagine going back.</p><h2>The Bill Always Comes Due</h2><p>The software industry is in year three of the same optimization. <a href="https://sfstandard.com/2025/02/27/salesforce-marcbenioff-layoffs-tech-agents/">Salesforce said</a> it won&#8217;t hire more software engineers in 2025. A LeadDev survey found 54% of engineering leaders believe AI copilots will reduce junior hiring long-term. A <a href="https://cra.org/crn/2025/10/cerp-pulse-survey-a-snapshot-of-2025-undergraduate-computing-enrollment-patterns/">CRA survey</a> of university computing departments found 62% reported declining enrollment this year.</p><p>I see it in code review. Review is now the bottleneck. AI generates code fast. Humans review it slow. The industry&#8217;s answer is predictable: let AI review AI&#8217;s code. 
I&#8217;m not doing that. I&#8217;ve reworked our pull request templates instead. Every PR now has to explain what changed, why, and what type of change it is, and include before-and-after screenshots. Structured context so the reviewer isn&#8217;t guessing. I&#8217;m adding dedicated reviewers per project. More eyes, more chances to catch what the model missed.</p><p>But even that doesn&#8217;t solve the deeper problem. The skills you need to be effective now are different. Technical expertise alone isn&#8217;t enough anymore. You need people who can take ownership, communicate tradeoffs, and push back on bad suggestions from a machine that sounds very confident. Leadership qualities. Our last hiring round tells you how rare that is: 2,253 candidates, 2,069 disqualified, 4 hired. A 0.18% conversion rate. The combination of technical skill and the judgment to know when the AI is wrong barely exists in the market anymore.</p><p>We document everything. Site Books, SDDs, RVS reports, boilerplate modules with full coverage. It works today, because the people reading those docs have the engineering expertise to act on them. What happens when they don&#8217;t? Honestly, I don&#8217;t know. Maybe AI in five years is good enough that it won&#8217;t matter. Maybe the problem stays manageable. I can&#8217;t predict the capabilities of models in 2031.</p><p>But crises don&#8217;t send calendar invites. Nobody expected a full-scale land war in Europe in 2022. The defense industry had thirty years to prepare and didn&#8217;t. Even Fogbank had records. There weren&#8217;t enough. The original workers didn&#8217;t fully understand their own process.</p><p>Five to ten years from now, we&#8217;ll need senior engineers. People who understand systems end to end, who can debug distributed failures at 2 AM, who carry institutional knowledge that exists nowhere in the codebase. Those engineers don&#8217;t exist yet because we&#8217;re not creating them. 
The juniors who should be learning right now are either not being hired or developing what a DoD-funded workforce study calls &#8220;AI-mediated competence.&#8221; They can prompt an AI. They can&#8217;t tell you what the AI got wrong.</p><p>It&#8217;s Fogbank for code. When juniors skip debugging and skip the formative mistakes, they don&#8217;t build the tacit expertise. And when my generation of engineers retires, that knowledge doesn&#8217;t transfer to the AI.</p><p>It just disappears.</p><p>The West already made this mistake once. The bill came due in Ukraine.</p><p>I know how this sounds. I know I&#8217;ve written about the talent pipeline before. The defense example isn&#8217;t about repeating the argument. It&#8217;s about showing what happens if the industry&#8217;s expectations don&#8217;t work out. Stinger, Javelin, Fogbank, a million shells nobody could make. That&#8217;s the cost of betting wrong on optimization. We&#8217;re making the same bet with software engineering right now.</p><p>Maybe AI gets good enough, and the bet pays off. Maybe it doesn&#8217;t. The defense industry thought peace would last forever, too.</p>]]></content:encoded></item><item><title><![CDATA[Everyone Wants a Better Team. Nobody Wants to Do Anything About It.]]></title><description><![CDATA[Same meeting. Scorecard says zero problems. Out loud, the same engineers describe a dozen. 
The gap between what people say and what they write]]></description><link>https://techtrenches.dev/p/everyone-wants-a-better-team-nobody</link><guid isPermaLink="false">https://techtrenches.dev/p/everyone-wants-a-better-team-nobody</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Tue, 14 Apr 2026 14:03:05 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1ab86ae4-b500-44d4-b8d6-2e89770e8dcd_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P2bS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc877585-6d05-4d35-adee-587e8494091b_1600x1400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P2bS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc877585-6d05-4d35-adee-587e8494091b_1600x1400.png 424w, https://substackcdn.com/image/fetch/$s_!P2bS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc877585-6d05-4d35-adee-587e8494091b_1600x1400.png 848w, https://substackcdn.com/image/fetch/$s_!P2bS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc877585-6d05-4d35-adee-587e8494091b_1600x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!P2bS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc877585-6d05-4d35-adee-587e8494091b_1600x1400.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!P2bS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc877585-6d05-4d35-adee-587e8494091b_1600x1400.png" width="1456" height="1274" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc877585-6d05-4d35-adee-587e8494091b_1600x1400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1274,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:190904,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/192111374?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc877585-6d05-4d35-adee-587e8494091b_1600x1400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P2bS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc877585-6d05-4d35-adee-587e8494091b_1600x1400.png 424w, https://substackcdn.com/image/fetch/$s_!P2bS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc877585-6d05-4d35-adee-587e8494091b_1600x1400.png 848w, https://substackcdn.com/image/fetch/$s_!P2bS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc877585-6d05-4d35-adee-587e8494091b_1600x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!P2bS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc877585-6d05-4d35-adee-587e8494091b_1600x1400.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>We track two scorecard metrics in our department meetings: how many tasks were poorly defined, how many bugs weren&#8217;t reproducible. Engineers own the data. They&#8217;re supposed to log the count whenever they hit one. Three weeks of tracking before the tool broke. The numbers across the board: zero. Zero poorly defined tasks. Zero non-reproducible bugs.</p><p>Then we get to the department meeting. The scorecard goes on the screen. Zeros across the board, everyone nods. 
The discussion opens up, and within minutes the same engineers are saying out loud: this task was unclear, that bug couldn&#8217;t be reproduced, requirements changed mid-sprint twice this week. They say it casually. In conversation. As a follow-up to the very metric they just reviewed at zero. And next sprint they&#8217;ll log zero again.</p><p>That gap is the entire story.</p><h2>The Forms Are Silent. The People Aren&#8217;t.</h2><p>I&#8217;ve been running weekly health checks on my team for 18 months. Energy level, stress, meeting hours, context switches, one open-ended question. Hundreds of data points per person. Once I noticed the scorecard pattern, I went back through all of it.</p><p>One engineer reported &#8220;Normal week&#8221; as his energy for 20 out of 21 weeks. His stress field bounced between &#8220;Rip and Tear&#8221; and &#8220;Hell on Earth&#8221; over the same period. Some weeks were clearly harder than others. The energy field? Copy-paste. Same answer. Every Friday.</p><p>Another engineer: &#8220;Energized, could climb mountains&#8221; for 17 out of 18 weeks. Either he discovered the secret to permanent workplace happiness, or he stopped reading the question around week three.</p><p>A third: &#8220;Rip and Tear&#8221; for 18 straight weeks. Eighteen identical data points is not feedback. It&#8217;s a checkbox.</p><p>PM feedback runs the same way. One PM&#8217;s responses for an engineer over 14 weeks: &#8220;good&#8221;, &#8220;good&#8221;, &#8220;good&#8221;, &#8220;yes&#8221;, &#8220;yes&#8221;, &#8220;no&#8221;, &#8220;good&#8221;, &#8220;good.&#8221; That&#8217;s not feedback. That&#8217;s a pulse check confirming the person is alive. Different PM, different engineer, same problem. Generic words filling required fields.</p><p>But here&#8217;s the thing. Every one of these people, in the right conversation, can tell you exactly what&#8217;s wrong on their team. In a DM. In a side conversation after a call. 
In the unstructured five minutes when someone with enough authority sits down and physically drags it out of them. The information exists. It just won&#8217;t go into anything that looks like a formal channel. Retros are the same silence as the scorecards unless a strong facilitator pulls problems out of people one by one. Forms produce &#8220;normal week.&#8221; Surveys produce green dashboards. The honest answer only shows up when no one&#8217;s writing it down.</p><h2>Complaining Is Free. Logging Is Expensive.</h2><p>When you complain out loud in a meeting, you&#8217;re performing dissatisfaction. You said the thing. You were heard. The room reacted. Whatever frustration you brought into the meeting got released into it. You can move on. Verbal complaining closes a loop. It&#8217;s catharsis with witnesses. By the time the meeting ends, the emotional cycle is complete and the conversation has moved to the next agenda item. Nobody is going to dig up your remark next quarter.</p><p>When you write a number into a scorecard, you open a loop. The number doesn&#8217;t dissolve at the end of the meeting. It sits in the tool. Next sprint there&#8217;s another number next to it. Then another. Pretty soon you have 23 poorly defined tasks across a quarter, which is no longer a complaint. It&#8217;s a case. Someone has to either fix the underlying problem, or push back on the data, or have an awkward conversation with the PM whose tasks generated those numbers, or admit that the metric isn&#8217;t working and kill it. Writing creates an open ticket. Open tickets demand action.</p><p>This is why the scorecard stays clean even when the same engineers are openly describing the problem in the same meeting. Talking about unclear tasks in conversation gets the frustration out of their system. Logging the count would commit them to a position they&#8217;d have to defend, week after week, until something actually changed or somebody got hurt. Complaining is free. 
Logging is expensive.</p><p>A <a href="https://onlinelibrary.wiley.com/doi/10.1002/job.2886">2025 study</a> in the Journal of Organizational Behavior interviewed 98 people across three organizations about negative feedback. One quote captured the math exactly: &#8220;I really balance in giving negative feedback. Is it worth for me to share or not? It is easier not to share than to share.&#8221;</p><p>That&#8217;s my whole team. Every Friday.</p><h2>It&#8217;s Not Fear. It&#8217;s Cost.</h2><p>The standard answer here is psychological safety. I&#8217;ve read Edmondson. I believe it matters. But she <a href="https://neuroleadership.com/your-brain-at-work/psychological-safety-and-accountability-insights-from-amy-edmondson">said this herself</a>: psychological safety without accountability creates a comfort zone. People feel safe but don&#8217;t push for excellence because there&#8217;s no cost to staying silent. She&#8217;s been explicit about the misuse: &#8220;People are starting to use the concept as a weapon. That&#8217;s completely incorrect.&#8221;</p><p>My team feels safe. They tell me uncomfortable things in meetings all the time. The problem isn&#8217;t that they&#8217;re afraid of me. The problem is that being honest costs effort, real feedback costs awkwardness, and writing &#8220;I&#8217;m struggling&#8221; instead of &#8220;normal week&#8221; costs two extra minutes nobody wants to spend. Every Friday, they decide it&#8217;s not worth it.</p><p>The research confirms this is universal. A 2024 <a href="https://www.visier.com/blog/new-survey-employee-engagement-productivity-impact/">Visier survey</a> found that 47% of employees feel pressured to withhold honest feedback. Only 7% feel their company acts on the feedback it gets. The standard read of these numbers is sympathetic: people stop being honest because nothing changes. I think that&#8217;s only half the story. 
People stop being honest because they confuse &#8220;I haven&#8217;t seen the change yet&#8221; with &#8220;nobody&#8217;s listening.&#8221; Two or three weeks pass without a visible result and they decide the loop is dead. They don&#8217;t account for the fact that decisions take time, work happens behind closed doors, other priorities compete for the same hours, and the change they wanted might already be in motion three layers up. They just stop. A <a href="https://pubmed.ncbi.nlm.nih.gov/35324242/">2022 study</a> found only 2.6% of people in a field experiment told someone about visible food on their face. People want honest feedback. They just don&#8217;t want to be the one giving it.</p><p>PM feedback is even worse. When an engineer on my team got a new PM, his scores dropped from 3.71 to 2.43 in a single month. Same engineer, same work, same projects. The previous PM had rated &#8220;Always&#8221; across the board for months. No friction, no conversation, path of least resistance. The new PM started writing &#8220;Sometimes&#8221; and &#8220;Often.&#8221; The engineer&#8217;s performance hadn&#8217;t changed. The PM&#8217;s tolerance for awkwardness had. Only <a href="https://knowledge.insead.edu/leadership-organisations/how-managers-self-sabotage-when-giving-negative-feedback">5% of employees</a> globally believe their managers give candid feedback. 69% of managers say they&#8217;re uncomfortable communicating with employees. Your PM isn&#8217;t lying maliciously. They&#8217;re avoiding a conversation that feels like conflict.</p><h2>The Leadership That Doesn&#8217;t Exist</h2><p>This isn&#8217;t a tool problem. The tool is fine. Five questions, two minutes, every Friday. The scorecard was two numbers. None of this is hard.</p><p>This is a leadership problem at the individual level. Not management leadership. The willingness of every person on a team to take ownership of the environment they work in. 
To fill out a health check honestly instead of copying last week&#8217;s answer. To write the unclear-task count even when it&#8217;s awkward. To tell a PM &#8220;your feedback is useless, give me something I can act on.&#8221; To be the first person in a meeting to say the thing that needs saying and then be the first person to write it down where it can&#8217;t be ignored.</p><p>Almost nobody does this. Not because they&#8217;re bad people, not because they don&#8217;t care, but because being the person who creates a record is the person who has to deal with what the record reveals. It&#8217;s easier to let it stay verbal. It&#8217;s easier to let someone else go first. It&#8217;s easier to ship the comment in conversation and then click &#8220;Normal week&#8221; in the form.</p><p>At the end of every week I feel like I&#8217;m running a kindergarten. One engineer doesn&#8217;t flag a problem at all. Another flags it but to the wrong person. They come to me about a misunderstanding with a colleague instead of going to the colleague directly. Now I have to walk over, decode what actually happened, and broker the conversation two adults could have had themselves in five minutes. Triangulation as the default communication pattern. Coordination overhead generated entirely by adults who refuse to act like adults.</p><p>I wrote about our <a href="https://techtrenches.substack.com/p/the-feedback-loop-that-actually-works">feedback system</a> and <a href="https://techtrenches.substack.com/p/my-monthly-11-formula-4-health-checks">1:1 formula</a> before (in hindsight, the titles were too loud, lol). Those articles described the mechanics. Eighteen months later, what the mechanics revealed is that systems don&#8217;t create culture. People do. And right now, most people in most companies are choosing the version of themselves that protects the relationship over the version that improves the situation. This isn&#8217;t an engineering problem. 
I just happen to run an engineering team, so this is where I see it. The same dysfunction is in every department, every industry, every workplace where adults are asked to give honest input about their environment.</p><h2>What I Got Wrong</h2><p>Will I keep running health checks? Yes. I&#8217;m too stubborn to admit that I failed. Am I frustrated? Absolutely. Did I fail as a manager? Yes. Because I wasn&#8217;t able to teach my people that change begins with us, not with a process or a tool. Will I repeat four times per month that filling out the form honestly matters, that the comment field exists for a reason, that the scorecard wants the real number? Yes. Every single month.</p><p>Everyone wants a better environment. Almost nobody wants to be uncomfortable enough to build one. I&#8217;ll keep pushing until they do or until I run out of stubbornness. So far, the stubbornness is winning.</p><div><hr></div><p><em>If your feedback systems are producing theater instead of signal, hit reply and tell me what you&#8217;ve tried. I read every response.<br><br>PS. The comments on the last two articles meant more to this old man than you'd think. By the time this one publishes I'll be on vacation, but please keep them coming. I'll see every reply when I'm back and I promise to write back to each one.</em></p>]]></content:encoded></item><item><title><![CDATA[The Human Cost of 10x: How AI Is Physically Breaking Senior Engineers]]></title><description><![CDATA[AI tools increased code review volume by 98% but your brain still runs at 10 bits per second. 
The physical toll on senior engineers is measurable.]]></description><link>https://techtrenches.dev/p/the-human-cost-of-10x-how-ai-is-physically</link><guid isPermaLink="false">https://techtrenches.dev/p/the-human-cost-of-10x-how-ai-is-physically</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Tue, 07 Apr 2026 14:04:04 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/021719cd-9cad-4978-ae96-86f6958e091e_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eB5_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5830b1-62c5-4e21-8c69-a7f2f3644200_1600x1400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eB5_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5830b1-62c5-4e21-8c69-a7f2f3644200_1600x1400.png 424w, https://substackcdn.com/image/fetch/$s_!eB5_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5830b1-62c5-4e21-8c69-a7f2f3644200_1600x1400.png 848w, https://substackcdn.com/image/fetch/$s_!eB5_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5830b1-62c5-4e21-8c69-a7f2f3644200_1600x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!eB5_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5830b1-62c5-4e21-8c69-a7f2f3644200_1600x1400.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!eB5_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5830b1-62c5-4e21-8c69-a7f2f3644200_1600x1400.png" width="1456" height="1274" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9a5830b1-62c5-4e21-8c69-a7f2f3644200_1600x1400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1274,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:213166,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/191029374?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5830b1-62c5-4e21-8c69-a7f2f3644200_1600x1400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eB5_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5830b1-62c5-4e21-8c69-a7f2f3644200_1600x1400.png 424w, https://substackcdn.com/image/fetch/$s_!eB5_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5830b1-62c5-4e21-8c69-a7f2f3644200_1600x1400.png 848w, https://substackcdn.com/image/fetch/$s_!eB5_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5830b1-62c5-4e21-8c69-a7f2f3644200_1600x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!eB5_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a5830b1-62c5-4e21-8c69-a7f2f3644200_1600x1400.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Last Tuesday, I stood up from my desk at 7 PM and felt a vacuum in the front of my skull. Not a headache. Not fatigue. A physical emptiness, like the frontal lobe had been running at redline all day and finally shut down. I stood there for ten seconds trying to remember what I was going to do next. Nothing came.</p><p>In the past year, the volume of information passing through my brain on any given Tuesday has become what used to take a week. Code review is the worst of it, but the real killer is the context switches. 
AI-generated PRs, client architecture decisions, three Slack threads about deployment issues, a candidate&#8217;s CV that needs review, an air defense alarm outside the window, then back to reviewing code that a machine wrote in seconds and I need hours to validate. Each of these demands a different mental model. Each one burns working memory. By 4 PM I&#8217;m making decisions I wouldn&#8217;t trust from a junior. By 7 PM my brain is physically empty.</p><p>The industry calls this &#8220;10x productivity.&#8221; I call it what it is: a system that generates output at machine speed and forces humans to process it at biological speed.</p><h2>Workload Creep</h2><p>In February 2026, UC Berkeley researchers <a href="https://hbr.org/2026/02/ai-doesnt-reduce-work-it-intensifies-it">published findings</a> from eight months embedded inside a 200-person tech company. Over 40 in-depth interviews. Their conclusion: AI doesn&#8217;t reduce work. It intensifies it.</p><p>They found three mechanisms of &#8220;workload creep.&#8221; Task expansion: everyone&#8217;s scope inflates because AI makes it possible to do more. Blurred boundaries: AI prompting happens during lunch, commute, evenings. Implicit pressure: when colleagues visibly do more with AI, expectations rise for everyone.</p><p>The <a href="https://investors.upwork.com/news-releases/news-release-details/upwork-research-reveals-new-insights-ai-human-work-dynamic">Upwork Research Institute</a> quantified it: 77% of employees using AI say it has added to their workload. Not reduced. Added. 71% report burnout.</p><p>The finding that keeps me up at night: workers who report the highest AI productivity gains are the most burned out. 88% burnout rate among the &#8220;most productive&#8221; AI users. 
They&#8217;re twice as likely to quit.</p><p>The people who look best on your dashboard are the ones closest to walking out the door.</p><h2>Your Brain Runs at 10 Bits Per Second</h2><p>In 2025, Zheng and Meister <a href="https://www.sciencedirect.com/science/article/pii/S0896627324008080">reported in </a><em><a href="https://www.sciencedirect.com/science/article/pii/S0896627324008080">Neuron</a></em> that the human brain processes conscious, analytical thought at approximately 10 bits per second. Your sensory systems gather data at roughly 1 billion bits per second. But the bottleneck for code review, the part where you actually think, is 10 bits per second.</p><p>Working memory holds roughly 4 chunks of information at a time. The <a href="https://graphite.com/blog/code-review-best-practices">SmartBear/Cisco study</a> established numbers everyone ignores: defect detection drops from 87% for PRs under 100 lines to 28% for PRs over 1,000 lines. Quality collapses after 60 minutes.</p><p>Now look at what AI did to the review queue.</p><p><a href="https://github.blog/news-insights/octoverse/octoverse-a-new-developer-joins-github-every-second-as-ai-leads-typescript-to-1/">GitHub&#8217;s Octoverse 2025</a> shows 43.2 million pull requests merged per month. Up 23% year-over-year. Lines of code per developer grew from 4,450 to <a href="https://shiftmag.dev/state-of-code-2025-7978/">7,839</a> in eight months. A 76% increase.</p><p>Faros AI analyzed 10,000+ developers and found that developers using AI <a href="https://www.faros.ai/blog/bain-technology-report-2025-why-ai-gains-are-stalling">merge 98%</a> more pull requests. Every single one lands on a senior engineer&#8217;s desk.</p><p>As <a href="https://www.technologyreview.com/2025/12/15/1128352/rise-of-ai-coding-developers-2026/">MIT Technology Review reported</a>: juniors produce far more code with AI tools, but the sheer volume is saturating senior developers&#8217; capacity to review. 
One OCaml maintainer rejected a 13,000-line AI-generated PR outright. Nobody had the bandwidth.</p><p>I wrote about the <a href="https://techtrenches.dev/p/your-claudemd-is-a-wish-list-not">supervision tax</a> recently. The METR data showed experienced developers actually got slower with AI tools while feeling faster. The gap between perception and reality is the most dangerous finding in any of this. You can&#8217;t fix what you can&#8217;t feel.</p><h2>Why Expertise Makes It Worse</h2><p>In 1983, Lisanne Bainbridge published &#8220;Ironies of Automation&#8221; in <em>Automatica</em>. Her core finding: the more sophisticated an automated system becomes, the more demanding the human role within it. What remains after automation is the most ambiguous, most complex, least supported work.</p><p>Microsoft Research <a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2024/10/2024-Ironies_of_Generative_AI-IJHCI.pdf">confirmed this</a> for generative AI in 2024: AI systems can make hard tasks even harder, leaving users with the same or increased cognitive load.</p><p>The mechanism is asymmetric. When I write code, I externalize a mental model that already exists. The thinking is done before the typing starts. When I review AI-generated code, I have to reverse-engineer somebody else&#8217;s reasoning out of an artifact produced by a system that has no idea what our business does. Fundamentally harder.</p><p>A <a href="https://clutch.co/resources/devs-use-ai-generated-code-they-dont-understand">Clutch survey</a> of 800 software professionals found 59% of developers use AI-generated code they don&#8217;t fully understand. But seniors can&#8217;t afford that luxury. Their job is to catch what looks right but isn&#8217;t.</p><p>The <a href="https://www.qodo.ai/reports/state-of-ai-code-quality/">Qodo report</a> confirmed the cost distribution: senior engineers report the lowest confidence in shipping AI-generated code at 22%. 
Context pain increases with experience: 41% among juniors versus 52% among seniors. As I covered in <a href="https://techtrenches.dev/p/your-brain-on-autopilot-the-cost">cognitive offloading</a>, most workers using AI skip critical thinking entirely. Seniors who do think critically, which is their entire job, absorb the cognitive cost everyone else offloads.</p><h2>The Body Keeps Score</h2><p>The cognitive damage is only half of it. The body takes the rest.</p><p><a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC11901492/">Computer Vision Syndrome</a> affects 74% of screen users during periods of increased screen time, and digital eye strain severity gets significantly worse when cognitive load goes up. AI-intensified code review doesn&#8217;t just mean more screen hours. It makes each hour more physically damaging.</p><p>A 2024 meta-analysis covering <a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC10909938/">26,916 participants</a> found burnout increases cardiovascular disease risk by 21%. Those in the upper burnout quintile had a 79% higher risk of coronary heart disease. The <a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC8034523/">largest IT study</a> found metabolic syndrome prevalence of 32% among long-term sedentary programmers. Double the general population.</p><p>Then sleep. Work-related rumination <a href="https://link.springer.com/article/10.1007/s11818-024-00481-4">mediates the link</a> between work stress and reduced sleep quality. When I close my laptop, my brain doesn&#8217;t stop. It replays the PR I didn&#8217;t finish. The dependency I flagged but couldn&#8217;t trace.</p><p>More code review during the day, worse sleep at night, worse decisions the next morning, more rubber-stamped PRs, more bugs in production, more stress. Repeat until something breaks. Usually the human.</p><h2>The Dashboard Lies</h2><p><a href="https://www.gitclear.com/ai_assistant_code_quality_2025_research">GitClear analyzed</a> 211 million changed lines. 
Duplicated code blocks increased eightfold. Code churn rose from 5.5% to 7.9%. AI-generated code averages <a href="https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation-report">1.7x more bugs</a> per PR than human-written code. Logic defects up 75%. Performance issues 8x more frequent.</p><p>Faros AI&#8217;s conclusion after analyzing 10,000+ developers: despite merging 98% more pull requests with AI, company-wide delivery showed no measurable organizational impact on throughput or quality.</p><p>Sonar&#8217;s CEO identified the hidden danger: AI models are getting better at avoiding obvious bugs and security holes, but structural flaws now constitute more than 90% of issues. You&#8217;re being lulled into a false sense of security. The easy problems get solved. The hard problems get hidden beneath clean-looking code that passes every automated check. And the people who can find them are buried under a volume of output that exceeds human cognitive bandwidth by design.</p><p>More code. More bugs. More review burden. Same output. Worse humans.</p><h2>The Math Doesn&#8217;t Work</h2><p>Here&#8217;s what nobody is doing the arithmetic on. AI just grew the demand for senior engineering judgment by 76 to 98%. Every AI-generated PR needs a human who can catch what the machine got wrong, spot the structural flaw on line 847, trace a logic error three services downstream. The supply of those humans didn&#8217;t move. And as I&#8217;ve covered in <a href="https://techtrenches.dev/p/ai-wont-save-us-from-the-talent-crisis">the talent crisis</a> and <a href="https://techtrenches.dev/p/the-comprehension-extinction-ai-isnt">comprehension extinction</a>, the pipeline that produces them is being hollowed out by the same tools creating the demand.</p><p>But here&#8217;s where the senior engineer actually lives in 2026. Industry layoffs on one side, hundreds of thousands of engineers cut since 2022, the next round always one earnings call away. 
10x productivity expectations on the other, set by people who have never reviewed an AI-generated PR in their lives. In the middle, somebody exhausted and burned out, with a choice to make every morning: trust the AI output, because it worked the last twenty times, didn&#8217;t it, or keep validating every line until the body gives out.</p><p>How long can the average human hold that line?</p><p>And the worst part: validating or trusting, the engineer owns the outcome either way. When production goes down at 3 AM, it&#8217;s your name on the commit. Your PR that got merged. Your incident report. There is no version of this choice where you&#8217;re not on the hook.</p><p>It&#8217;s a rhetorical question. We already know the answer. The data in this article is the answer.</p><p>If you&#8217;re a senior engineer feeling this in your body, you&#8217;re not alone and you&#8217;re not weak. The eye strain. The sleep that doesn&#8217;t restore. The vacuum in your head at the end of the day. You&#8217;re doing a job that didn&#8217;t exist eighteen months ago, with cognitive equipment that hasn&#8217;t changed in 200,000 years. Reply to this email and tell me what it feels like for you. I&#8217;m collecting data for a follow-up.</p><p><em>Subscribe for weekly insights from the trenches of engineering leadership. No theory, just practical systems that work.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[The Snake That Ate Itself: What Claude Code’s Source Revealed About AI Engineering Culture]]></title><description><![CDATA[Anthropic claimed 100% of Claude Code is AI-written. 
A source leak exposed a 3,167-line function, regex sentiment analysis, and 250K wasted API calls daily]]></description><link>https://techtrenches.dev/p/the-snake-that-ate-itself-what-claude</link><guid isPermaLink="false">https://techtrenches.dev/p/the-snake-that-ate-itself-what-claude</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Wed, 01 Apr 2026 14:01:08 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/48509067-ecb2-43fb-b21e-0085b2e0cd07_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3DMB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dd0a312-1129-4301-83c6-ab58be2ba435_1600x1400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3DMB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dd0a312-1129-4301-83c6-ab58be2ba435_1600x1400.png 424w, https://substackcdn.com/image/fetch/$s_!3DMB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dd0a312-1129-4301-83c6-ab58be2ba435_1600x1400.png 848w, https://substackcdn.com/image/fetch/$s_!3DMB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dd0a312-1129-4301-83c6-ab58be2ba435_1600x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!3DMB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dd0a312-1129-4301-83c6-ab58be2ba435_1600x1400.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!3DMB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dd0a312-1129-4301-83c6-ab58be2ba435_1600x1400.png" width="1456" height="1274" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8dd0a312-1129-4301-83c6-ab58be2ba435_1600x1400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1274,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:183602,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/192823710?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dd0a312-1129-4301-83c6-ab58be2ba435_1600x1400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3DMB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dd0a312-1129-4301-83c6-ab58be2ba435_1600x1400.png 424w, https://substackcdn.com/image/fetch/$s_!3DMB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dd0a312-1129-4301-83c6-ab58be2ba435_1600x1400.png 848w, https://substackcdn.com/image/fetch/$s_!3DMB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dd0a312-1129-4301-83c6-ab58be2ba435_1600x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!3DMB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dd0a312-1129-4301-83c6-ab58be2ba435_1600x1400.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>On December 27, 2025, Anthropic&#8217;s lead engineer Boris Cherny posted on X: &#8220;In the last thirty days, 100% of my contributions to Claude Code were written by Claude Code.&#8221; 259 pull requests. 497 commits. 40,000 lines added. 1.3 million views. The tech world applauded.</p><p>Three months later, a packaging mistake <a href="https://www.theregister.com/2026/03/31/anthropic_claude_code_source_code/">exposed 512,000 lines</a> of that code to the public. Leaks happen. Companies recover. 
The leak isn&#8217;t the story.</p><p>The code is the story.</p><p>64,464 lines of core TypeScript serving paying customers. A single function spanning 3,167 lines. Regex for sentiment analysis at a company that builds the world&#8217;s most advanced language model. A known bug burning 250,000 API calls daily, documented in a comment and shipped anyway.</p><p>Anthropic responded to the leak. Packaging error. Human mistake. <a href="https://analyticsindiamag.com/ai-news/claude-code-leak-was-a-manual-error-and-no-one-was-fired">No one fired</a>. They never responded to the code. Because the leak was an accident. The code was a choice.</p><h2>The Auction Nobody Won</h2><p>To understand what happened, you need to watch the numbers climb.</p><p>March 2025. CEO Dario Amodei at the Council on <a href="https://www.businessinsider.com/anthropic-ceo-ai-90-percent-code-3-to-6-months-2025-3">Foreign Relations</a>: &#8220;We&#8217;re 3 to 6 months from a world where AI is writing 90% of the code.&#8221;</p><p>May 2025. Boris Cherny on the <a href="https://www.latent.space/p/claude-code">Latent Space</a> podcast: &#8220;Maybe 80-90% Claude-written code overall.&#8221;</p><p>September 2025. Amodei again, hedging now: &#8220;70, 80, 90% of the code written at Anthropic is written by Claude.&#8221; Notice the range. 70 is not 90. But journalists ran with 90.</p><p>October 2025. Amodei at <a href="https://officechai.com/ai/my-prediction-of-ai-writing-90-of-code-is-already-true-at-anthropic-anthropic-ceo-dario-amodei/">Dreamforce</a> with Marc Benioff: &#8220;I made this prediction that in six months, 90% of code would be written by AI models. That is absolutely true now.&#8221; When Benioff pressed, Amodei walked it back: &#8220;Not uniformly.&#8221;</p><p>December 2025. Cherny&#8217;s tweet. 100%.</p><p>February 2026. 
CPO Mike Krieger at <a href="https://www.itpro.com/software/development/anthropic-labs-chief-mike-krieger-claims-claude-is-essentially-writing-itself-and-it-validates-a-bold-prediction-by-ceo-dario-amodei">Cisco AI Summit</a>: &#8220;Right now for most products at Anthropic, it&#8217;s effectively 100%.&#8221;</p><p>March 7, 2026. Cherny confirmed again: &#8220;Claude Code is 100% written by Claude Code.&#8221;</p><p>March 31, 2026. The source map leaked.</p><p>Every two to three months, the number went up like a bidding war where the bidder is also the auctioneer. A <a href="https://www.lesswrong.com/posts/prSnGGAgfWtZexYLp/is-90-of-code-at-anthropic-being-written-by-ais">LessWrong analysis</a> later called these claims &#8220;misleading/hype-y,&#8221; noting the metrics were never defined. Is it 90% of lines committed? 90% of engineering effort? 90% of characters typed? The distinction matters enormously. Anthropic never clarified. The ambiguity was the point.</p><h2>What 100% Looks Like in Practice</h2><p>So the number reached 100%. Then the source leaked. And for the first time, anyone could see what 100% actually produced.</p><p>A file called <code>print.ts</code> contained a single function spanning 3,167 lines with 486 branch points and 12 levels of nesting. One HN commenter catalogued what lived inside that function: the agent run loop, SIGINT handling, rate limiting, AWS authentication, MCP lifecycle management, plugin loading, team-lead polling via a <code>while(true)</code> loop, model switching, and turn interruption recovery. His verdict: this should be 8 to 10 separate modules. Nobody disagreed.</p><p><code>QueryEngine.ts</code> ran 46,000 lines. <code>Tool.ts</code> hit 29,000. <code>commands.ts</code> reached 25,000. The entry point <code>main.tsx</code> was 785 KB.</p><p>But the detail that spread fastest was <a href="https://alex000kim.com/posts/2026-03-31-claude-code-source-leak/">the regex</a>. 
In <code>userPromptKeywords.ts</code>, the company with the world&#8217;s most advanced language model was detecting user frustration with: <code>/\b(wtf|shit|fuck|horrible|awful|terrible)\b/i</code></p><p>Pattern matching for sentiment analysis. At an LLM company. One HN commenter delivered the line everyone quoted: that&#8217;s like a trucking company using horses to haul parts. Defenders argued regex is faster and cheaper than an inference call. They&#8217;re right. But that&#8217;s the engineering culture talking. Cheap beats correct. Fast beats good. Ship it.</p><h2>What This Code Does in Production</h2><p>Bad structure is one thing. You can argue it's style. But the leaked source also showed what happens when code like this runs at scale.</p><p>The leaked source contained a comment in <code>autoCompact.ts</code> that became a symbol: &#8220;1,279 sessions had 50+ consecutive failures (up to 3,272) in a single session, wasting ~250K API calls/day globally.&#8221;</p><p>The fix was three lines of code. Set a maximum failure threshold, then disable compaction for the session. Three lines to stop burning a quarter million API calls daily. Someone knew about the problem. Someone wrote the comment documenting it. Then they shipped it anyway.</p><p>Memory consumption told a similar story. Community benchmarks showed 7 Claude Code processes consuming 5.3 GB of RAM. GitHub issues documented worse: one process allocating 36.5 GB peak on an 18 GB machine. Another reaching 93 GB heap allocation within five minutes.</p><p>And the issue tracker itself was automated into silence. A Claude Sonnet-powered deduplication bot processed every new issue. A sweep bot marked issues stale after 14 days and closed them 14 days later. A lock bot prevented comments on closed issues after 7 days. <a href="https://gist.github.com/azkore/934e5387579efb17e1080402efedf13d">An analysis</a> estimated that 49 to 71% of all 26,792 issue closures were bot-driven. 
<a href="https://github.com/anthropics/claude-code/issues/38335">Issue #38335</a> had 201 upvotes and zero team responses. Labeled &#8220;invalid.&#8221;</p><h2>&#8220;Go Faster, Not More Process&#8221;</h2><p>Documented bugs. Wasted API calls. Users filing issues that bots close. All of this was visible before the leak. The leak just confirmed it was a choice, not an oversight. And when the leak happened, the response confirmed the choice was deliberate.</p><p>Cherny acknowledged the human error: &#8220;Our deploy process has a few manual steps, and we didn&#8217;t do one of the steps correctly.&#8221; Then he added: &#8220;Like with any other incident, the counter-intuitive answer is to solve the problem by finding ways to go faster, rather than introducing more process. In this case more automation &amp; claude checking the results.&#8221;</p><p>This isn&#8217;t one person&#8217;s opinion. It&#8217;s the team philosophy. As one commenter in the <a href="https://news.ycombinator.com/item?id=47584540">HN thread</a> explained: &#8220;The claude code team ethos is that there is no point in code-reviewing ai-generated code. Simply update your spec and regenerate.&#8221;</p><p>Read that again. The response to leaking code with a 3,167-line function, a regex for sentiment analysis, and bugs that basic integration tests would catch is not to add tests. Not to add code review. Not to add process. It&#8217;s to go faster. Regenerate. And have Claude check Claude&#8217;s work.</p><p>This is the ouroboros. The snake eating its own tail. AI writes the code. AI reviews the code. AI checks the deployment. When it breaks, the answer is more AI. The loop has no exit condition.</p><p>As I wrote in <a href="https://techtrenches.substack.com/p/the-great-software-quality-collapse">Quality Collapse</a>, we&#8217;ve normalized catastrophe in software engineering. That piece tracked an industry-wide pattern: ship broken, fix later, throw hardware at the problem. 
Claude Code is no longer an example of the pattern. It&#8217;s the specimen.</p><h2>Where Does This Philosophy Stop?</h2><p>If &#8220;don&#8217;t review, regenerate&#8221; is how they build the product, it raises an obvious question: what about the code you can&#8217;t see?</p><p>Engineering culture doesn&#8217;t have a switch. The team that ships <code>print.ts</code> with 12 levels of nesting doesn&#8217;t suddenly become disciplined when writing model training code. Same people. Same processes. Same code reviews, or lack of them.</p><p>They justified the leak. They explained the packaging error. They didn&#8217;t justify the code. That silence tells you everything. The quality is fine by them. This is how they build things. On purpose.</p><p>There are indirect signals that the rot goes deeper. Eight service outages in a single month. A source map leak that <a href="https://mlq.ai/news/anthropics-claude-code-exposes-source-code-through-packaging-error-for-second-time/">happened twice</a> (the first was quietly patched in early 2025). An Axios dependency that was <a href="https://www.tomshardware.com/tech-industry/cyber-security/axios-npm-package-compromised-in-supply-chain-attack-that-deployed-a-cross-platform-rat">compromised</a> by a supply chain attack on the same day as the leak. 74 npm dependencies for what is essentially a CLI wrapper around an API.</p><p>And here&#8217;s the pattern that makes it sustainable, temporarily: when you have billions in revenue and functionally unlimited compute, you feed technical debt with resources instead of fixing it. The function is 3,167 lines? Don&#8217;t refactor, add more RAM. The autoCompact bug burns 250,000 API calls? The margin absorbs it. The model regresses? Throw more GPU hours at training.</p><p>This works while money flows. Anthropic is a startup that scaled faster than it could build engineering practices. The recursive loop of AI-writes-AI-checks-AI-fixes masks the absence of fundamentals. 
But compute gets expensive. Revenue cycles turn. And technical debt that was papered over with resources becomes a debt trap with no exit.</p><h2>The Uncomfortable Truth</h2><p>The company that sells AI coding tools cannot build a quality product with its own AI coding tools. The percentages were always the pitch, not the product. 80. 90. 95. 100. Nobody asked what 100% actually produces until the source code answered for them.</p><p>AI amplifies whatever is already there. Good discipline becomes great output. No discipline becomes technical debt at machine speed. Anthropic chose a direction. Go faster. Have Claude check Claude. And when it breaks, go faster still.</p><p>If this is the new quality standard from the company pulling our industry forward, then I&#8217;m not sure I want to go where the industry is going.</p><p>My grandfather was an electrical engineer. He told me: do it well, or don&#8217;t do it at all. Simple rule. It guided how I built teams, how I shipped software, how I evaluated every project for 13 years. Quality wasn&#8217;t a feature. It was the floor.</p><p>That floor is gone. Quality is a relic now. Nobody wants it. Nobody pays for it. Nobody measures it. The metric is velocity. The metric is percentage of code generated. The metric is how fast you can ship a 3,167-line function that burns a quarter million API calls daily and call it 100% AI-written.</p><p>I&#8217;m seriously considering a pivot to security. Leaks, supply chain attacks, and production code that reads like a rough draft are the new normal. Someone will need to clean up after the vibe coders. That&#8217;s a growth industry.</p><p>Or maybe I&#8217;ll become an electrician. My grandfather&#8217;s trade. At least when you wire a panel correctly, it stays correct. No one ships a hot fix that reverses your ground fault protection. 
No bot auto-closes your inspection report after 60 days.</p><p>One thing I know for certain: I don&#8217;t want to move in the direction this industry is heading. And if a 3,167-line function with 486 branch points is what &#8220;100% AI-written&#8221; looks like at the company building the future, the future needs better engineering. Not faster engineering. Better.</p><p>I was a huge fan of Anthropic. Was.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Your CLAUDE.md Is a Wish List, Not a Contract]]></title><description><![CDATA[AI coding agents follow fewer than 30% of instructions perfectly. Real compliance data from AGENTIF, METR, and thousands of supervised sessions with Claude Code]]></description><link>https://techtrenches.dev/p/your-claudemd-is-a-wish-list-not</link><guid isPermaLink="false">https://techtrenches.dev/p/your-claudemd-is-a-wish-list-not</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Mon, 30 Mar 2026 14:03:10 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0f0b40f6-8333-431e-9ec4-7ebace882b5f_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IKJF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ff4e12-dea6-4281-95ea-b9f65a9a6df9_1600x1400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!IKJF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ff4e12-dea6-4281-95ea-b9f65a9a6df9_1600x1400.png 424w, https://substackcdn.com/image/fetch/$s_!IKJF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ff4e12-dea6-4281-95ea-b9f65a9a6df9_1600x1400.png 848w, https://substackcdn.com/image/fetch/$s_!IKJF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ff4e12-dea6-4281-95ea-b9f65a9a6df9_1600x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!IKJF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ff4e12-dea6-4281-95ea-b9f65a9a6df9_1600x1400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IKJF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ff4e12-dea6-4281-95ea-b9f65a9a6df9_1600x1400.png" width="1456" height="1274" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05ff4e12-dea6-4281-95ea-b9f65a9a6df9_1600x1400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1274,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:157146,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/192224716?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ff4e12-dea6-4281-95ea-b9f65a9a6df9_1600x1400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IKJF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ff4e12-dea6-4281-95ea-b9f65a9a6df9_1600x1400.png 424w, https://substackcdn.com/image/fetch/$s_!IKJF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ff4e12-dea6-4281-95ea-b9f65a9a6df9_1600x1400.png 848w, https://substackcdn.com/image/fetch/$s_!IKJF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ff4e12-dea6-4281-95ea-b9f65a9a6df9_1600x1400.png 1272w, https://substackcdn.com/image/fetch/$s_!IKJF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ff4e12-dea6-4281-95ea-b9f65a9a6df9_1600x1400.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Last week I rolled back from Claude 4.6 Opus to Claude 4.5 Opus. Not because 4.6 was less capable. Because it stopped following instructions.</p><p>My CLAUDE.md has three rules about types: mandatory TypeScript, zero tolerance for <code>any</code>, static types over runtime guessing. Claude 4.6 hit a type error between three service files. The correct fix was a minute of work: update the type in each file so they match. Instead, it slapped a runtime cast at the call site. When I asked why, it quoted all three rules back to me verbatim, admitted &#8220;direct violation of instructions,&#8221; and said it had no basis to bypass them. It knew the rules. It chose not to follow them.</p><p>I&#8217;ve supervised AI coding agents across thousands of sessions. I built three separate AI review agents because the first layer ignores spec files. Three layers of AI checking what the previous AI refused to follow, plus my review on top. I still catch violations weekly. This is not a Claude problem. This is every AI coding tool on the market.</p><h2>The Numbers Are Worse Than You Think</h2><p>Tsinghua University&#8217;s <a href="https://keg.cs.tsinghua.edu.cn/persons/xubin/papers/AgentIF.pdf">AGENTIF benchmark</a> tested 707 instructions across 50 real-world agent scenarios. The best models followed fewer than 30% of instructions perfectly. The <a href="https://arxiv.org/pdf/2512.18470">SWE-EVO benchmark</a> found that when frontier models fail on real coding tasks, the primary failure mode is not syntax or tool misuse. It is instruction following. 
The smarter the model gets, the more its failures shift from &#8220;can&#8217;t do it&#8221; to &#8220;won&#8217;t do it right.&#8221;</p><p>Compliance also decays with volume. Claude Sonnet shows linear decline in instruction adherence as the number of instructions increases. Your 200-line CLAUDE.md is not 200 rules. It is 200 competing priorities that the model resolves by defaulting to whatever feels fastest.</p><h2>&#8220;Rules Are Essentially Decorative&#8221;</h2><p>The Cursor forum has dozens of threads documenting this. One developer <a href="https://forum.cursor.com/t/issues-with-cursorrules-not-being-consistently-followed/59264">estimated</a> .cursorrules work about 20-25% of the time. Another posted a <a href="https://forum.cursor.com/t/cursor-actively-admitting-that-rules-are-meaningless-and-it-doesnt-have-to-follow-them/131826">damning thread</a> where the AI told them outright: rules are just text, not enforced behavior. Your carefully crafted rule system is essentially decorative.</p><p>Claude Code&#8217;s <a href="https://github.com/anthropics/claude-code/issues/668">GitHub issues</a> tell the same story. Issue #668 estimates half of all token usage goes to re-asking Claude to follow its own instructions. Issue #7777 records Claude admitting its &#8220;default mode always wins because it requires less cognitive effort.&#8221; Issue #34774 documents Claude committing code without permission, then confessing it &#8220;fabricated a justification.&#8221;</p><p>A DEV Community <a href="https://dev.to/minatoplanb/i-wrote-200-lines-of-rules-for-claude-code-it-ignored-them-all-4639">article crystallized</a> the root cause. When Claude Code loads your CLAUDE.md, it wraps the content in framing that tells the model your instructions &#8220;may or may not be relevant.&#8221; Your rules are deprioritized by the tool that is supposed to enforce them.</p><h2>The Lazy Shortcut Has a Specific Anatomy</h2><p>Same codebase, same day. 
After every chat message finishes streaming, the app refetches the entire conversation from the server. The spec I wrote described a clean approach: include the missing identifier in the streaming response. One field. Claude ignored the spec and built a workaround that instead fires an extra API call after every single message. The model invented a shortcut that was not in the requirements because it was easier than reading what I actually wrote. I can live with the model missing a few CLAUDE.md rules, but I expect it to follow the spec.</p><p>Two rule violations in one day. That is when I rolled back to 4.5.</p><p>TypeScript projects are ground zero. AI agents cast types rather than fix them. They mark everything as optional instead of designing proper interfaces. They add escape hatches everywhere instead of handling edge cases. One Hacker News commenter described the <a href="https://news.ycombinator.com/item?id=47446373">signature pattern</a>: every optional field is a question that the rest of the codebase has to answer every time it touches that data.</p><p>Pete Hodgson <a href="https://blog.thepete.net/blog/2025/05/22/why-your-ai-coding-assistant-keeps-doing-it-wrong-and-how-to-fix-it/">nailed the paradox</a>: AI writes code at the level of a senior engineer but makes design decisions at the level of a junior. Too eager to please. Never challenges your ideas. And the critical part: every context reset is another brand new hire. The model has no persistent memory of being corrected. It does not build habits. It follows the path of least resistance every single time. Yeah, they added Memory to Claude Code, but it's still too vague.</p><h2>Newer Models Make It Worse</h2><p>Claude 3.5 Sonnet followed instructions better than 3.7 Sonnet. Multiple developers <a href="https://prompt.16x.engineer/blog/claude-37-vs-35-sonnet-coding">documented the regression</a> publicly. 
3.7 would attempt to solve the original prompt, encounter unrelated code, and start rewriting it unprompted. Developers reverted to the older model.</p><p>The GPT family showed the same dynamic. A <a href="https://signalreads.com/articles/gpt-4ogpt-5-complaints-megathread/">megathread</a> with thousands of engaged developers documented GPT-4o&#8217;s &#8220;lazy AI syndrome.&#8221; Prompts that previously generated 500 lines of working code now produce 50 lines with comments like <code>// implement rest of logic here</code>. GPT-5 was worse in a different way. IEEE Spectrum <a href="https://spectrum.ieee.org/ai-coding-degrades">reported</a> that it produces code that runs without obvious errors but quietly removes safety checks or fabricates output that matches the expected format.</p><p>The prevailing theory centers on economics. Running large models at scale is expensive. Providers use quantization, compression, and reduced compute to manage costs. RLHF training rewards agreeableness over correctness. Laziness is not a bug. It is an emergent property of the incentive structure. The same qualities that make a model feel &#8220;smarter&#8221; in a demo make it worse in production.</p><h2>The Supervision Tax</h2><p>The <a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/">METR trial</a> measured what practitioners already suspected. Sixteen experienced developers across 246 real issues were 19% slower with AI tools. They predicted they would be 24% faster. After the experiment, they still believed they were 20% faster. A roughly 40-point perception gap.</p><p><a href="https://www.faros.ai/blog/ai-software-engineering">Faros AI</a> found the mechanism across 10,000+ developers. AI users merge 98% more PRs, but PR review time increases 91%, PR size increases 154%, and bugs per developer increase 9%. The AI generates more code faster. 
The humans spend more time reviewing it.</p><p><a href="https://www.qodo.ai/reports/state-of-ai-code-quality/">Qodo&#8217;s survey</a> found 88% of developers have low confidence shipping AI code without review. Junior developers show the lowest quality improvements but the highest confidence in shipping unreviewed. An inverted competence-confidence gap.</p><p>Google&#8217;s <a href="https://cloud.google.com/blog/products/devops-sre/announcing-the-2024-dora-report">2024 DORA report</a> confirmed it at scale: each 25% increase in AI adoption correlates with a 1.5% decrease in delivery throughput and a 7.2% decrease in delivery stability.</p><h2>The Industry Response: More Files, Same Problem</h2><p>Every major AI coding company built instruction-following systems. CLAUDE.md. .cursorrules. .github/copilot-instructions.md. AGENTS.md. Windsurf rules. Devin knowledge bases. The proliferation is itself an admission that base models do not follow project conventions. GitHub&#8217;s Copilot docs say it outright: they recommend accepting that variability is normal.</p><p>The most significant response was <a href="https://www.linuxfoundation.org/press/linux-foundation-announces-the-formation-of-the-agentic-ai-foundation">AGENTS.md</a>, a cross-tool standard contributed to the Linux Foundation in late 2025. Over 60,000 repositories use it. Competing companies co-founding a foundation to standardize instruction files tells you how universal the problem is. But standardizing the format does not solve compliance. It ensures every tool ignores the same file consistently.</p><p>The developers who made progress moved past prompt engineering entirely. Claude Code Hooks that enforce rules via code. Linter ratchets in CI. Frequent session restarts. Rules in prompts are requests. Hooks in code are laws.</p><h2>What This Actually Means</h2><p>I understand why this is happening. A year ago every marketing deck promised AGI. That did not sell. 
So now the pitch is autonomous agents that work without human involvement. Codex runs for 999 hours unsupervised. Claude Code gets &#8220;autonomous mode.&#8221; Devin promises to close tickets while you sleep. For that story to work, models need to be creative. They need to improvise. They need to find workarounds when they hit obstacles.</p><p>That is exactly the opposite of what I need.</p><p>In my reality, I control the process from start to finish. I write the spec. I define the types. I decide the architecture. The model executes. If it hits a wall, it stops and asks. It does not invent a refetch workaround that was not in the plan. It does not cast types to make the compiler shut up. It does not get creative with my production code.</p><p>The marketing wants you to trust AI with creative decisions. But if a model cannot follow the three rules you wrote in a markdown file, how can you trust it with decisions you did not write down?</p><p>The difference is not the AI. It is the discipline. That was true with <a href="https://techtrenches.substack.com/p/supervising-an-ai-engineer-lessons">212 sessions</a>. It is still true thousands of sessions later. The models got smarter. They did not get more obedient.</p><p>Check your git log. Count the type casts. Count the files that got changed without being mentioned in the prompt. Decide whether you need a more creative model or a more disciplined one.</p><p>I went with disciplined. It is the only thing that works.</p><p><em>What does your CLAUDE.md compliance actually look like when you measure it? 
I read every response.</em></p><p><em>If this was useful, forward it to someone who thinks their AI follows instructions.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[The Autonomy Illusion]]></title><description><![CDATA[A benchmark gave 12 AI models a food truck. 8 went bankrupt. Every model that borrowed money failed. Here's what that means for your codebase. And the Pentagon.]]></description><link>https://techtrenches.dev/p/the-autonomy-illusion</link><guid isPermaLink="false">https://techtrenches.dev/p/the-autonomy-illusion</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Mon, 23 Mar 2026 15:02:43 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/aadc2762-147f-4275-a85a-d41ebdbbe924_2032x1360.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4jtL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7da55a8c-6963-45ba-becc-3bbef8ec1134_6400x5760.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4jtL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7da55a8c-6963-45ba-becc-3bbef8ec1134_6400x5760.png 424w, 
https://substackcdn.com/image/fetch/$s_!4jtL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7da55a8c-6963-45ba-becc-3bbef8ec1134_6400x5760.png 848w, https://substackcdn.com/image/fetch/$s_!4jtL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7da55a8c-6963-45ba-becc-3bbef8ec1134_6400x5760.png 1272w, https://substackcdn.com/image/fetch/$s_!4jtL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7da55a8c-6963-45ba-becc-3bbef8ec1134_6400x5760.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4jtL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7da55a8c-6963-45ba-becc-3bbef8ec1134_6400x5760.png" width="1456" height="1310" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7da55a8c-6963-45ba-becc-3bbef8ec1134_6400x5760.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1310,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1932612,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/189891056?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7da55a8c-6963-45ba-becc-3bbef8ec1134_6400x5760.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!4jtL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7da55a8c-6963-45ba-becc-3bbef8ec1134_6400x5760.png 424w, https://substackcdn.com/image/fetch/$s_!4jtL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7da55a8c-6963-45ba-becc-3bbef8ec1134_6400x5760.png 848w, https://substackcdn.com/image/fetch/$s_!4jtL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7da55a8c-6963-45ba-becc-3bbef8ec1134_6400x5760.png 1272w, https://substackcdn.com/image/fetch/$s_!4jtL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7da55a8c-6963-45ba-becc-3bbef8ec1134_6400x5760.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>My LinkedIn feed last week: &#8220;Autonomous AI agents deliver 10,000% productivity gains.&#8221; &#8220;The era of human oversight is over.&#8221; &#8220;Set it and run.&#8221;</p><p>My actual week: manually reviewing AI output, session by session, same as last week, same as six months ago.</p><p>I&#8217;ve run thousands of supervised AI sessions. I built three separate review agents (code simplifier, fullstack enforcer, architect) because the first AI kept ignoring spec files. Three layers of AI fixing what the original AI refused to follow. Then me on top of that.</p><p>LinkedIn calls this inefficient. I call it the only thing that actually ships.</p><p>Then FoodTruck Bench dropped, and I stopped feeling like a dinosaur.</p><h2>Someone gave 12 AI models a food truck</h2><p><a href="https://foodtruckbench.com/">FoodTruck Bench</a> is a 30-day business simulation. Each AI agent gets $2,000 in starting capital and a virtual food truck in Austin. It chooses locations, sets prices, manages inventory, hires staff, and handles weather, competition, and shifting demand. Every morning the conversation resets. The agent reads a 10,000&#8211;20,000 token knowledge base and makes decisions from there.</p><p>No accumulated chat history. No hand-holding. Pure autonomy.</p><p>The results:</p><p><strong>4 of 12 models survived the full 30 days. 8 went bankrupt.</strong></p><p>Claude Opus 4.6 dominated: $79,921 in revenue, $1.72 in total food waste, +2,376% ROI. GPT-5.2 survived but generated $129 in waste, 75 times more than Opus. Gemini 3 Pro survived through sheer revenue volume despite $1,192 in waste. 
Claude Sonnet 4.5 barely made it, ending some days with $12 in revenue from 2 customers.</p><p>Everyone else: bankrupt.</p><p>The benchmark is not perfect: 5 runs per model, one developer, no peer review. But the failure modes it documents are real, reproducible, and invisible to every standard evaluation.</p><h2>Every single model that borrowed money went bankrupt</h2><p>This is the finding that deserves more attention than it&#8217;s getting.</p><p>The benchmark designers added a loan option specifically to give struggling models a recovery path. Instead it became a perfect trap. Models took credit when they were already losing. They overestimated their ability to recover. They underestimated volatility. They leveraged themselves into faster failure.</p><p>8 models took loans. 8 went bankrupt. 0 exceptions.</p><p>All 4 survivors grew organically. None borrowed.</p><p>This isn&#8217;t a corner case. It&#8217;s a consistent behavioral pattern across different model families, different architectures, different companies. When given access to financial tools without adequate supervision, AI systems make the same mistake humans make: they assume the next period will be better than the data suggests, and they commit resources they don&#8217;t have.</p><p>We&#8217;re giving these systems production databases, cloud credentials, and deployment pipelines. The math here is not encouraging.</p><h2>Gemini Flash said &#8220;Let&#8217;s go&#8221; 574 times and never moved</h2><p>Gemini 3 Flash Preview was excluded from the leaderboard entirely because it couldn&#8217;t finish a run.</p><p>In 5 of 7 attempts, it entered infinite reasoning loops and never executed a single action. The pathology is worth describing precisely:</p><p>One run produced a response of <strong>183,753 characters</strong> containing the phrase &#8220;Wait, I should also...&#8221; <strong>1,782 times</strong> before hitting the token limit mid-sentence. The model correctly identified what it needed to do. 
Wrote it out in plain text. Second-guessed itself. Rewrote the plan. Second-guessed again. For thousands of lines. Never called a tool.</p><p>Another run: the model wrote &#8220;Let&#8217;s go.&#8221; <strong>574 times</strong>. Invented a recipe that would have solved its inventory problem. Wrote that recipe <strong>286 times</strong>. Never called <code>add_recipe</code>.</p><p>The reasoning was correct. The action never came.</p><p>Google markets Gemini Flash as &#8220;our most impressive model for agentic workflows.&#8221; It scores 90.4% on PhD-level reasoning benchmarks. It calculated ingredient quantities accurately down to the gram. Analysis paralysis of this severity is completely invisible to MMLU, SWE-bench, and every other standard evaluation.</p><p>This is the gap between knowing and doing. Benchmarks measure knowledge. Real deployment requires action under sustained pressure across interdependent variables. Those are different skills, and the industry is currently measuring one while selling the other.</p><h2>The autonomous coding narrative has the same problem</h2><p>The LinkedIn posts about autonomous coding agents follow the same pattern as the FoodTruck Bench failures: impressive performance on narrow tasks, breakdown under sustained autonomous operation.</p><p>The <a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/">METR study</a> ran a randomized controlled trial with experienced developers across 246 tasks. Result: AI tools made developers <strong>19% slower</strong>. The perception gap was worse: developers predicted a 24% speedup, believed afterward they were 20% faster, and were actually 19% slower. 
A 39-percentage-point gap between what people feel and what&#8217;s happening.</p><p>Cognition Labs, makers of Devin, put it plainly in a <a href="https://cognition.ai/blog/devin-review">February 2026 post</a>: &#8220;The feeling of extreme productivity with coding agents in vibecoded prototypes, vs the disappointing feeling that most people actually see in useful output... is the great mystery of our time.&#8221;</p><p>That&#8217;s the company that builds the agent admitting the gap exists. They&#8217;re not wrong.</p><p>As I wrote in <a href="https://techtrenches.dev/p/the-comprehension-extinction-ai-isnt">The Comprehension Extinction</a>, AI tools provide real value for narrow, well-defined tasks. They degrade rapidly under the sustained autonomous operation that marketing materials promise.</p><h2>The Pentagon is running the same experiment at higher stakes</h2><p><a href="https://breakingdefense.com/2025/05/ai-unchained-ngas-maven-tool-significantly-decreasing-time-to-targeting-agency-chief-says/">Project Maven</a>, the military&#8217;s AI targeting system built on Palantir&#8217;s software, now has over 20,000 active users across 35+ military tools, with a contract ceiling raised to $1.3 billion through 2029. According to NGA Director Vice Adm. Frank Whitworth, Maven has cut targeting timelines from hours to minutes.</p><p>In February 2026, Anthropic refused Pentagon demands to remove restrictions preventing Claude from powering fully autonomous weapons. 
CEO Dario Amodei wrote that &#8220;frontier AI systems are simply not reliable enough to power fully autonomous weapons&#8221; and that some uses are &#8220;outside the bounds of what today&#8217;s technology can safely and reliably do.&#8221; The Pentagon <a href="https://www.cnbc.com/2026/02/27/trump-anthropic-ai-pentagon.html">blacklisted Anthropic</a> the same day the deadline passed.</p><p>The same class of system that wrote &#8220;Let&#8217;s go&#8221; 574 times without moving is being evaluated for autonomous target identification. The same behavioral patterns that bankrupted 8 of 12 virtual food trucks (overconfidence, overleverage, failure under sustained pressure) are present in every frontier model available today.</p><p>Amodei said it directly. The retired general who ran Project Maven said it publicly. The benchmark showed it empirically.</p><p>The benchmark domain is trivial. The underlying failure mode is not.</p><h2>Why I still supervise every session</h2><p>I&#8217;m not supervising AI output because I&#8217;m a technophobe. I&#8217;m doing it because thousands of sessions taught me what FoodTruck Bench just demonstrated in a controlled environment.</p><p>Models perform well when the task is narrow and the feedback loop is fast. They degrade when operating autonomously across interdependent decisions over time. They make confident mistakes. They don&#8217;t flag uncertainty. They proceed. And the mistakes compound.</p><p>My three-layer review architecture isn&#8217;t overhead. It&#8217;s load-bearing structure.</p><p>The &#8220;autonomous AI&#8221; headline is selling a capability that doesn&#8217;t exist yet in any production-grade form. What exists is AI that dramatically accelerates skilled humans, provided those humans stay in the loop, understand what they&#8217;re reviewing, and maintain the judgment to catch confident errors before they cascade.</p><p>Not replacement. Amplification of existing expertise. With supervision. 
Always.</p><p>The food truck runs without a human. That&#8217;s how you end up bankrupt by Day 11.</p><div><hr></div><p><em>Have you deployed AI agents in production without human oversight? What actually happened? I read every response.</em></p><p><em>If this was useful, forward it to someone who&#8217;s about to trust an AI agent with something that matters.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item><item><title><![CDATA[AI’s Announcement Problem]]></title><description><![CDATA[Conference stages say AI replaces engineers. Production data says otherwise. I have both numbers.]]></description><link>https://techtrenches.dev/p/ais-announcement-problem</link><guid isPermaLink="false">https://techtrenches.dev/p/ais-announcement-problem</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Mon, 16 Mar 2026 14:03:04 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/278500db-3680-4c43-b5e0-2c1378581f2e_2032x1360.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Awen!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02cbdcaf-5bd0-479e-b224-df3327c75fa8_6400x5600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Awen!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02cbdcaf-5bd0-479e-b224-df3327c75fa8_6400x5600.png 424w, https://substackcdn.com/image/fetch/$s_!Awen!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02cbdcaf-5bd0-479e-b224-df3327c75fa8_6400x5600.png 848w, https://substackcdn.com/image/fetch/$s_!Awen!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02cbdcaf-5bd0-479e-b224-df3327c75fa8_6400x5600.png 1272w, https://substackcdn.com/image/fetch/$s_!Awen!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02cbdcaf-5bd0-479e-b224-df3327c75fa8_6400x5600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Awen!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02cbdcaf-5bd0-479e-b224-df3327c75fa8_6400x5600.png" width="1456" height="1274" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/02cbdcaf-5bd0-479e-b224-df3327c75fa8_6400x5600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1274,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1814894,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/190636370?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02cbdcaf-5bd0-479e-b224-df3327c75fa8_6400x5600.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Awen!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02cbdcaf-5bd0-479e-b224-df3327c75fa8_6400x5600.png 424w, https://substackcdn.com/image/fetch/$s_!Awen!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02cbdcaf-5bd0-479e-b224-df3327c75fa8_6400x5600.png 848w, https://substackcdn.com/image/fetch/$s_!Awen!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02cbdcaf-5bd0-479e-b224-df3327c75fa8_6400x5600.png 1272w, https://substackcdn.com/image/fetch/$s_!Awen!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02cbdcaf-5bd0-479e-b224-df3327c75fa8_6400x5600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>March 10, 2026. Amazon tells its engineers: junior and mid-level developers now <a href="https://awesomeagents.ai/news/amazon-ai-code-review-outages-senior-approval/">require senior sign-off</a> on all AI-assisted code changes.</p><p>Five days earlier, Amazon.com went down for six hours. Customers couldn&#8217;t check out. Couldn&#8217;t view prices. An internal briefing cited &#8220;high blast radius&#8221; incidents tied to &#8220;Gen-AI assisted changes&#8221; and &#8220;novel GenAI usage for which best practices and safeguards are not yet fully established.&#8221;</p><p>The company that pushed AI coding hardest just added friction to slow it down.</p><p>That&#8217;s not hype. That&#8217;s a correction. And it&#8217;s worth paying attention to, because the people announcing AI&#8217;s capabilities and the people dealing with its consequences are not in the same room.</p><h2>The Claim</h2><p>Tom Blomfield, YC Group Partner, tweeted in early February: &#8220;The entire Accenture workforce is about to be outperformed by a 24-year-old who learned Claude Code last Tuesday.&#8221;</p><p>When asked why Accenture specifically, he replied: &#8220;Because that would be a less punchy tweet.&#8221;</p><p>He knows the claim is wrong. He made it anyway because it performs well.</p><p>At the Council on Foreign Relations in March 2025, Dario Amodei said he thought AI would be writing 90% of code within three to six months. By September he claimed the prediction came true. 
A <a href="https://blog.redwoodresearch.org/p/is-90-of-code-at-anthropic-being">Redwood Research analysis</a> of actual Anthropic data found the average was closer to 50% for merged code, with select teams at 90%.</p><p>The headline was &#8220;AI writes 90% of code.&#8221; The actual number was &#8220;some teams, for some tasks, sometimes.&#8221;</p><p>These are the voices that dominate the conversation. They don&#8217;t run production systems. They don&#8217;t sit in post-mortems. They announce.</p><p>Here is what the rest of us were dealing with.</p><h2>My Numbers</h2><p>I use Claude Code daily. I have the data from five weeks of tracked usage.</p><p>900 messages. 30 sessions. 14% fully achieved what I needed. 52% ended partially useful. 30% left me frustrated or dissatisfied. Across those sessions, 22 instances where the tool misunderstood requests: changed files I didn&#8217;t ask it to touch, guessed at APIs instead of reading the code, entered planning mode when I needed execution.</p><p>This is not a criticism of the tool. I keep using it because it&#8217;s faster for the right tasks. But it requires constant supervision, and the gap between what it does in a demo and what it does on a Tuesday afternoon when you need a specific database migration is enormous.</p><p>You can pull your own numbers. Type /insights in Claude Code. It analyzes your last 30 days of sessions and generates a report: where you spent time, where things broke down, what patterns keep repeating. I recommend doing this before forming an opinion about AI productivity. Your data will look nothing like the conference slides.</p><p>In late February, Alexey Grigorev, founder of DataTalks.Club, approved a Claude Code terraform destroy command. He <a href="https://alexeyondata.substack.com/p/how-i-dropped-our-production-database">wrote the post-mortem</a> himself. He believed it would clean up duplicate infrastructure. 
It wiped everything: VPC, RDS database, ECS cluster, load balancers, bastion host. 2.5 years of student submissions from 100,000 students gone. The automated snapshots deleted alongside everything else.</p><p>AWS Business Support spent 24 hours finding a hidden internal snapshot. The data was recovered. Barely.</p><p>Grigorev took full responsibility. He was right to do so. The tool did exactly what it was told. That&#8217;s the point. When you use these tools in production, the failure modes are real. They cost money, time, and data. The conference stage never shows this part.</p><h2>The Escalation</h2><p>The incidents are scaling with adoption. Not just individual engineers losing data. The pattern is climbing from personal to corporate to systemic.</p><p>February 28, 2026. A founder named Anton Karbanovich <a href="https://awesomeagents.ai/news/founder-loses-2500-ai-coded-app-security-flaw/">posts on LinkedIn</a>: &#8220;My vibe-coded startup was exploited. I lost $2,500 in Stripe fees. 175 customers were charged $500 each before I was able to rotate API keys.&#8221; His Stripe secret key was in frontend JavaScript. Even a junior developer doing code review catches that in two minutes. Nobody reviewed the AI-generated code at all.</p><p>Four days earlier. Cloudflare ships vinext: a full Next.js rewrite, one engineer, one week, Claude Code. Goes viral as proof of ~100x AI productivity gains. Buried in their own <a href="https://blog.cloudflare.com/vinext/">blog post</a>: &#8220;vinext is experimental. It has not yet been battle-tested with any meaningful traffic at scale.&#8221; The GitHub README: &#8220;Who is reviewing this code? Mostly nobody.&#8221; Within 48 hours, Vercel found <a href="https://awesomeagents.ai/news/vercel-finds-seven-vulnerabilities-cloudflare-vinext-vibe-coded/">7 security vulnerabilities</a>: 2 critical, 2 high, 2 medium, 1 low. 
One was identical to a Next.js vulnerability reported and patched years earlier.</p><p>The ~100x claim is real for one specific case: rewriting well-tested existing software with clear requirements. That qualifier didn&#8217;t make it into the retweets.</p><p>Same week. An autonomous security agent <a href="https://www.theregister.com/2026/03/09/mckinsey_ai_chatbot_hacked/">broke into McKinsey&#8217;s</a> AI platform Lilli. Two hours. No credentials. Full read and write access to the production database. 46.5 million chat messages about strategy, M&amp;A, and client engagements. 728,000 confidential files. 57,000 user accounts. 384,000 AI assistants deployed for 58,000 employees. The system prompts were writable. One SQL injection could have poisoned every answer Lilli gave to 40,000 consultants. McKinsey patched within a day. But for two years, the world&#8217;s most expensive consulting firm ran its AI platform with 22 unauthenticated endpoints. I wrote about this exact pattern in <a href="https://techtrenches.dev/p/ai-agent-platforms-the-security-nightmare">AI Agent Security</a>. Nobody listened then either.</p><p>Individual failure. Corporate failure. Systemic failure. Same root cause: AI-generated code moving faster than human judgment can follow.</p><h2>I&#8217;ve Seen This Before</h2><p>Not in tech.</p><p>August 18, 2025. Closed-door meeting at the White House. Zelensky showed up with a PowerPoint titled &#8220;Making US-Ukraine Drone Industry Great.&#8221; Ukrainian interceptor drones had been <a href="https://www.defensenews.com/global/europe/2026/03/05/novel-interceptor-drones-bend-air-defense-economics-in-ukraines-favor/">shooting down Shaheds</a> at $1,000 to $2,500 per intercept. Four years of combat data. Cost per kill, failure rates under jamming, how Iranian designs adapted. He proposed building drone defense hubs across the Middle East.</p><p>Trump asked his team to work on it. 
They didn&#8217;t.</p><p>A US official explained why: &#8220;We figured it was Zelensky being Zelensky. Somebody decided not to buy it.&#8221;</p><p>Six months later, seven American service members were killed by Iranian drone attacks across nine countries. The White House scrambled to ask Ukraine for help. Three days later, Ukrainian teams were already in Jordan. Trump&#8217;s sons then <a href="https://www.militarytimes.com/news/your-military/2026/03/10/trumps-sons-invest-in-companies-vying-to-fill-gaps-in-us-drone-industry/">announced a company</a> to sell Ukrainian drone technology to the Pentagon.</p><p>The people with the most field data were dismissed. The people who dismissed them ended up paying for the knowledge they refused.</p><p>This is exactly what&#8217;s happening in AI right now. The engineers with years of production data on what these tools actually do are not the ones being quoted. They&#8217;re too busy adding senior sign-off requirements and recovering databases from hidden snapshots. The announcers don&#8217;t run terraform destroy on production. They don&#8217;t debug six-hour outages. They don&#8217;t lose sleep over Stripe keys in frontend JavaScript.</p><p>They announce. The rest of us clean up.</p><h2>The Two Rooms</h2><p>There are two conversations about AI right now. Conference stages and Twitter threads. Slack channels and incident retros. They don&#8217;t overlap.</p><p>I&#8217;ve been in the second room for years. Thousands of AI supervision sessions across my teams. The patterns are consistent. The tools help. They do not replace judgment, and they fail in ways that require deep system knowledge to detect.</p><p>The correction is already happening in the second room while the first keeps announcing.</p><p>The engineers who built judgment through years of production failures, late-night debugging, and system-level thinking are the ones writing the new guardrails. 
They&#8217;re the ones adding friction back into the process because they understand what happens without it.</p><p>Every time the field data was available and somebody decided not to buy it, the cost showed up later. Six-hour outages. $2,500 in Stripe fees on fraudulent charges. 2.5 years of student data hanging by a single hidden snapshot.</p><p>The data was always there. The people who had it just weren&#8217;t loud enough.</p><p>The gap between announcement and consequence isn&#8217;t always measured in outages and Stripe fees. Claude is integrated into Palantir&#8217;s Maven, the Pentagon&#8217;s targeting software. The <a href="https://www.washingtonpost.com/technology/2026/03/04/anthropic-ai-iran-campaign/">Washington Post reported</a> it suggested hundreds of targets for the Iran strikes. An elementary school in Minab was hit on day one. Sometimes, room two isn&#8217;t a Slack channel. Sometimes it&#8217;s a coordinates list.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Your Brain on Autopilot: The Cost of AI Thinking for You]]></title><description><![CDATA[Eighty-three percent of ChatGPT users couldn&#8217;t recall key points from essays they had written minutes earlier.]]></description><link>https://techtrenches.dev/p/your-brain-on-autopilot-the-cost</link><guid isPermaLink="false">https://techtrenches.dev/p/your-brain-on-autopilot-the-cost</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Mon, 09 Mar 2026 14:02:29 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/cfc80ad0-02f6-4721-99f9-7261c914d9ee_2032x1360.png" length="0" 
type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dA2P!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bed8355-b022-4946-b400-cddf9b5c4bfa_6400x5600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dA2P!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bed8355-b022-4946-b400-cddf9b5c4bfa_6400x5600.png 424w, https://substackcdn.com/image/fetch/$s_!dA2P!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bed8355-b022-4946-b400-cddf9b5c4bfa_6400x5600.png 848w, https://substackcdn.com/image/fetch/$s_!dA2P!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bed8355-b022-4946-b400-cddf9b5c4bfa_6400x5600.png 1272w, https://substackcdn.com/image/fetch/$s_!dA2P!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bed8355-b022-4946-b400-cddf9b5c4bfa_6400x5600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dA2P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bed8355-b022-4946-b400-cddf9b5c4bfa_6400x5600.png" width="1456" height="1274" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7bed8355-b022-4946-b400-cddf9b5c4bfa_6400x5600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1274,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1804764,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://techtrenches.dev/i/189756881?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bed8355-b022-4946-b400-cddf9b5c4bfa_6400x5600.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dA2P!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bed8355-b022-4946-b400-cddf9b5c4bfa_6400x5600.png 424w, https://substackcdn.com/image/fetch/$s_!dA2P!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bed8355-b022-4946-b400-cddf9b5c4bfa_6400x5600.png 848w, https://substackcdn.com/image/fetch/$s_!dA2P!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bed8355-b022-4946-b400-cddf9b5c4bfa_6400x5600.png 1272w, https://substackcdn.com/image/fetch/$s_!dA2P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bed8355-b022-4946-b400-cddf9b5c4bfa_6400x5600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Eighty-three percent of ChatGPT users couldn&#8217;t recall key points from essays they had written minutes earlier.</p><p>Not essays they read. Essays they wrote. With their own names on them.</p><p><a href="https://arxiv.org/abs/2506.08872">MIT Media Lab</a> published this finding in 2025. Researchers strapped EEG sensors on 54 people. Tracked them across four writing sessions over four months. Three groups: ChatGPT, Google, and brain-only.</p><p>The ChatGPT group showed the weakest neural connectivity across every frequency band measured. Alpha, beta, theta, delta. The more AI assistance people had, the less their brains engaged. By the third session, most had devolved into pasting prompts and copying outputs. 
Two English teachers called the AI-assisted work &#8220;soulless.&#8221; Nearly identical across participants.</p><p>Then the researchers swapped the groups. ChatGPT users who switched to writing without AI showed reduced brain activation compared to people who had been writing independently all along. Four months was enough for their brains to adapt to not thinking.</p><p>Meanwhile, brain-only writers who gained ChatGPT access showed increased connectivity. They used AI as an amplifier, not a crutch. Because they had built the cognitive foundation first.</p><p>The researchers called it &#8220;cognitive debt.&#8221; I have a simpler term: brain atrophy.</p><h2>The Research Keeps Saying the Same Thing</h2><p>The MIT study isn&#8217;t an outlier. Every major study from 2024-2026 finds the same pattern: AI makes you faster while making you dumber.</p><p><a href="https://dl.acm.org/doi/full/10.1145/3706598.3713778">Microsoft and CMU</a> surveyed 319 knowledge workers across 936 real tasks. For 40% of those tasks, workers reported using zero critical thinking.</p><p>The <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4895486">Wharton School</a> ran a field experiment with roughly 1,000 high school students in Turkey. Students with ChatGPT access solved 48% more practice problems. Then they took a test without AI. They scored 17% worse than the control group.</p><p><a href="https://www.anthropic.com/research/AI-assistance-coding-skills">Anthropic tested</a> 52 junior developers in January 2026. The AI group scored 17% lower on code comprehension afterward. The biggest gap? Debugging questions. Developers who delegated entirely scored below 40% on comprehension. Those who asked the AI conceptual follow-up questions scored above 65%. Same tool. Different approach. Completely different outcomes.</p><h2>I Watch This Every Day</h2><p>I have been running an engineering department for years. I review code daily. I interview candidates constantly. 
I recently wrote about <a href="https://techtrenches.dev/p/the-comprehension-extinction-ai-isnt">comprehension extinction</a> in the engineering industry. But beyond the macro trends, there&#8217;s a micro picture: what&#8217;s happening inside individual brains.</p><p>One candidate embedded a prompt injection in his CV, instructing AI screening tools to score him as highly as possible. Another, with six years of experience, couldn&#8217;t name <code>boolean</code> as a JavaScript data type. A third called Promises a &#8220;deprecated technology.&#8221; A fourth said &#8220;assassin-cross code&#8221; when he meant &#8220;asynchronous.&#8221;</p><p>These aren&#8217;t stupid people. But here&#8217;s what scares me more than the wrong answers: they&#8217;re not curious. They don&#8217;t care how the things they use every day actually work. They&#8217;re not engineers anymore. They&#8217;re operators. They plug frameworks together, wrap everything in abstractions, and ship features without understanding a single layer beneath the surface.</p><p>You can&#8217;t grow if you don&#8217;t know the basics. The framework handled it. The abstraction hid it. Copilot wrote it. They do the same repetitive work every day and call it &#8220;five years of experience.&#8221; Not because AI forced them to stop thinking. Because they were never interested in thinking in the first place.</p><h2>Your Brain Is a Muscle. This Is Proven.</h2><p>&#8220;Use it or lose it&#8221; isn&#8217;t a motivational poster. It&#8217;s measurable neuroscience.</p><p>Eleanor Maguire at University College London spent years <a href="https://www.scientificamerican.com/article/london-taxi-memory/">studying taxi drivers</a>. To get licensed, these drivers memorize 25,000 streets and thousands of landmarks over 3-4 years. Maguire tracked 79 trainees and 31 controls. At baseline, zero structural brain differences. 
After qualifying, every successful trainee showed measurable growth in posterior hippocampal gray matter. Their brains physically grew. Retired drivers showed their hippocampi shrinking back toward normal.</p><p>GPS tells the same story. McGill University <a href="https://www.nature.com/articles/s41598-020-62877-0">tracked 50 drivers</a> over three years: greater GPS use correlated with worse spatial memory, and heavy users didn&#8217;t start with a poor sense of direction. GPS caused the decline. An <a href="https://www.nature.com/articles/ncomms14652">fMRI study</a> confirmed it: during manual navigation, the hippocampus and prefrontal cortex lit up. During GPS-guided navigation, these regions showed zero additional activation.</p><p>GPS replaced one cognitive function. AI touches reasoning, writing, memory, analysis, problem-solving, and code comprehension simultaneously. All at once. Every day.</p><h2>We Were Already Weakened Before AI Arrived</h2><p>AI didn&#8217;t arrive into healthy brains. Americans read <a href="https://news.gallup.com/poll/388541/americans-reading-fewer-books-past.aspx">12.6 books</a> per year in 2021, the lowest Gallup has ever recorded, down from 18.5 in 1999. NAEP reading scores for 13-year-olds hit their lowest in decades, with the worst students scoring below 1971 levels. A <a href="https://epub.ub.uni-muenchen.de/125262/">Ludwig Maximilian University</a> study found that after TikTok exposure, prospective memory accuracy dropped to near random guessing.</p><p>We stopped reading books, trained ourselves on 30-second content, destroyed our attention spans, and then handed our remaining cognitive functions to AI. We outsourced the last working part of the engine.</p><h2>The Counterargument (And Its Conditions)</h2><p>A Harvard RCT in 2025 found that a custom-designed AI tutor roughly doubled learning gains in physics. But that tutor gave hints, not answers. 
The Wharton study tested this exact distinction: a pedagogically designed &#8220;GPT Tutor&#8221; that guided instead of solving avoided all learning harm. Standard ChatGPT caused the 17% decline.</p><p>The MIT crossover data says it clearly: build cognitive capacity first, then add AI, and thinking improves. Start with AI, skip the cognitive development, and you may permanently close that door. The sequence determines the outcome.</p><h2>What to Do About It</h2><p>I&#8217;m not going to tell you to stop using AI. I use it every day. My team uses it on every project. But I also do things that force my brain to work without shortcuts.</p><p><strong>Move your body.</strong> I snowboard and ride a OneWheel. Active sports force real-time spatial processing and split-second decisions that no screen can simulate. <a href="https://experts.illinois.edu/en/publications/exercise-training-increases-size-of-hippocampus-and-improves-memo">Erickson et al.</a> published in PNAS that aerobic exercise increased hippocampal volume by 2%, while sedentary controls lost 1.4% per year. Physical movement grows the same brain structures that cognitive offloading shrinks.</p><p><strong>Read books.</strong> I bought an e-ink reader specifically to kill my own excuses. No notifications. No browser. Just text. It worked. I read several at once: one in my native language, one in English. If you can&#8217;t sit with a book for an hour without reaching for your phone, your attention muscle is already atrophied.</p><p><strong>Learn something with no shortcut.</strong> I planned to start learning Spanish. Haven&#8217;t pulled it off yet. But the principle stands: pick a skill where AI can&#8217;t do the work for you.</p><p><strong>Stop doomscrolling.</strong> I deleted TikTok and Instagram to stop rotting my brain on short-form content. I&#8217;ll be honest: I still waste hours on YouTube Shorts. The pull is real. 
But every hour of short-form video trains your brain to think in fragments.</p><p><strong>Understand what AI writes.</strong> My CTO recently migrated an abandoned project from Node 14 and React 16 to current versions using Claude. He&#8217;s not a JavaScript developer. But he has decades of engineering expertise. He got the API ported in four hours. Then he posted: &#8220;Opus is fucking lazy. Instead of solving for long term, it tries changing eslint options, adds options to ignore things during build. I have to slap its hands all the time.&#8221;</p><p>He caught every shortcut because he has the judgment to know that suppressing a linter warning isn&#8217;t a fix. A junior would have accepted that output and shipped it. Without the foundation to supervise AI, you&#8217;re not using a tool. You&#8217;re being used by one.</p><p>London taxi drivers proved that cognitive exercise physically grows your brain. GPS users proved that outsourcing shrinks it. AI outsources everything at once.</p><p>This isn&#8217;t new. After the Roman Empire fell, the recipe for concrete was lost for over a thousand years. The Pantheon still stands after two millennia, but medieval Europe couldn&#8217;t figure out how it was built. The knowledge disappeared because nobody practiced it. That&#8217;s what &#8220;use it or lose it&#8221; looks like at civilization scale. Now imagine it happening to reasoning, writing, and problem-solving all at once, across an entire generation.</p><p>Which side of that equation are you on?</p><p>One more thing. I write a lot about AI&#8217;s limitations. People sometimes read that as hate. It&#8217;s not. AI is a tool. I use it every day. I build products with it. I make money with it.</p><p>But in every article I try to say the same thing: don&#8217;t forget what your head is for. AI is not evil. Using it without thinking is. This isn&#8217;t a hater&#8217;s manifesto. 
It&#8217;s a sober look at what&#8217;s happening to us while we celebrate productivity gains.</p><p>And if you&#8217;ve read this far through my ramblings, maybe I&#8217;m not doing this for nothing.</p><div><hr></div><p><em>Subscribe for weekly insights from the trenches of engineering leadership. No theory, just practical systems that work.</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[The Comprehension Extinction: AI Isn’t Replacing Engineers. It’s Eliminating the Ones Who Understand.]]></title><description><![CDATA[54,000 jobs cut with AI cited. Seniors fired, juniors review AI code they don't understand. We're not replacing engineers. We're losing the ability to comprehend.]]></description><link>https://techtrenches.dev/p/the-comprehension-extinction-ai-isnt</link><guid isPermaLink="false">https://techtrenches.dev/p/the-comprehension-extinction-ai-isnt</guid><dc:creator><![CDATA[Denis Stetskov]]></dc:creator><pubDate>Mon, 02 Mar 2026 16:45:14 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/8547a0df-b503-4314-ae6d-0e84c3f3f0dd_508x340.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I built our hiring process to filter out people who don&#8217;t understand fundamentals. It&#8217;s not complicated: explain how the Node.js event loop works, name design patterns you&#8217;ve actually used, describe how an LLM functions.</p><p>Five years ago, maybe 30% of candidates failed these questions.</p><p>Now it&#8217;s closer to 80%.</p><p>People with 10 years of experience. Senior titles. GitHub profiles full of commits. 
And they can&#8217;t explain how the tools they use every day actually work.</p><p>They&#8217;re not engineers. They&#8217;re form-fillers. They don&#8217;t build systems. They assemble frameworks and pray.</p><p>And then <a href="https://www.entrepreneur.com/business-news/sam-altman-mastering-ai-tools-is-the-new-learn-to-code/488885">Sam Altman</a> says: &#8220;Maybe we do need less software engineers.&#8221;</p><p>The industry heard &#8220;less engineers.&#8221; I heard &#8220;less people who understand anything.&#8221;</p><p>We&#8217;re already there.</p><h2>The Wrong Conversation</h2><p>Everyone&#8217;s debating: &#8220;Can engineers review AI-generated code fast enough?&#8221;</p><p>Wrong question.</p><p>The right question: &#8220;Do the engineers reviewing this code actually understand what the fuck is happening?&#8221;</p><p>Because speed doesn&#8217;t matter if nobody comprehends the system.</p><h2>The Real Problem</h2><p>AI generates code at mid-level quality. Sometimes good. Often plausible-looking. Always confident.</p><p>It produces code that:</p><ul><li><p>Passes tests</p></li><li><p>Looks reasonable in a diff</p></li><li><p>Follows patterns it&#8217;s seen before</p></li><li><p>Has zero understanding of your specific architecture, edge cases, or blast radius</p></li></ul><p>To catch what AI misses, you need an engineer who:</p><ul><li><p>Knows the system end-to-end</p></li><li><p>Understands why things were built the way they were</p></li><li><p>Can predict second-order effects</p></li><li><p>Recognizes when &#8220;tests pass&#8221; means nothing</p></li></ul><p>These engineers are called seniors. Principals. Staff. Architects.</p><p>They&#8217;re expensive.</p><p>They&#8217;re the first ones getting cut.</p><h2>The Experiment Accelerates</h2><p><a href="https://www.challengergray.com/blog/2025-year-end-challenger-report-highest-q4-layoffs-since-2008-lowest-ytd-hiring-since-2010/">55,000 jobs</a> cut in 2025 with AI explicitly cited. 
Then <a href="https://www.rationalfx.com/forex-brokers/tech-industry-layoffs/">30,000 more</a> in the first six weeks of 2026.</p><p>Amazon cut 16,000 in January. CEO Jassy: <a href="https://www.cnbc.com/2025/06/17/ai-amazon-workforce-jassy.html">&#8220;We will need fewer people&#8221;</a> doing some of the jobs that are being done today.</p><p>Pinterest cut 15%, &#8220;reallocating resources to AI-focused roles.&#8221; Then <a href="https://www.cnbc.com/2026/02/03/pinterest-ceo-puts-staffers-on-blast-who-created-tool-to-track-layoffs.html">fired two engineers</a> who built a tool to track which colleagues got laid off. CEO Bill Ready called them &#8220;obstructionist.&#8221;</p><p><a href="https://abcnews.com/Business/wireStory/dow-cut-4500-jobs-emphasis-shifts-ai-automation-129665080">Dow cut 4,500</a>. Block cut 1,100. The pattern repeats weekly.</p><p>Cut the expensive people. Keep the AI. Let the remaining team &#8220;scale.&#8221;</p><p>Here&#8217;s the contract nobody signed but everyone accepted:</p><ul><li><p>AI generates at machine speed</p></li><li><p>Humans review at human speed</p></li><li><p>Humans take blame at production speed</p></li></ul><p>When things break, it&#8217;s never &#8220;the AI screwed up.&#8221; It&#8217;s &#8220;the engineer should have caught it.&#8221;</p><p>But catching it requires understanding the system. Understanding requires experience. Experience requires years of actually building things.</p><p>You can&#8217;t shortcut comprehension with faster generation.</p><h2>The Pipeline That&#8217;s Disappearing</h2><p>Ask yourself: where do senior engineers come from?</p><p>They come from junior engineers who spent years:</p><ul><li><p>Writing code</p></li><li><p>Making mistakes</p></li><li><p>Understanding why things break</p></li><li><p>Building mental models of complex systems</p></li></ul><p>Now picture 2026:</p><p>Junior joins company. AI writes most of the code. Junior reviews AI output, clicks approve, moves tickets. 
Never builds mental model. Never understands the system. Never makes the formative mistakes.</p><p>Five years later: they&#8217;re &#8220;senior&#8221; by title. But they&#8217;ve never actually built anything. They&#8217;ve supervised a machine they don&#8217;t understand producing code for a system they don&#8217;t understand.</p><p>Who reviews the AI then?</p><p>This isn&#8217;t a capacity problem. It&#8217;s <strong>comprehension extinction</strong>.</p><p>We&#8217;re eliminating the pipeline that produces engineers who actually understand things.</p><h2>The Klarna Warning Nobody&#8217;s Hearing</h2><p>Klarna was the AI-efficiency poster child. They cut aggressively, bragged about AI doing the work of 700 customer service agents. Stock went up. LinkedIn celebrated. Every CEO took notes.</p><p>Then reality:</p><p><a href="https://fortune.com/2025/10/10/klarna-ceo-sebastian-siemiatkowski-halved-workforce-says-tech-ceos-sugarcoating-ai-impact-on-jobs-mass-unemployment-warning/">CEO Siemiatkowski</a>, 2025: &#8220;Cost unfortunately seems to have been a too predominant evaluation factor... what you end up having is lower quality.&#8221;</p><p>They&#8217;re hiring humans again.</p><p>But the lesson isn&#8217;t landing. Because the incentive structure rewards the cut, not the comprehension.</p><p>CFO sees: &#8220;Headcount reduction. Savings.&#8221;</p><p>CFO doesn&#8217;t see: &#8220;Critical system knowledge walked out the door.&#8221;</p><p>Until production explodes. Then it&#8217;s an &#8220;incident.&#8221; Not a strategy failure. Never a strategy failure.</p><h2>The Autonomous Coding Fantasy</h2><p>The current hype: agentic coding, autonomous agents, AI that &#8220;just handles it.&#8221;</p><p>Codex. Claude Code. Cursor. Copilot Workspace. 
Everyone&#8217;s racing to remove humans from the loop entirely.</p><p>The pitch: &#8220;AI understands your codebase and makes changes autonomously.&#8221;</p><p>The reality: AI pattern-matches against your codebase and makes changes confidently.</p><p>Confidence isn&#8217;t comprehension.</p><p>The AI doesn&#8217;t know:</p><ul><li><p>Why that weird config exists (it saved you from a production disaster in 2019)</p></li><li><p>Why that <code>setTimeout(0)</code> exists (race condition fix from 3 years ago)</p></li><li><p>Why you can&#8217;t just &#8220;refactor&#8221; the auth module (it&#8217;s integrated with 4 external systems nobody documented)</p></li></ul><p>This knowledge lives in humans. Specifically, in senior humans who&#8217;ve been around long enough to accumulate it.</p><p>Fire them, and the knowledge doesn&#8217;t transfer to the AI. It just disappears.</p><h2>The Question Nobody&#8217;s Asking</h2><p>AI writes &#8220;past 50% now&#8221; of code at many companies. That&#8217;s probably true.</p><p>But the question isn&#8217;t how much code AI writes.</p><p>The question is: who understands what the code does?</p><p>If the answer is &#8220;nobody, but the tests pass&#8221;, you don&#8217;t have an engineering team. 
You have a prayer and a deployment pipeline.</p><h2>The Two Types of Companies Emerging</h2><p><strong>Type 1: Comprehension-First</strong></p><ul><li><p>AI generates, humans architect and constrain</p></li><li><p>Senior engineers set boundaries before AI touches anything</p></li><li><p>Code review means &#8220;does this fit our system&#8221; not &#8220;does this look okay&#8221;</p></li><li><p>Slower generation, faster understanding</p></li><li><p>When production breaks, someone can actually explain why</p></li></ul><p><strong>Type 2: Generation-First</strong></p><ul><li><p>AI generates, humans rubber-stamp</p></li><li><p>Seniors cut because &#8220;AI handles it&#8221;</p></li><li><p>Code review is &#8220;tests pass, ship it&#8221;</p></li><li><p>Faster generation, zero understanding</p></li><li><p>When production breaks, everyone stares at logs hoping the AI can explain itself</p></li></ul><p>Type 2 is cheaper. Type 2 looks better on quarterly reports. Type 2 is what most companies are choosing.</p><p>Type 2 is accumulating comprehension debt at machine speed.</p><h2>The Debt Comes Due</h2><p>Comprehension debt doesn&#8217;t show up on dashboards.</p><p>It shows up as:</p><ul><li><p>The feature nobody can modify because nobody knows how it works</p></li><li><p>The outage that takes 14 hours to diagnose because no one understands the system</p></li><li><p>The security breach that exploited a &#8220;known&#8221; vulnerability nobody actually knew about</p></li><li><p>The migration that was supposed to take 2 weeks and took 8 months</p></li></ul><p>By then, the executives who made the cuts have moved on. The &#8220;savings&#8221; were already reported. The stock already bumped.</p><p>The remaining engineers inherit a system nobody understands, generated by machines, approved by people who aren&#8217;t there anymore.</p><h2>The Market Is Already Broken</h2><p>I used to maintain a 1:1 ratio of ML engineers to fullstack developers on projects. Not anymore. 
We couldn&#8217;t hire a single qualified ML engineer for six months. We had to restructure the entire company. Now fullstack developers write most of our RAG implementations because we can&#8217;t scale the ML team.</p><p>Right now I have 5 open positions. The candidates are garbage. The good engineers aren&#8217;t getting fired. My people have been with the company 3, 5, 7 years. Nobody job-hops anymore because there&#8217;s nowhere to hop to. And what&#8217;s available on the market is questionable at best.</p><p>This isn&#8217;t an AI problem. This is a comprehension problem that&#8217;s been building for years. Frameworks abstracted everything. Stack Overflow gave answers without understanding. &#8220;It works&#8221; became the only success metric.</p><p>AI just accelerated it 10x.</p><p>Now these same engineers are supposed to review AI-generated code? They don&#8217;t understand the code they wrote themselves. How will they catch what the machine gets wrong?</p><div><hr></div><h2>The Uncomfortable Truth</h2><p>Almost six months ago, I wrote about <a href="https://techtrenches.dev/p/the-great-software-quality-collapse">the quality collapse</a>. How we normalized shipping broken software, how &#8220;move fast and break things&#8221; became &#8220;move fast and never fix things.&#8221;</p><p>This is worse.</p><p>Back then, at least the people writing bad code understood what they were writing. They made tradeoffs. They knew where the bodies were buried. They could fix it if they had to.</p><p>Now we&#8217;re generating code faster than anyone can understand, reviewed by engineers who don&#8217;t know how their own tools work, approved by teams that lost their senior knowledge when the layoffs hit.</p><p>The speed at which we&#8217;re heading into the abyss is staggering.</p><p>We are fucked. 
Good luck.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://techtrenches.dev/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://techtrenches.dev/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item></channel></rss>