4 Comments
Denise Heap (private)'s avatar

You are the technical side that the AI discussion has long needed.

Francis Turner's avatar

You do wonder what proportion of Claude Code customers are successes rather than failures. And how many of the successes are far more limited than what is reported.

Interestingly, I went to look at Anthropic's success stories for Claude Code and Spotify appears absent - https://claude.com/customers?fcdaa149_sort_date=desc&fcdaa149_1_product_equal=%5B%22Claude+Code%22%5D

Given that they have been mentioning Spotify for several months now, that seems odd.

Denis Stetskov's avatar

Claude Code is #52 on Terminal Bench at 58% accuracy. Every agent above it runs on the same model with infrastructure on top. Honk is the same pattern: Claude Code + 15 years of Fleet Management. The tool alone isn't the story. https://www.tbench.ai/leaderboard/terminal-bench/2.0?models=Claude+Opus+4.6

Abcdefg's avatar

> You start digging, and there are no processes. Half the knowledge lives in somebody’s head, and that person is the only one who knows how any of it works.

And as the industry is desperate to shed headcount, who in their right mind would write it down?

Hence the dystopian levels of monitoring at places like Meta.