22 Comments
Jim North's avatar

Setting aside for a moment my opinion that coding agents are turning software development into one big cluster fuck, what I have yet to hear anyone discuss with any seriousness is what the plan is for after all this code that no one has looked at is put into production. Inevitably, code will do what code does and someday break; or requirements will change and it will need to be modified; or the underlying infrastructure will change, requiring it to be refactored. Is the plan to spin up a few dozen token-guzzling agents to "pull the lever" on the slot machine until, by random chance, a working "fix" (that no one will look at, nor understand) falls out?

This is not the next evolution in software development, like a new compiler, or a new language -- both of which are deterministic tools that are written by and understood by humans. These are probabilistic "black boxes" being used to replace the knowledge, understanding and coding skills of a human being.

This is a devolution, not an evolution.

Denis Stetskov's avatar

The lever image is exact. And the reason it spreads: validating output is expensive thinking, pulling the lever is cheap. People take the cheap path by default, so validation is the first habit dropped. The black box does not replace understanding by force. It replaces it by being easier. That is what nobody is naming.

Ibon Urrutia's avatar

"End of an era for regular developers" is what absolutely worries me. Is now banned by law for developers to develop WITHOUT code agents? Are we always competing with the rest of developers in a race of LOC and if you don't release 10000 per day you lose? Is impossible to do open source just for the pleasure/pain of doing it without agents?

I think that we bought not only the productivity narrative. We also bought what I call the bully narrative: a guy without code agents cannot "compete" with a guy with code agents. They converted software development into a zero-sum game in which the first one takes it all.

That is what I don't understand: how people are accepting it so fast. Google was not the first web search engine. And when it appeared it did not follow any of the supposed mainstream patterns of Yahoo, astalavista, etc. Even if your product is just a clone, it does not have to be the first clone to be successful; it is not "be first or die".

Yeah guys, you can write from scratch faster than any developer. It is a machine, of course it does it faster. But remember the fable of the tortoise and the hare. The fastest doesn't win the race if it starts playing around without direction.

I get the boost of endorphins you are getting from using agents, but I cannot understand professional software engineers who stop coding when they do not have an agent. It makes me wonder if they are the ones who should choose a different profession, instead of telling the non-looping ones that they are left behind.

They are winning battles that never started in the first place.

Denis Stetskov's avatar

"Bully narrative" is exactly right. But watch the sleight of hand: they fold two claims into one.

Using AI and validating every step is a tool, and yes, the market already picks the engineer who does over the one who refuses. That race is over.

Using AI and not reading the output is what these two are actually selling. That is not the same thing, and that is the one that breaks. The grift is letting you think they are the same race.

Ibon Urrutia's avatar

The thing is that the stupid pro- and anti-AI flame war has simplified the positions. Black or white.

In my case I decided NOT TO PAY any of those startups to do what I know how to do: write code. My motives were personal and derived from my own morals. But if a company wants me to use agents and pays the bill of those 2 startups, of course I am going to do what my employer asks. I am even going to try to reduce the money that those 2 companies receive, because of my personal morals. So I benefit my employer by reducing its costs. I'm a ronin, not a samurai playing the flute in the mountain.

Why don't I use a local model, then? Well, the last 6 months taught us that we have a new "engineering" every 3 weeks. Following the chaotic trends and advice of the hype consumes energy and time. I preferred refining my writing and reading skills, and started learning other languages I don't have much experience with, to extend my skills in a way that is useful with and without agents. In the past, I realized that prompt engineering was BS after 6 tests, when I saw my first sigma event. So I was LOLing when that was the trend on LinkedIn. Not running like an haze.

The proponents of "don't read the code" want a world of developers who are not able to read the code. I might say that they are starting to be scared of people who can read the code, because of course those people are better and more useful for any company.

I do not get the narrative that a guy who is able to write and read code in a certain language cannot possibly write prompts and check the results. Guys, even looping over a non-deterministic source using deterministic checks to see if the solution is converging or diverging is called Monte Carlo, and it is as old as computers. But while you waste interminable hours writing the perfect loop and feeling frustrated for not being a good "loop engineer", I know that there is no mathematical way to guarantee convergence of a problem, and I will just change the input a little bit until I have something, or just not use Monte Carlo for that problem. See? I already have loop engineering skills 😁
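
To make the loop described above concrete, here is a minimal sketch, not anyone's actual harness. It assumes a seeded random number generator stands in for the non-deterministic source (an agent call would sit in the same place), and the names `bounded_loop`, `propose`, `score` and `patience` are hypothetical. Since convergence cannot be guaranteed, the loop only stops early when a deterministic check passes or when the best score stops improving, and otherwise gives up at a fixed budget.

```python
import random
from typing import Callable, Optional


def bounded_loop(
    propose: Callable[[random.Random], float],
    score: Callable[[float], float],
    tolerance: float,
    max_attempts: int,
    patience: int,
    seed: int = 0,
) -> tuple[Optional[float], int, str]:
    """Sample candidates until one passes the check, the best score has not
    improved for `patience` attempts in a row, or the attempt budget runs out.

    Returns (best candidate, attempts used, reason for stopping).
    """
    rng = random.Random(seed)
    best_candidate: Optional[float] = None
    best_score = float("inf")
    stale = 0  # consecutive attempts without improvement

    for attempt in range(1, max_attempts + 1):
        candidate = propose(rng)       # non-deterministic source
        current = score(candidate)     # deterministic check (lower is better)

        if current < best_score:
            best_candidate, best_score, stale = candidate, current, 0
        else:
            stale += 1

        if best_score <= tolerance:
            return best_candidate, attempt, "converged"
        if stale >= patience:
            # Not converging: stop and change the input instead of pulling again.
            return best_candidate, attempt, "stalled"

    return best_candidate, max_attempts, "budget_exhausted"
```

The "stalled" and "budget_exhausted" outcomes are the point: they are where a human changes the input a little or drops the approach for that problem, rather than letting the loop keep spending.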

It is not so hard to learn how to use a tool when you master what you do with the tool. Yeah, some companies will force the tool on you, but I'm certain that at some point not all companies will force it. Because again, running 100m fast is useless if you have to go back 500m afterwards 🤷 And each developer will find the flow that suits him best. I saw an interview with the creator of Pi code and he found his own personal productive workflow. And it is not vibe coding everything. Or reviewing when he vibe codes. He chooses when to vibe code: tasks where he doesn't mind horrendous code, export to HTML for example. And he refactors vibe-coded PoCs by hand.

The future is not a zero-sum game either. We have to fight the lies of the present, and I don't have to adopt the narrative of the liar or just say the opposite. It is ridiculous. There are many opinions between black and white. I think that looping only benefits the token sellers, because they forgot to mention that they are proposing a Monte Carlo. Loop engineering sounds cool for your byline on LinkedIn and makes you waste tokens like hell.

Denis Stetskov's avatar

Nobody is keeping us around because we are good. We are here because the machine cannot do it cheaper yet. That is the only reason.

The day it ships the same thing for less, crooked or not, we are gone, and the market will not pause to admire our clean code. No sentiment in it.

So I do not buy either story. Not theirs, that autonomy is already here. Not the hopeful one, that judgment protects us forever. We are kept by a cost gap, nothing else, and it lasts exactly as long as the numbers say.

Ibon Urrutia's avatar

That is the thing. At the moment, the proponents of crooked code say that it is possible. In my experience a big ball of mud is a big ball of mud for humans and for code agents.

When you reach that terminal velocity of combinatorial explosion of states, where you touch some part and suddenly another totally unrelated part of the code explodes, the best option is to escape from the project. Not refactoring. Run.

I do not know if some crooked projects have reached that limit yet. But I'm sure that after that limit is reached, the agent spends more time fixing the bugs it introduces with every change than doing anything new. The fix that introduces a bug that you fix that changes something that you have to change....

A lucrative eternal loop for the agents. I cannot prove it mathematically. But I saw certain projects in the past that were exactly in that loop. I think it is inherent to crooked code. For agents and for humans.

Time will tell.

Francis Turner's avatar

I expect AI generated code to be more susceptible to this than human generated code. AI doesn't have a way to say "this is a heap of crap, we need to start from scratch and redo" so it will try and fix stuff and break other stuff faster

Ibon Urrutia's avatar

Sometimes I doubt that any of these guys, proposing that everything is valid and you should just let the agent do it, has really spent time in different trenches. I was unlucky enough to work in many different companies and projects, many of them failing. Let's say that I'm Dirty Harry, doing the stuff that nobody wanted to do. And sometimes there was no economical solution to the problem.

And not all good practices are a question of engineering snobbery. Many of them are for saving money for your employer in the long run, so you never reach that state of no solution.

The danger is that these days it looks like nobody is looking at the long run. But many times, if not every time, savings in the short run snowballed into giant costs in the long run. Ask Boeing 😉

Ibon Urrutia's avatar

I sometimes imagine the guy in fetal position at 4 a.m., four hours into a major outage in production. And the agent is not able to find the bug because, hey, it is hallucinating for that specific combination of tokens, and nothing makes it stop hallucinating.

And the guy knows that now he is going to need to read the code 😂

Nick Ruisi's avatar

I’ve been experimenting with what’s possible running local (laptop) inference with small (2b) Gemma models. I was trying to take the network completely out of the solution, but I may need to concede that point because the HTTP-based APIs all have some really good prompt management code in them that I don’t want to reinvent.

I’ve been pair programming with Copilot and Gemini on this project. I’m starting to get disappointed in them.

Denis Stetskov's avatar

Yeap, you either pay it in tokens or pay it in the months you spend reinventing the harness. The disappointment you are hitting with Copilot and Gemini is the same thing from the other side: the tool is fine until you ask it to hold the context you would have held yourself.

ToxSec's avatar

"It is worth asking what the thing actually is, this loop expensive enough that no one will meter it and no one who praises it pays for it."

great call out here. you found the tell everyone missed: the guys preaching "walk away and let the loop run" are the only ones not paying for the pulls.

Mikael Hanna's avatar

After the loop ends, and your bank account is torched, you get a new set of asbestos + pasta injected into your code base. I don’t take Cherny or Steinberger seriously. I don’t believe their assertions that they never touch any code. They are partly PR people, and to put it bluntly, I just think they are lying. I believe it’s true that they mostly don’t do any dirty work, and would want to avoid it entirely. But when they say they never read or edit code, does that mean they let someone else in the chain read and clean up the mess? If so, they are still lying, even if they themselves do not do any review directly. They just omitted that someone does. They may not code any longer, but for sure someone, at this stage, must tap into the code. Their pitch is clearly aimed at entities with budgets a mid-sized company just doesn’t have. Unless energy becomes free, which it never will, what’s the end game? Even if energy and infrastructure costs go down towards zero, Steinberger will make another post about loops, with infinite loops, that will ramp up the bill to infinity, even though the cost of running the agent is near zero.

Denis Stetskov's avatar

Agreed with each word.

Craig's avatar

Imagine if your water company said you could save money on electricity by putting a hydroelectric generator in your bathtub--all you need to do is run up the water bill, and you'll save so much on your electric bill!

Denis Stetskov's avatar

The whole pitch sells you more of the water you have to run to power the thing.

Milton Soong's avatar

I have two non-professional projects that my Claude Pro account mostly covers. Occasionally it bumps over the weekly limit, and I then decide whether to pay or not depending on what I am doing.

I designed a loop as well to make my life easier, but it’s not launched, mainly because there is no capability in the loop (afaik) to “continue if I have > x% of my usage left, else send me a Discord message and stop”. If I were a cynic I would even say they deliberately do NOT provide that capability for a reason.

Denis Stetskov's avatar

Yeah, and that missing stop is the whole problem. "Keep going if I have more than X percent left, else ping me and stop" is a few lines of client-side code. The weird part is the vendor never gives you that out of the box. I won't say they do it on purpose, but it reads strange that the one knob that caps your spend is the one nobody ships.
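
For what it is worth, here is roughly what those few lines could look like. This is a sketch under assumptions, not a real vendor API: `run_with_budget`, `step`, `remaining_fraction` and `notify` are all hypothetical names. In particular, how you actually read your remaining usage is left open, since that is exactly the part you would have to wire up yourself.

```python
from typing import Callable


def run_with_budget(
    step: Callable[[], bool],
    remaining_fraction: Callable[[], float],
    notify: Callable[[str], None],
    min_fraction: float = 0.2,
    max_steps: int = 50,
) -> str:
    """Run `step` repeatedly while more than `min_fraction` of usage is left.

    `step` returns True when the task is finished.
    `remaining_fraction` returns usage left as a number in [0, 1]; how that
    number is obtained is an assumption and is not shown here.
    `notify` sends a message to a human (a chat webhook, email, ...).
    """
    for steps_done in range(max_steps):
        left = remaining_fraction()
        if left <= min_fraction:
            # Check BEFORE spending: ping the human and stop.
            notify(f"Stopped after {steps_done} steps: {left:.0%} of usage left")
            return "budget_stop"
        if step():
            return "done"

    # Hard cap on iterations, independent of the usage reading.
    notify(f"Stopped after {max_steps} steps without finishing")
    return "step_limit"
```

The check sits before each step rather than after, so the loop never starts a pull it cannot afford, and the step cap covers the case where the usage reading itself is stale or wrong.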