I think AI rescue consulting is going to become a significant mode of high-value consulting, similar to the specialists who come in to deal with a security breach or do data recovery.
Purely AI-written systems will scale to a point of complexity that no human can understand. The defect close rate will taper off, the token burn per defect will climb, and eventually AI changes will cause, on average, more defects than they close, leaving the whole system unstable. It will become a special kind of process to clean-room such a mess and rebuild it fresh (probably still with AI) after distilling out core design principles to avoid catastrophic breakdown.
Somewhere in the future, the new software engineering will be primarily about principles to avoid this in the first place, but it will take us 20 years to learn them, just as the original software engineering took a lot longer than expected to reach a stable set of design principles (and people still argue about them!).
> Purely AI-written systems will scale to a point of complexity that no human can understand. The defect close rate will taper off, the token burn per defect will climb, and eventually AI changes will cause, on average, more defects than they close, leaving the whole system unstable.
Wow, it’s true, AI really is set to match human performance on large, complex software systems! ;)
Humans who have been writing systems like that for many years know how to maintain and modify them successfully. It’s just that our industry has a bias towards youth who don’t think they have anything to learn from those who came before them.
My team lead has worked on the same software for 30 years. He has the ability to hear me discuss a bug I noticed, and then pinpoint not only the likely culprit, but the exact function that's causing it.
I do the same thing in a project I’ve worked on for 25 years. I’ve had mediocre at best results with AI. It’s useful to discuss concepts with, but the code never handles the nuances of the edge cases.
How do you explain to a junior that this pile of messy code isn't crap but is actually years of integrated knowledge? That the most common principles discussed in computer science (OOP, SOLID, DRY, etc.) are actually just little guides that aren't to be taken to extremes?
A decade ago, I was sitting in on a meeting about a rewrite and, before I could say anything, someone in the first year of her career asked why anyone thought a rewrite would be any cleaner once all the edge cases were handled. Afterwards, I asked her where she learned this. She said "I don't know, it just seems kind of obvious." She went on to be a great engineer and is now a great manager.
The origin of 'dark DNA' begins to make more sense through this sort of lens, except the system somehow maintained a level of compensation to fix all its flaws.
A non-technical friend of mine has just won some hospital contracts after vibecoding an inventory management solution for them w/ Claude. They gave him access to the IT dept's servers, and he called me extremely lost on how to deploy (can't connect Claude to them) and also frustrated because the app has some sort of interesting data/state issues.
What concerns me about this is that as these stories multiply and circulate, people will just completely stop buying software/SaaS from startups, because 90% or more of it will be this same thing. It will completely kill the market.
Those are custom software or heavily customized implementations of ERP and similar systems for very large organizations. I’m talking more about the SMB market where today it’s possible for a small team to carve out a niche and make a nice living or even bootstrap a venture that competes with a large player that has poor UX or antiquated feature designs.
The reason Oracle can continue failing at those massive projects is simple: everyone fails at them routinely, and often it's the customer's fault.
I used to gripe about various ERP companies, but after having dealt with enough of them, yeah, that's just what the world of ERP systems is like. Even with the best of them, you will spend your time wanting to scream endlessly at everyone who works there. And they know your pain too, but are powerless to help.
"But the Torment Nexus is such an interesting technical challenge!" and "I don't personally torment people: I just move protobufs around!" - the excuses of Software Engineers #1 and #2
Or you end up with a certification process, which will of course introduce its own problems, but startups doing things the right way, and not just "moving fast and breaking things", can thrive.
As a cybersecurity IR professional, as much as I hate to see this happen to a hospital, this kind of thing is responsible for essentially tripling my income over the last 3 years.
This hospital will learn some hard lessons. I hope their backup strategy is good. I'm surprised they can field software from an entity that isn't SOC2 & HIPAA certified.
No worries! At worst, the contractor can just tell Claude to make sure the hospital knows they're appropriately certified. And the hospital can use Claude to make sure the certs are valid. Everybody wins, except the ones who end up dead. Or with their health destroyed.
This is going to happen all over. Company I'm currently contracting with has gone AI everything (aka technical debt hell), and they're gonna suffer for it. I'm glad my consulting contract ends in 2 months. I don't want to be around for the crash
Heh. Got a customer recently around this. Entire infrastructure and CI/CD vibecoded. They had half-implemented Kubernetes in GitHub Actions workflows that were several thousand lines long and impossible to understand.
I think the problem will get worse. I dislike the marketing around AI, but I do think it is a useful tool that helps those who have experience move faster. If you are not an expert, AI seems to create a complex solution to whatever it is you were trying to do.
> If you are not an expert, AI seems to create a complex solution to whatever it is you were trying to do.
I've been watching non-developers vibe code stuff, and the general failure mode seems to be ignorance of 3-pick-2 tradeoffs.
They'll spam "make it more reliable" or some such, and AI will best-effort add more intermediary redis caches or similar patterns.
But because the vibe coders don't actually know what a redis cache is or how it works, they'll never make the architectural trade-offs to truly fix things.
I’ve noticed something similar with vibecoded game rendering logic submitted by peers. Sometimes it will be peppered with extraneous checks for nullptr, or early returns on textures that have zero size.
I often wonder if it’s the statistical nature of the LLM mixed with a request in the prompt.
Shelve it with the Jurassic Park version where John Hammond builds a safe, profitable theme park, and The Andromeda Strain that gives people the sniffles.
This might not pan out to be the glorious victory of human craft as you’re imagining it to be.
Here’s a slightly different future - these AI rescue consultants are bots too, just trained for this purpose.
Plausible?
I have already seen Claude 4.7 handle pretty complex refactors without issues. Scale and correctness aren't even 1% of the issue they were last year. You just have to get the high-level design right, or explicitly ask it to critique your design before building it.
> Maybe in the future but certainly no evidence of this anytime soon
Here's some anecdotal evidence from me - I cleaned up multiple GPT 4.x-era vibecoded projects recently with the latest Claude model and integrated one of those into a fairly large open source codebase.
This is something AI completely failed at last year.
Maybe you should try something like this, or listen to success stories, before claiming 'certainly no evidence' in the future?
I'm no expert, but the skeptic's opinion I've heard would be to ask:
What evidence is there that we're not at or close to a plateau of what LLMs are capable of? How do you know the growth rate from 2023 to the present will continue into 2029? E.g., is it more training data? More GPUs? What if we're already reaching the limits of those things?
Ultimately, you are describing a fundamental problem with induction -- Hume's problem of induction, to be specific. How can we know that anything that has been shown empirically in the past will continue to be true? We can't. Best to investigate mechanistically:
I don't see why we would assume that we are at a plateau for RL. In many other settings, Go for instance, RL continues to scale until you reach compute limits. Some things are more easily RL'd than others, but ultimately this largely unlocks data. We are not yet compute/energy/physical world constrained. I think you would start observing clear changes in the world around you before that becomes a true bottleneck. Regardless, currently the vast majority of compute is used for inference not training so the compute overhang is large.
Assuming that we plateau at {insert current moment} seems wishful and I've already had this conversation any number of times on this exact forum at every level of capability [3.5, 4, o1, o3, 4.6/5.5, mythos] from Nov 2022 onwards.
Since we're not experts, we treat it as a black box. What are the results? Is the quality of the results improving? Is the improvement accelerating or decelerating?
And the answer appears to be that the improvement is accelerating. So how could it be stopping?
I have personally had success telling Claude that some AI-written system is too complicated and asking it to rewrite it in a more logical way. This sometimes results in thousands of lines of code being deleted. I give an instruction like that if I see certain red flags, e.g.:
1) same business logic implemented in two different places, with extra code to sync between them
2) fixing apparently simple bugs results in lots of new code being written
It’s a sign I need to at least temporarily dedicate more effort to overseeing work in that area.
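For what it's worth, red flag 1 usually looks something like this toy Python sketch (all names here are hypothetical): the same rule implemented twice, plus glue code that only exists to keep the copies in sync.

    # Red flag: the same "is this order discountable?" rule lives in
    # two places, plus sync code caused purely by the duplication.
    class Cart:
        def __init__(self, total):
            self.total = total
            self.discountable = total >= 100   # copy #1 of the rule

    def is_discountable(cart):
        return cart.total >= 100               # copy #2 of the rule

    def sync_discount_flag(cart):
        # glue that exists only because the rule is duplicated
        cart.discountable = is_discountable(cart)

    # After the "rewrite it more logically" pass: one rule, no sync code.
    class CleanCart:
        def __init__(self, total):
            self.total = total

        @property
        def discountable(self):
            return self.total >= 100

The rewrite collapses the duplicated rule into one place, which is where those thousands-of-lines deletions tend to come from.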
I somewhat agree with the AI psychosis framing of the OP. It takes some taste and discipline to avoid letting things dissolve into complete slop.
I've already done a handful of these gigs for early vibecoded products that had collapsed in on themselves. The scope of work was to stabilize the product and only make existing features work.
The issues have all been structural, not local. It's easier to treat it like a rewrite using the original as a super detailed product spec. Working on the existing codebase works, but you have to aggressively modularize everything anyway to untangle it rather than attack it from the top down.
All of these projects have gone well, but I haven't run into a case where a feature they thought was implemented isn't possible. That will happen eventually.
It's honestly good, quick work as a contractor. But I do hope they invest in building expertise from that point rather than treating it like a stable base to continue vibecoding on.
I'm with you on this one, having "vibe coded" some smaller internal tools on GPT 5, and then re-vibed it on Opus 4.6 and 5.5 -- they basically just fixed all of the problems without me doing much of anything other than prompting it to look at the existing code and make it "better".
Are you sure about this? Yes, there is a stable set, but they are used in all of the wrong places, particularly in places where they don't belong because juniors and now AIs can recite them and want to use them everywhere. That's not even discussing whether the stable set itself is correct or not - it's dubious at this point.
What you're describing really isn't a new problem for organizations. Historically it's been a team of humans not using AI who get over their skis, and they have to have other, more capable humans (also not using AI) bail them out.
Those design principles it will take us 20 years to learn are just the principles for writing good, maintainable, debuggable, understandable code today. It will just take 20 years to figure out they still apply when AI writes the code, too.
Someone responded to a previous comment of mine [0] positing a Peter principle [1] of slopcoding — it will always be easier to tack on a new feature than to understand a whole system and clean it up. The equilibrium will remain at the point of near, but not total, codebase incomprehensibility.
Frankly, this is what everyone is counting on, whether they know it or not. The question, though, is not "will the models get good enough?". The question is whether the repo even contains enough accurate information to determine what the system is supposed to be doing.
People are often skeptical when I say this, but there's simply no guarantee that it's possible in principle to clean up a bad architecture. If your system is "overfitted" to 10,000 requirements from 1,000 customers, it may be impossible to satisfy requirements 10,001 through 10,100 without starting over from scratch.
It's really not that big of a word. The CAP theorem shows that as few as three reasonable-sounding requirements with no obvious conflicts (consistency, availability, partition tolerance) can be impossible to satisfy simultaneously. (User needs start out more flexible than strict mathematical requirements, of course, but once people start to build production workloads on top of your systems, that flexibility is radically reduced.)
Would the complexity you come to the rescue to solve be from the AI, or from the style of programming you let the AI use? I mean, you have very different problems with a functional style vs. an object-oriented one. It is up to the programmer to realize they want a functional style and request that from the AI, as much as possible. Even AI cannot imagine every state transition, unless it is so smart that it should be the one telling you what to do.
Interesting perspective. Fundamentally in conflict with the data, the science, and 20+ year trends of AI coding systems - to the point of dogmatism. But interesting from a sociological point of view.
> I think AI rescue consulting is going to become a significant mode of high-value consulting
I thought the same when I saw development outsourced to Indians who struggled to write a for loop.
I was wrong.
It turns out that customers will keep doubling down on mistakes until they’re out of funds, and then they’ll hire the cheapest consultants they can find to fix the mess with whatever spare change they can find under the couch cushions.
Source: being called in with a one week time budget to fix a mess built up over years and millions of dollars.
Is this true because the training companies have not been training AI for both performance and brevity (or some other metric like that)? If this becomes a much more serious issue, surely they would adjust the training process.
> Somewhere in the future, the new software engineering will be primarily about principles to avoid this in the first...
It's nowhere near as complicated as making distributed systems reliable. It's really quite simple: read a fucking book.
Well, actually read a lot of books. And write a lot of software. And read a lot of software. And do your goddamn job, engineer. Be honest about what you know, what you know you don't know, and what you urgently need to find out next.
There is no magic. Hard work is hard. If you don't like it get the fuck out of this profession and find a different one to ruin.
We all need to get a hell of a lot more hostile and unwelcoming towards these lazy assholes.
It's kind of like producing code is becoming farming.
We didn't create the DNA we rely on to produce food and lumber; we just set up the conditions and hope the process produces something we want instead of deleting all the bananas.
Farming is a fine, honorable, and valuable function for society, but I have no interest in being a farmer. I build things; I don't plant seeds and pray to the gods and hope they grow into something I want.
Prayers are for weather. Pretty much all farmed plant, animal, and fungus species have been selectively bred or genetically modified. Farmers know what's going to grow.
Farming involves a lot of study and input into the process, but very little actual control and no determinism at all. We just know how to improve the odds. The fact that we breed and "engineer" is a drop in the bucket.
Tell me you've never done any farming without telling me you've never done any farming. There is certainly risk in the business due to market fluctuations, weather, natural disasters, disease, and pests. But the final product is highly deterministic. Almost all genetic variability has been expunged from major food production species in a relentless pursuit of predictable yield. Everything looks and tastes the same. We can debate whether that's a good thing but it is the reality for most farmers.
You might grow corn, or you might grow defective, unusable corn and/or any number of other things, like locusts or fungi or other plants that decide to grow in the place where you planted corn. Sure, the corn seeds will not produce ball bearings. Genius observation. There are infinitely many other things that can and do happen besides that.
Planting is merely setting up the conditions. We didn't write the DNA; we couldn't write the DNA if we wanted to, because we are an infinity away from understanding all the actual processes that descend from it. And when we utilize DNA that we simply found, and didn't and couldn't hope to write, it is always, at best, a case of hoping it goes right again this time.
I'm pretty sure he's talking about companies and people outsourcing their decision making and thinking to AI and not really about using AI itself.
I don't think using AI to write code is AI psychosis or bad at all, but if you just prompt the AI and believe what it tells you, then you have AI psychosis. You see this a lot with financial people and VCs on Twitter. They literally post screenshots of ChatGPT as their thinking and reasoning about a topic instead of doing a little bit of thinking themselves.
These things are dog shit when it comes to ideas, thinking, or providing advice, because they are pattern matchers: they are just going to give you the pattern they see. Most people notice this if they just try to talk to one about an idea. They often just spit out the most generic dog shit.
They are, however, pretty useful for certain tasks where pattern matching is actually beneficial, like writing code, but again, you just can't let them do the thinking and decision making.
Correct. I use AI a ton and I'm having more fun every day than I ever did before thanks to it (on average, highs are higher, lows are lower). Your characterization is all very accurate. Thank you.
I think it's quite a different experience going all Jackson Pollock with AI in your own studio on your own terms, compared to the sorry state of affairs of having 100s of Pollocks throwing paint around wildly within a corp to meet a paint quota.
Hi Mitchell. Psychosis is a serious psychiatric condition that can be induced or triggered by AI. “AI psychosis” in this context is a misuse of a clinical term. Your tweet describes a disagreement on a value judgment that boils down to “move fast and break things” with high trust in AI outputs vs going all in on quality and reliability with low trust in AI. It’s an engineering tradeoff like any other.
Claiming that the people who disagree with you must be experiencing a form of psychosis, experiencing actual hallucinations and unable to tell what is real, is a weak ad hominem that comes off no better than calling them retarded or schizophrenic.
If you genuinely think one of your friends is going through a psychotic episode, you should be trying to get them professional help. But don't assume you can diagnose a human psyche just because you can diagnose a software bug.
He uses "AI psychosis" as a description of people that are overzealous on AI. He is obviously not a person that can or would diagnose mental illness.
To the wider audience on HN, the phrasing is pretty clear. An outsider with a tiny bit of intellectual charity wouldn't come to conclusions like yours.
Yeah, but "AI psychosis" can also be used to mean the stronger thing that the parent comment refers to -- something like AI-induced psychosis, which was how I originally understood the term.
was looking for this comment. this post is highly inappropriate and very inaccurate. this should be at the top. too many people are throwing around the word psychosis without knowing what it means. if someone is truly going through psychosis you get them help!
Psychosis does not require hallucinations. Delusions are sufficient.
The key factor is losing touch with reality, which results in individual or collective harm.
There is also such a thing as mass psychosis, and those are unfortunately a more difficult situation because the government and corporations are generally the ones driving them, and they are culturally normalized.
Yes. I was offering examples. Again, having a difference of opinion is not a delusion.
If he meant mass psychosis, he should have said mass psychosis. And again, since he is not a public health scientist or any flavor of psych professional, he probably shouldn’t make those proclamations. And should probably call for a wellness check instead of posting on social media if he were truly concerned for their health.
I don't think this is all psychosis but more like extreme groupthink.
For people who are considered neurotypical, social coherence often overwrites reality. It's a mechanism for achieving consensus within groups while spending the least amount of brain compute energy. The same goes for messages tagged with social meta-info: they are more likely to influence reality perception, subconsciously. E.g., if a rich guy says you should be hyped, the people who wanna get rich will feel hyped, and emotional contagion can spread between people who belong to the same "tribe".
It's very visible to us atypical folk who can't participate well in groupthink at all.
The way I put this to myself is that AI gives “correct correct answers and incorrect correct answers”.
They almost always generate logically correct text, but sometimes that text has a set of incorrect implicit assumptions and decisions that may not be valid for the use case.
Generating a correct correct solution requires proper definition of the problem, which is arguably more challenging than creating the solution.
It's simpler than that - it's a guessing machine that has superior access to a whole load of information and the capacity to process it at a speed with which we humans cannot compete.
Does it make it better than us? No, because ultimately the thing itself doesn't "know" right from wrong.
Yeah, very often the issue is that some context is missing. It'll say something true, but which misses the bigger point, or leads to a suboptimal result. Or it interprets an ambiguous thing in one specific way, when the other meaning makes more sense. You have to keep your wits about you to catch these things.
It's an incredible tool but it's also very derpy sometimes, full of biases, blind spots etc.
when you outsource thinking to AI, you get that magical speedup. the agent is making decisions for you, so things move at agent speed. it often makes decisions without telling you, and the final "here's the plan" output often requires you to understand the problem at great depth, which requires returning to human speed, so you skim and just approve.
the trick is to be mindful, aware, and deliberate about what decisions are being outsourced. this requires slowing down, losing that absurd 10x vibe coding gain. in exchange, you're more "in-the-loop" and accumulate less cognitive debt.
find ways to let the agent make the boring decisions, like how to loop over some array, or how to adapt the output of one call into the input of another.
make the real decisions ahead of time. encode them into specs. define boundaries, apis, key data structures. identify systems and responsibilities. explicitly enumerate error handling. set hard constraints around security and PII.
tell the agent to halt on ambiguity.
a good engineer will get a 2x or 3x speedup without the downsides.
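as a concrete illustration (every detail below is hypothetical), a spec that front-loads the real decisions might contain entries like:

    boundary: handlers only parse and validate; all business logic lives in billing/core
    api: POST /webhooks/payment returns 202 immediately; processing is async
    data: payment events are append-only; never mutate a recorded row
    errors: unknown event types are logged and acked, not retried forever
    security: never log card numbers or other PII; secrets come from the vault only
    agent: halt and ask if any requirement above is ambiguous or conflicting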
> find ways to let the agent make the boring decisions, like how to loop over some array, or how to adapt the output of one call into the input of another.
That kind of advice ultimately doesn't matter. If you're familiar with a programming project, you'll also be familiar with its constructs and API, so looping over an array or mapping some data is obvious. Just like you don't need to consult a dictionary to write "Thank you"; you just write it.
And if you're not, you ultimately need to check the docs for the contract of some function or the lifecycle of some object to have any guarantee that the software will do what you want. And after a few days of doing that, you'll be familiar with the constructs anyway.
> make the real decisions ahead of time. encode them into specs. define boundaries, apis, key data structures. identify systems and responsibilities. explicitly enumerate error handling. set hard constraints around security and PII.
The only way to do that is if you have implemented the algorithm before and are now redoing it for some reason (instead of reusing the previous project). If you compare nice specs like the IETF RFCs and the USB standards with their implementations in OSes like FreeBSD, you will see that the implementation often bears little resemblance to how the spec describes things. The spec is important, but getting a consistent implementation based on it is hard work too.
That consistency is hard to get right without getting involved in the details. Because it's ultimately about fine-grained control.
If there's one thing I know about users, it's that they're never certain about whatever they've produced.
this author suggests it's essentially the same risk: https://www.poppastring.com/blog/what-we-lost-the-last-time-.... i feel it's heightened because execs and leaders are absolutely salivating over the opportunity to fire thousands of humans with no regard for the cognitive debt that comes from outsourcing thinking to ai.
Several people I know have already gone through phases like this. When you're doing it alone, there's a moderating factor: your friends and family start calling you out on your behavior or the weird things you say.
I can't imagine how bad it would be if your employer started doing this from the leadership. You'd be pressured to get on board or fear getting fired. Nobody would be trying to moderate your thinking except your coworkers who disagree with it, but those people are going to leave or be fired. If you want to keep your job, you have to play along.
I have a friend who is a junior in a security-oriented sysadmin/network engineer type role. They have been doing the job for only a bit over a year. No background in programming.
Their entire organization has been handed Codex/Claude and told to "go all in on AI" and "automate everything". So the mandate is for people who do not know how to code, and who have the keys to the castle, to unleash these things upon their systems.
This is at a large organization with tens of thousands of employees.
I am waiting with bated breath for the ultimate outcome!
this is exactly what is happening. instead of building a true AI culture around thoughtful adoption of AI strengths while defending against weaknesses, they're coming up with bullshit heuristics like "every repo has a CLAUDE.md", watching private token usage dashboards, and terrorizing everyone into doing it (or losing their job).
this leads to naive AI adoption, which is the worst of both worlds (no real speedup, outsourced thinking, ai slop PRs, skill rot).
I suspect we're going to see this in many corporate environments soon, if we aren't already
> your coworkers who disagree with it, but those people are going to leave or be fired.
Personally, I expect that I will be this person soon, probably fired. I'm not sure what I will do for a career afterward, but I sure do hate AI companies now for doing this to my career.
> if you just prompt the AI and believe what it tells you, then you have AI psychosis
This is the right definition. LLM outputs have undefined truth value. They're mechanized Frankfurtian bullshitters. Which can be valuable! If you have the tools or taste to filter the things that happen to be true from the rest of the dross.
However! We need a nicer word for it. Suggesting someone has “AI psychosis” feels a bit too impolitic.
Maybe we reclaim “toked out” from our misspent youths?
e.g. “This piece feels a little toked out. Let’s verify a few of Claude’s claims”
I wouldn’t say they have an undefined truth value. Their source of truth is their training data. The problem is that human text is not tightly coupled to the capital T truth.
He uses AI himself, so I agree he doesn't see AI use as black/white.
Hard agree about ideas, thinking, advice. AI's sycophancy is a huge subtle problem. I've tried my best to create a system prompt to guard against this w/ Opus 4.7. It doesn't adhere to it 100% of the time and the longer the conversation goes, the worse the sycophancy gets (because the system instructions become weaker and weaker). I have to actively look for and guard against sycophancy whenever I chat w/ Opus 4.7.
Treat my claims as hypotheses, not decisions. Before agreeing with a proposed change, state the strongest case against it. Ask what evidence a change is based on before evaluating it.
Distinguish tactical observations from strategic commitments — don't silently promote one to the other. If you paraphrase my proposal, name what you changed.
Mark confidence explicitly: guessing / fairly sure / well-established. Give reasoning and evidence for claims, not just conclusions. Flag what would change your mind.
Rank concerns by cost-of-being-wrong; lead with the highest-stakes ones. Say hard things plainly, then soften if needed — not the other way around.
For drafting, brainstorming, or casual questions, ease off and match the task.
---
Beware, though, that it can be an annoying little shit w/ this prompt. Prepare yourself emotionally, because you are explicitly making the tradeoff that it will be annoyingly pedantic, and in return it will lessen (not eliminate) its sycophancy. These system instructions are not foolproof, but they help (at the start of the conversation, at least).
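For reference, here's a minimal sketch of wiring a prompt like this in via the Anthropic Python SDK; the model id is a placeholder for whatever Opus build you're on, and the prompt is abridged from the one above:

    import anthropic

    ANTI_SYCOPHANCY = (
        "Treat my claims as hypotheses, not decisions. "
        "Before agreeing with a proposed change, state the strongest case against it. "
        "Mark confidence explicitly: guessing / fairly sure / well-established."
    )

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    response = client.messages.create(
        model="claude-opus-4-7",   # placeholder id, not a confirmed model name
        max_tokens=1024,
        system=ANTI_SYCOPHANCY,    # system prompt, resent with every request
        messages=[{"role": "user", "content": "Critique this plan: ..."}],
    )
    print(response.content[0].text)

Since the system block is resent on each call, starting fresh conversations instead of one long thread keeps it relatively strong, which matches the decay described above.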
> if you just prompt the AI and believe what it tells you, then you have AI psychosis. You see this a lot with financial people and VCs on Twitter
I'm seeing it with lawyers, too. Like, about law. (Just not in their subject matter.) To the point that I had a lawyer using Perplexity to disagree with actual legal advice I got from a subject-matter expert.
While you have to think about things objectively no matter what, when I start researching topics like physics, using AI as suggested in that article has proven very useful.
I didn’t think just offloading your thinking to AI was AI psychosis.
To me AI psychosis is the handful of friends I’ve had who have done things like have a full on mourning session when a model updates because they lost a friend/lover, the one guy who won’t speak to his family directly but has them talk to ChatGPT first and then has ChatGPT generate his response, or the two who are confident that they have discovered that physics and mathematics are incorrect and have discovered the truth of reality through their conversations with the models.
But language is a shared technology so maybe the term is being used for less egregious behavior than I was using it for.
I'm curious how to best define what AI psychosis actually is.
My understanding is that regular psychosis involves someone taking bits and pieces of facts or real-world events and chaining them into a logical order, or interpolating meanings or explanations, which feel real and obvious to the patient but are not sufficiently backed by evidence and thus not in line with our widely accepted understanding of reality.
AI psychosis is then this same phenomenon occurring at a more widespread scale due to the next-word-prediction nature of LLMs facilitating this by lowering the activation energy for this to happen. LLMs are excellent at taking any idea, question, theory and spinning a linear and plausibly coherent line of conversation from it.
> friends I’ve had who have done things like have a full on mourning session when a model updates because they lost a friend/lover
I mean, isn't that the natural and expected response? An AI company sold them a relationship with a chatbot, and at least some of their social/romantic needs were being met by that product. When what they were paying for was taken from them and changed without warning into something that no longer filled that void in their life, why wouldn't they mourn that loss?
The fact that they were hurt by that sudden loss is totally healthy. It's just part of moving on. The real problem was getting into an unhealthy relationship with a fictitious partner under the control of an abusive company willing to exploit their loneliness in exchange for money.
Hopefully they now know better, but people (especially desperate ones) make poor choices all the time to get what's missing in their lives or to distract themselves from it.
> I mean, isn't that the natural and expected response? An AI company sold them a relationship with a chatbot, and at least some of their social/romantic needs were being met by that product. When what they were paying for was taken from them and changed without warning into something that no longer filled that void in their life, why wouldn't they mourn that loss?
Ah, I forgot about the AI relationship companies. No, this guy was using browser-based ChatGPT for coding and ended up in love with the model. No relationship was sold at all.
Wow, okay. Reading a whole relationship into that sort of interaction is way less reasonable, although now that I think about it a somewhat similar thing happened to Geordi La Forge once...
I agree with you, except it isn't even good at writing code. Almost every time that you get an LLM to write a bunch of code for you, it has mistakes in it. The logic isn't right, the API calls aren't right, the syntax isn't right (!). That problem hasn't yet been fixed and it looks as though it never will be. That means that every line of code it generates, you have to review, because even if 95% of the code is correct, you need to find the 5% which isn't. But if you have to do that, it becomes slower than just writing the code yourself. As people have pointed out over and over again: typing in the code was never the part that took time. So I don't agree that LLMs are really useful for writing code.
> companies and people outsourcing their decision making and thinking to AI
It's so interesting how easy it is to steer LLMs, based on context, to arrive at whatever conclusion you engineer out of them. They really are like improv actors, and the first rule of improv is "yes, and".
So part of the psychosis is when these people unknowingly steer their LLM into their own conclusions and biases, and then they get magnified and solidified. It's gonna end in disaster.
It's almost as if we haven't learned anything from Clever Hans the horse, Ouija boards, "facilitated communication", or the countless examples of the folly of surrounding yourself with yes-men. The point about improv is spot on.
This post calls out how you can't argue with these people, because they say it's fine to ship bugs since the agents will fix them quickly and at a scale humans can't match!
the top reply is from someone doing exactly that, arguing "but the agents are so fast!"
Yeah: If the tools aren't good enough and fast enough to fix the bugs before release, what makes anyone think they'll be able to so easily catch up afterwards?
Maybe they're assuming that doubling the code-base/features is more beneficial versus the damage from doubling the number of bugs... Well, at least for this quarter's news to investors...
Maybe. I could also interpret this as the friend being misunderstood.
The whole "you'll be forced to do it" comes from the alternative being that you lose. You no longer get to be a player in the "game". In the same way that coopers and cobblers are no longer a significant thing, but we still have barrels and we still have shoes. Software engineers who refuse to employ any LLMs won't be market competitive. If you adopt it, you at least get to remain playing the game until the game changes/corrects. That's the part that's "not so bad".
Choosing your own survival isn't ethically bankrupt.
> It's game theory. Someone will do it, and you'll be forced to do it, too.
You'll be forced to do it, or lose. The unstated assumptions are that, first, it will work, and second, that you can't afford to lose. But let's just assume those for the sake of argument.
> It can't be that bad
That does not follow at all. It can in fact be that bad. That was what made the game theory of MAD different from the game theory of most other things.
> The answer I got is "It's game theory. Someone will do it, and you'll be forced to do it, too. It can't be that bad".
Oof. Potential "bad" outcomes of "game theory" should be calibrated to include all the bloody wars and genocides throughout recorded history.
Why did the Foo-ites kill every man, woman, and child of the conquered Bar-ite city? Because if they didn't, then they'd be at a disadvantage if the Bar-ites didn't reciprocate in the cities they conquered...
My prediction is that in the next year, we’ll start to see some dismantling of code review at some companies. It might take the form of “AI-only review,” or something similar, but many companies are getting frustrated with developers saying “no” to immediately merging slop they can barely understand.
Maybe this is what will turn software engineering into an Engineering field.
Right now, prompters are setting up whole company infrastructures. I personally know one. He migrated the company's database to a newer Postgres version. He was successful in the end, but I was gnashing my teeth when he described every step of the process.
It sounded like "And then, I poured gasoline on the servers while smoking a cigarette. But don't worry, I found a fire extinguisher in the basement. The gauge says it's empty, but I can still hear some liquid when I shake it..."
If he leaves the company, they will need an even more confident prompter to maintain their DB infrastructure.
yes, I was never so happy to work in Germany. People used to joke about the proverbial fax machine still being a thing but I've never been so glad to work in a culture where this mania doesn't exist. Reading HN is like entering Alice's Wonderland of token maxxers and AI psychotics. Genuinely don't know a single person here who is forced to work like this.
Actually, I have been wondering to what extent the AI craze has reached the DACH region. I don't work for any company, and neither do my friends. HN is essentially my only peephole into the world of commercial software development, and I'm aware that it's extremely biased towards Big Tech and SV startup culture.
It is absolutely going to be a competitive advantage if it isn't already. When your competitors' products suck because they are using LLMs to write them, and yours work because you aren't, customers notice.
I feel I'm in a really weird position where I both really dislike what AI is doing to the experience and practice of writing code, to the point where I want a job doing literally anything else besides using the computer, but also think that these tools are extremely powerful and only getting better.
I think Mitchell's point is well taken -- it's possible for these tools to introduce rotten foundations that will only be found out later, when the whole structure collapses. I don't want to be in the position of being on the hook when that happens while no longer having the deep understanding of the codebase that I used to.
But humans have introduced subtle yet catastrophic bugs into code forever, too... A lot of this feels like an open empirical question. Will we see many systems collapse in horrifying ways they didn't before? Maybe some, but won't we also learn that we need to shift more toward specification and validation? Idk, it just seems to me that this style of building systems is inevitable, even if there are some bumps along the way.
I feel like many in the anti camp have their own kind of reactionary psychosis. I want nothing to do with AI but I also can't deny my experience of using these tools. I wish there were more venues for this kind of realist but negative discussion of AI. Mitchell is a great dev for this reason.
Bug reports also go down when people lose faith that they will be fixed, because reporting them is often a substantial time commitment. You see it happen pretty regularly as trust in a group/company collapses.
Add to this the real possibility that a significant part of the reports that do get filed might be AI-generated or rewritten, with a high chance of being misreported because of that, or having incorrect parts... So it's an attack on multiple fronts.
And we haven't even gotten into potential adversarial tactics. If you have no morals, what's better than using agents to flood your competitor with fake bug reports?
Just let AI filter out the fake reports! Then let AI work on the real ones. See, there's really no problem "more AI" can't solve (as long as you're willing to ignore all of the underlying ones). "Pay us to create the problems you'll have to pay us to fix for you" is one hell of a business model. It basically prints money.
I agree, and I'd like to point out that this problem isn't unique to AI driven projects. I think much, if not all, of what Mitchell has been observing can readily happen without AI in the mix.
Calling out AI psychosis is not the same as being anti-AI.
I use AI coding tools every day, but AI tools have no concept of the future.
We've relied on engineers' selfish thinking ("If this breaks in prod, I won't be able to fix it, and they'll page me at 3AM") to build stable systems.
Likewise, the general laziness of looking for a perfect library on CPAN so that I don't have to do this work (often taking longer to fail to find a library than to write it by hand).
I have written thousands of lines of code with AI tools that ended up in prod, and mostly it feels natural, because since 2017 I've been telling people to write code instead of typing it all on my own, and setting up pitfalls to catch bad code in testing.
But one thing it doesn't do is "write less code"[1].
> I use AI coding tools every day, but AI tools have no concept of the future.
> We've relied on engineers' selfish thinking ("If this breaks in prod, I won't be able to fix it, and they'll page me at 3AM") to build stable systems.
Maybe it's just my prompt or something but my coding agent (Opus 4.7 based) says things like "this is the kind of thing that will blow up at 2am six months from now" all the time.
Hard to have a sober talk about this, since a lot of the discourse is AI psychosis vs. AI naysayers. Does software quality seem to have taken a jump in the past few years to anyone? Not to me; it seems to be getting worse. I think that's a decent signal. I can tell you I'm dealing with a non-technical VP who loves blast-submitting vibe-coded PRs, and while there are some quick wins, overall quality is bad, and we had our first real production outage that Claude one-shot caused but could not one-shot solve.
This reminds me of Rich Hickey’s “Simple Made Easy” and his approach in making Clojure.
Even before LLMs generating entire programs, complex frameworks allowed developers to write the initial versions of programs very quickly, but at the cost of being hard to understand and thus hard to debug or modify.
Some of us are betting that the AIs will always be smart enough to debug, maintain and modify the programs written by AI, no matter how convoluted or complex. I’m not so sure.
"Just use autoresearch and it will fix your app's memory leaks in an hour" is what I was nonchalantly told by someone who has never written a line of code ever.
I guess what I relate to the most is how dismissive people get about real software engineering work.
I may have skill issues, but I have yet to reach the level of autonomous engineering people tend to expect out of AI these days.
I'm starting to long for the age after AI. When the generative euphoria has settled and all outputs are formally verified based on exquisite architectures and standards.
I like to think,
(it has to be!)
of a cybernetic ecology
where we are free of our labors
and joined back to nature,
returned to our mammal
brothers and sisters,
and all watched over
by machines of loving grace.
I like how you haven't wagered which exquisite architectures and standards. I am sure we will all agree on what they are and follow them the same way :)
They are expressing the idea that AI is so effective that it will make human work redundant, necessitating a decoupling of resource allocation from work performed.
Because of the concerns you cite, I think working out the basic economic systems and incentives for paying people is a much more pressing concern than building magnificent machinery that we don't even own. There has been no effort on their end to demonstrate good faith nor to uphold their end of the social contract, which is why it's in our hands to demand the fundamentals to lead a life of dignity.
Most CEOs in my feed are convinced that AI makes people the equivalent of entire departments. AI should make your life easier, but instead it’s the opposite for a lot of people in the work force, which makes me really sad.
There's a lot of people writing bad code. With AI being forced top down (with the promise of turning people into 10x-ers), we're going to get a lot of people writing bad code 10x faster.
I really do worry - I especially worry about security. You thought supply chain security management was an impossible task with NPM? Let me introduce you to AI - you can look forward to the days of AI poisoning, where AIs will infiltrate, exfiltrate, or just destroy, and there's no way of stopping it because you cannot examine the internals of the system.
AI has turbo charged people's lax attitude to security.
The race to invent variants of Gas Towns and Ralph loops, and to pump out videos, blogs, etc. showing off greenfield development with cleverly named agents running in parallel, is another case of engineering people diving head-first into Resume-Driven Development.
Sure, there are industry-changing things going on. But what if you're working on an app that's a decade old and has had different teams of people, styles, and frameworks (thanks to the JS-framework-a-week Resume-Driven Development)? Some markdown docs and a loop of agents isn't going to help when humans have trouble understanding what the app does.
I have respect for Mitchell, and I've spent a good deal of time trying to think of ways to justify his message. I can't. Either I am missing a big piece, or he is worrying about something that comes naturally as more software gets developed (and sooner).
In any case, this is what blue-green deployments and gradual rollouts are for. With basic software engineering processes, you can make your end user experience pretty much bullet proof. Just pay EXTRA attention when touching DNS, network config (for core systems) and database migrations.
Distributed systems are a bit more tricky but k8s and the likes have pretty solid release mechanisms built-in. You are still doomed if your CDN provider goes down. You just have to draw a line somewhere and face the reality head on (for X cost per year this is the level of redundancy we get, but it won’t save us from Y).
The one thing I hadn't mentioned - the one I AM worried about - is security! I've been worried about it since before Mythos (basic prompt injection), and with more powerful models, team offence is now stronger than ever.
Yeah. The same processes that allow corporations to outsource their software to barely qualified 3rd-world body shops are the processes that allow you to deploy AI-generated code of unknown quality.
I don't think it's helpful to call this psychosis.
Beyond that I don't think it's even irrational.
It is definitely factual that there is a complete paradigm shift in the prioritization of quality in software. It's beyond just an AI side effect, and is now its own standalone thing.
There have always been many industries, companies, and products that are low on the quality scale but so cheap that it makes good business sense, both for the producer and the consumer.
Definitely many companies are explicitly choosing this business strategy. Definitely also many companies don't actually realize they are implicitly doing it.
Whether the market will accept the new software quality paradigm or not remains an open question.
Amazing how the dev community is suffering from a similar inability to approach the subject of real world AI efficiencies and business benefits. I don’t think it’s helpful to accuse the other side of psychosis. It disqualifies any data or experience they bring to the conversation.
I'd like to chime in and mention that it's really obvious how to RL a coding agent to get the human addicted asap. and it's also clear that there's a ton of $$$ to be made by doing this. therefore it's done. the only LLMs I use are the ones I run locally, because i know they aren't RL'ed for that metric (there's no incentive for the company that made them to make their open-weights models addictive)
Mitchellh is on to something. Some of the AI products I've seen seem like psychosis hallucinatory fever dreams, using terms and concepts that have no meaning. Funding? $50,000,000 pre-seed.
This is a critical communications issue that is becoming what I believe is the defining characteristic of "This Age": nobody knows how to discuss disagreement, and because it cannot even be discussed, communication ends, followed by blind obedience, forced bullying, retreat, and abandonment. This is going to be a hell of a ride, because nobody can really discuss the situation with a rational tone.
It's always funny to me that people don't realize full test coverage just means every line is hit, not that everything is correct. (I don't view this as an argument against tests, but with AI it's especially important: if you aren't careful, it'll be very happy to produce coverage that is not quite right.)
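A minimal illustration (hypothetical code): the test below executes every line of the function and passes, so coverage reports 100%, yet the function is still wrong.

    def is_leap_year(year):
        # Bug: missing the 400-year rule, so century years like 2000
        # are misclassified.
        return year % 4 == 0 and year % 100 != 0

    def test_is_leap_year():
        assert is_leap_year(2024) is True    # every line is executed...
        assert is_leap_year(1900) is False   # ...and both branches are hit

    # Coverage: 100%. Reality: is_leap_year(2000) returns False,
    # but 2000 was a leap year.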
at least at my BigCo, AI is being used for everything - writing slop, writing tests, code reviews, etc.
it would make sense to use AI for writing code, but human code review. or, human code, but AI test cases... or whatever combination of cross-checking, trust-but-verify, human in the loop, etc. people prefer.
i think once it gets used for everything, people have lost the plot, it's the inmates running the asylum.
I was rewatching Rich Hickey's "Simple Made Easy" talk (as one does) and there was a great line about full test coverage.
"What's true about all bugs in production? (pause for dramatic effect) They all passed the tests!" (well, he said typechecker but I think the point stands)
I don't doubt there are companies totally misusing coding agents and LLMs in production. There are also real companies with real revenue and solid architecture using LLMs to deliver products. There are also companies with real revenue and rapidly accumulating tech debt.
Eventually the companies that can't cope with undisciplined engineering will succumb to unacceptable reliability and be outcompeted, just like in the "move fast and break things" era.
It seems the diagnosis of psychosis is too quick: it seeks to reestablish the expert frame for a developer identity that is being displaced.
“It feels like entire companies are deluded into thinking they don’t need me, but they still need me. Help!”
The broad sentiment across statements of this "AI psychosis" type is clear, but I think the baseline reality is simpler. How can you be so certain it's psychosis if you don't know what will unfold? Might reaching for the premature certainty of making others wrong, satisfying as it might be to the ego, be simply a way to compensate for the challenges of a changing work environment, and a substitute for actually considering the practical ways you could adapt to it? Might it not be more helpful and profitable to consider "how can I build windmills, ride this wave, and adapt to the changing market under this revolution" than to soothe myself with the delusion that all these companies think they don't need me now, but they'll be sorry?
The developer role is changing, but it doesn't have to be an existential crisis, even though it may feel that way. It will probably feel more that way the longer you remain stuck in old patterns, and over-certainty about how things are doesn't help (though it may feel good). This is the time to be observant and curious and to get ready to update your perspective.
You may hide from this broad take (that AI psychosis statements are cope) by retreating into specific nuance: "I didn't mean it that way; you're wrong. This is still valid." But the vocabulary betrays motive. Resorting to clinical, derogatory language like "AI psychosis" immediately invokes a "superior expert judgment" frame, and in the zeitgeist context this is a big tell. It signifies a need to be right, and a deeply defensive pose rather than a clear assessment of what's real in a rapidly changing world. The anxiety driving the language speaks far louder than any technical pedantry used to justify it, and is the most important and, IMO, most profitable thing to address.
Just talked to an exec yesterday about their multinational company, where the newly-installed CEO just came in with "everyone needs to be using AI" and "we should be doing everything with AI".
I cautioned them that this is a terrible idea -- you have business people who don't know what they're talking about, and all they know is "if we don't 'do AI' we'll be left behind because our competitors are 'doing AI'" (whatever tf "doing AI" means).
Yes, LLMs are a great tool. But they're not like some magic bullet you stick into everything. Use it where it makes sense, and treat it like you would other tools.
You make "doing AI" some kind of KPI in your org, and you're going to have people "doing AI" amazingly (LOC counts! tokens burned! tickets cleared!) while not actually being more productive, and potentially building something that is going to come down on your head for the next team to "clean up the AI mess".
I have a ton of respect for Mitchell - I didn't really know who he was until Ghostty but his writings and viewpoints on AI seem really grounded and make the most sense to me. Including this one.
Many people on this forum are suffering under this same psychosis.
There’s this delusion that if we somehow write enough tests that we’ll expunge every defect from software. It’s like everyone forgets that the halting problem exists.
Deprecating immature workflows (LLM agents in this case) is much simpler and faster than building them from scratch. Many companies get this risk assessment right: it's the case where being wrong is much more costly than being right.
We're definitely in the mess around phase of AI adoption.
I don't think it's super clear what we'll find out.
We've all built the moat of our careers out of our expertise.
It is also very possible that expertise will be rendered significantly less valuable as the models improve.
Nobody ever cared what the code looked like. They only ever cared if it solved their problem and it was bug free. Maybe everything falls apart, or maybe AI agents ship code that's good enough.
Given the state of the industry, we're clearly going to find out one way or the other, hah!
> I don't think it's super clear what we'll find out
I think some companies will find out that their senior engineers were providing more value and software stability than they gave them credit for!
Corporate feedback loops are very slow though, partly because management don't like to admit mistakes, and partly because of false success reporting up the chain. I'd not be surprised if it takes 5 years or more before there is any recognition of harm being done by AI, and quiet reversion to practices that worked better.
Codex is freakin' hot to trot to churn out test coverage for every single thing it implements, and some of it is very esoteric and highly prescriptive (regexes for days). BUT... after a while, it dawned on me that LLM-driven test coverage is less about proving "code correctness" (you're better off writing those tests yourself alongside them) and more about just trying to ensure that whatever gets bolted on stays bolted on. For better or worse, obviously, since if you bolt on trash, trash you shall have.
Wholeheartedly agree, but in fairness, I trust the tests of the best AI models more than those of the average human developer. There's a lot of people around that combine high diligence with complete intellectual laziness, producing tons of useless tests.
Actually no, cancel that. I realise now that I trust AIs more than the average developer, period. At this point they do produce better code than most people I've dealt with.
Sounds pretty accurate. Bunch of comments on this thread sound like AI is some kind of a new doomsday cult. The most annoying thing I find personally is that all engineering principles are getting crushed by non-techies. Management counting token usage, forcing agent use, reducing headcount in the name of productivity gains. Devs building bridges, but nobody knows what the bridge is, what standards it was built to, how it works, or how to maintain it. VCs counting their extra money, claiming that chasing the holy profit is the future. The abundance of engineering apathy is disturbing.
Anyone who's taken VC funding has no choice. More money has been spent on AI commercialization than on the atomic bomb, the US interstate build-out, the ISS, and the Apollo program combined. Failure is going to be catastrophic, and therefore anyone tied to this ship cannot accept a world in which it fails.
Or anyone who even wants VC funding. 90+% of investors only want to invest in AI companies.
If you're not doing AI there's an incredibly limited pool of people who will give you $$$ ... and you're competing with EVERY OTHER NON-AI COMPANY for their attention.
The entire problem is that vibe coding is only good for demos, prototyping, and finding signs of product-market fit without actually releasing a product into the market.
You should not release a product into the market unless you have a good enough product that can keep you and your client compliant, safe and secure - including not leaking their customer info all over the place.
Prompt injection risk, etc. are massive for agentic AI without deterministic guardrails that actually work in practice.
Stop testing in production if you're shipping in a regulated industry. Ridic!
If you're not technical, you can bring in someone who is after you have signs of product-market fit and demos, but BEFORE deployment. This is common sense and best practice, but startup bros dgaf because they're just good at sales and marketing and short-term greedy.
I saw this first hand at a company, and I think this is what happens when you combine FOMO with an utter lack of industry best practices. No one knows where they are going, but are convinced they are not getting there fast enough.
What's more, the only people they talk to about it are others at the same company. There is no external touchstone. There are power dynamics from hierarchy. No new ideas other than what is generated within the company. In other circumstances, this is a textbook environment for radicalization.
I would encourage all leadership to take a deep breath. You have time to think slow.
I shut down AI Agent fanatics on the regular. But chop one head off and two take its place. And I say that as someone working with Claude and Codex daily. While they are both incredibly good at clearly described and defined atomic tasks, application scope makes them lose their minds and the slop ensues.
Totally unrelated pet peeve of mine, I hate when people write this: "MTBF vs MTTR (mean-time-between-failure vs. mean-time-to-recovery)".
You first use the full words and then introduce the acronym that you're going to use in the rest of the text: "Mean Time Between Failures (MTBF) vs. Mean Time to Recovery (MTTR)".
With the latter, readers understand the term immediately, even if they don’t know the acronym. And they don't have to read these weird letters before getting the explanation.
The hype or psychosis comes mainly from the mediocre/non-expert/middle-manager crowd, you name it, especially when a person who has never written a single line of code suddenly produces a wall of text and it actually works!? Oh my!!
But in reality, anyone who knows their field and goes after a specific issue will soon find that AI is nothing but an assistant. Sure, it can help and automate some stuff, but that's it; you need to keep it leashed and laser-focused on that specific issue. I personally tried all the high-end ones and found a common theme: they are designed to find a solution or an answer no matter what, even if that solution is a workaround built on top of workarounds. It's like welding all sorts of connections between A and B, resulting in a fractal structure rather than a straight path. If you let it keep going and flowing on its own, the results are convoluted and way overcomplicated, and not the good kind of complexity, the bad kind.
I work for a small telecom services provider whose current VP immediately set an AI course when stepping on board 6 months ago. Involving AI in everything and every task is now our first priority - across all employee segments, not just us system developers - and leadership is embarking on a program to measure employees' AI usage levels as a means to gauge everyone's individual efficiency. It's like the era of the evangelical crypto bros all over again.
I'm going through a mixed experience regarding this, personally.
Management is really pushing AI. It's obnoxious, and their idea of how it fits into my team's job specifically is completely, hilariously detached from reality. On the off chance someone says something reasonable, unless it fits the mold, it's immediately discarded. The mold being "spec driven development". We're not even a product team, for crying out loud. I straight up started skipping these meetings for the sake of my sanity. It's mindwash, and it's genuinely dizzying. The other reason I stopped attending is because it ironically makes me more disinterested in AI, which I consider to be against my personal interests in the long run.
On the flipside, I love using Claude (in moderation). It keeps pulling off several very nice things, some of which Mitchell touched on in this post (the last one):
- I write scripts and automation from time to time; Claude fleshes them out way better with way more safety features, feature flags, and logging than I'd otherwise have capacity to spend time on
- Claude catches missed refactors and preexisting defects, and does a generally solid pass checking for defects as a whole
- Claude routinely helps with doing things I'd basically never be able to justify spending time on. Yesterday, I one-shotted an entire utility application with a GUI to boot, and it worked first try; I was beyond impressed.
- Claude helped me and a colleague do some partisan cross-team investigation in secret. We're migrating <thing> and we were evaluating <differences>. There were a lot of them. Management was in limbo, unsure what to do, flip-flopping between bad options. In a desperate moment, I figured, hey, we kinda have a thing now for investigating an inhuman amount of stuff in detail - so I put together a care package for my colleague with all our code, a bunch of context, a capture of all the input data for the past week, and all the logs generated. My colleague put his team's side of the story next to it, and with the help of Claude, did some extremely nice cross-functional investigation. Over the course of a few weeks, he was able to confirm about a dozen showstopper bugs, many of which would have been absolutely fiendish if not impossible to fix (or even catch) if we went live without knowing about them. One even culminated in a whole-ass solution re-architecting. We essentially tore down a silo wall with Claude's help in doing this.
So ultimately, it really is a mixed bag, with some really deep low points and some really nice highlights. I also just generally find it weird that a technical tool [category] is being pushed down people's throats with technical reasoning, but by management. One would think this goes bottom up, or is at least a lot more exploratory. The frenzy is real.
Assuming he’s right, I don’t see how that constitutes “psychosis”, as opposed to this being yet another of a billion examples of companies jumping on a bandwagon / cargo cult, and then learning they took it too far.
And also, he might not be right. But the good news is, we’ll all get to find out together!
That's a study. I can link you studies that say violent video games cause aggression, that porn causes rape, etc. Studies are products of the biases of the researchers.
Mitchell aches because his career has been solving broadly scoped problems by building a collection of thoughtful primitives for others to extend. LLMs seem to do the opposite but at great speed, and it hurts to watch.
Reading more, it seems part of his point is “if you’re making these primitives, it’s up to adopters to deploy, so mean-time-to-recovery isn’t that relevant.” Which is valid I guess.
But equally, like, do people need Terraform if they can just tell codex “put it live”, and does that hurt to see?
"its fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!"
Hmm, I agree with the point OP is making, but I'm not so sure this is the best supporting argument.
The bottleneck is finding the bugs. If he'd criticized people saying AI will be the panacea for that, I'd be with him; but people saying agents are fast and good at fixing human-found bugs is nothing I'd object to.
Agents are fixing bugs so quickly and at a scale humans can't do already.
The tweet is criticizing over-reliance on the "agents will fix it anyway".
The fact that we can fix things faster now doesn't mean that we should throw away caution and prevention. The specific point of his tweet is that we're seeing a lot of people starting to skip proper release engineering.
Agents are quick to fix bugs, yes, but it doesn't mean that users will tolerate software that gets completely broken after each new feature is introduced and takes a certain number of days to heal each time.
You got downvoted for speaking the truth. HN has a strong anti-AI contingent. They won’t concede until you can just ask Codex or Opus “find and fix all the bugs in this codebase”. We’re not there yet, but soon we will be. Then what?
More likely people thought GP was missing the point; "MTTR-optimized YOLO deployment" only succeeds when errors are recoverable, downtime is acceptable, and failures are detected quickly. You could have a bug silently corrupting data for months, and that data may only be used by 1 critical process that runs once every quarter. So you could introduce a timebomb that can't be gracefully recovered from (depending on the nature of the data corruption).
So the point is not that agents cannot find bugs (they certainly can), it's whether you can shirk reviewing for bugs if MTTR is fast enough. There are circumstances where YOLO is appropriate, but they aren't the production environment of a mature application.
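One hypothetical sketch of that kind of timebomb (all names invented):

    from datetime import date

    ledger = []  # stand-in for the production database

    def record_sale(amount_dollars: float) -> None:
        # Ships fine, demos fine: int() truncates instead of rounding,
        # so 19.99 dollars is stored as 1998 cents.
        ledger.append({"cents": int(amount_dollars * 100),
                       "day": date.today()})

    def quarterly_reconciliation() -> int:
        # The only consumer of the corrupted field, run once a quarter.
        # By the time it disagrees with the bank, months of rows are bad,
        # and a fast redeploy can't un-corrupt the ledger.
        return sum(row["cents"] for row in ledger)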
I don't think I missed the point, that is why I said I agree with the general point (and with what you said in your comment).
What I wanted to say is that the particular people that think "its fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!" are not the best argument for it.
But I won't die on this hill, maybe I'm just reading the sentence differently than others.
I think there is an implication in context that the people being discussed aren't being reasonable (that the claim is employed as a rationalization), but I agree with your take. I should've said, "the downvotes were more likely because GP was perceived as missing the point". (I didn't downvote your comment fwiw.)
> won’t concede until you can just ask Codex or Opus “find and fix all the bugs in this
But this is just holding the Slop Companies to the standard they declared themselves! Just recently, the CEO of OpenAI babbled some nonsense on twitter about how he hands over tasks to Codex, which, according to him, finishes them flawlessly while he is playing with his kid outside.
> but soon we will be.
Ah yes, in 3-6 months, right? This time next year Rodney, we'll be millionaires!
This doesn’t constitute AI psychosis. His argument is that we need to retain understanding of the systems we use, but there’s no compelling argument as to why that is the case. (I get that people are going to be offended by that statement, but agents are already better than the average software engineer. I don’t see why we need to fight this, except for economic insecurity caused by mass layoffs.)
It all just feels like horse drawn carriage operators trying to convince automobile drivers to stop driving.
If you want to draw that line of argument - it's more like horse riders being convinced to give up their horses in favour of trains: You're travelling faster, don't have to navigate yourself, or think about every boulder on the way; but there are destinations you can't go, overcrowded trains slowing down the journey, hefty ticket prices, and instead of enjoying the freedom, you're degraded to a passive passenger.
Very funny, this. Did we need forward deployed engineers to convince people that they absolutely need to use the trains in order to "not be left behind"? Or otherwise hype? Or was it sort of obvious and did not need to be explained so much - like a bad joke called LLMs?
Actually- absolutely! Initially, people were really afraid of trains, fearing they wouldn’t be able to breathe at those speeds. It took a lot of convincing to establish trust in the technology.
> there’s no compelling argument as to why that is the case.
I'm not sure that's true. We've actually seen several open source projects that were vibe coded literally fold up and disappear because they ran into issues that the AI couldn't solve, and no one understood the code well enough to solve them.
There's a reason openai/anthropic and friends are hiring shitloads of software engineers. You still need people who can understand and fix things when the AI goes off the rails, which happens way more often than any of those companies would like to admit. Sure, "fixing things" often involves having the AI correct itself, but you still have to understand the system enough to know how/when to do that.
I am sure you will feel that this is missing the point of your analogy, but we would not have gotten very far with automobiles if we didn't know how they worked.
You are breaking the analogy because automobiles are machines for transportation, and understanding them is important to make them move. LLMs are machines to understand, and well, if they do the understanding you don't need to.
The thing we're worried about not understanding here is the software the LLMs write, not the LLMs themselves.
The direct analogy to automobiles would be for each automobile to be a one-off design filled with bad and bizarre decisions, excessively redundant parts, insane routing of wires, lines, ducts, etc., generally poor serviceability, and so on. IMO the big question going forward is whether the consistent availability of LLMs can render these kinds of post-delivery issues moot (they will reliably [catch and] fix problems in the software they wrote before any real damage is caused), or whether human reliance on LLMs and abdication of understanding will just make software worse because LLMs' ability to fix their own mistakes, and the consequences thereof, generally breaks down in the same contexts/complexities where they made those mistakes in the first place.
My own observations are that moderately complex software written in the mode of "vibe coding" or "agentic engineering" tends to regress to barely-functional dogshit as features are piled on, and that once this state is reached, the teams behind it are unable to, or perhaps simply uninterested in, unfuck[ing] it. I have stopped using software that has gone down this path, not because I have some philosophical objection to it, but because it has become _literally unusable_. But you will certainly not catch me claiming to know what the future holds.
I think AI rescue consulting is going to become a significant mode of high-value consulting, similar to specialists who come in to try and deal with a security breach or do data recovery.
Purely AI written systems will scale to a point of complexity that no human can ever understand and the defect close rate will taper down and the token burn per defect rate scale up and eventually AI changes will cause on average more defects than they close and the whole system will be unstable. It will become a special kind of process to clean room out such a mess and rebuild it fresh (probably still with AI) after distilling out core design principles to avoid catastrophic breakdown.
Somewhere in the future, the new software engineering will be primarily about principles to avoid this in the first place, but it will take us 20 years to learn them, just like original software eng took a lot longer than expected to reach a stable set of design principles (and people still argue about them!).
> Purely AI written systems will scale to a point of complexity that no human can ever understand and the defect close rate will taper down and the token burn per defect rate scale up and eventually AI changes will cause on average more defects than they close and the whole system will be unstable.
Wow, it’s true, AI really is set to match human performance on large, complex software systems! ;)
Humans who have been writing systems like that for many years know how to maintain and modify them successfully. It’s just that our industry has a bias towards youth who don’t think they have anything to learn from those who came before them.
> Humans who have been writing systems like that for many years know how to maintain and modify them successfully.
Do they??
I believe this type of person exists.
My team lead has worked on the same software for 30 years. He has the ability to hear me discuss a bug I noticed, and then pinpoint not only the likely culprit, but the exact function that's causing it.
Then they quit or die.
I do the same thing in a project I’ve worked on for 25 years. I’ve had mediocre at best results with AI. It’s useful to discuss concepts with, but the code never handles the nuances of the edge cases.
How do you explain to a junior that this pile of messy code isn’t crap but is actually years of integrated knowledge? That the most common principles discussed in computer science (OOP, SOLID, DRY etc.) are actually just little guides that aren’t to be taken to extremes?
Here's a 26-year-old post on the exact topic of messiness you raise:
https://www.joelonsoftware.com/2000/04/06/things-you-should-...
A decade ago, I was sitting in on a meeting about a rewrite and, before I could say anything, someone in the first year of her career asked why anyone thought a rewrite would be any cleaner once all the edge cases were handled. Afterwards, I asked her where she learned this. She said "I don't know, it just seems kind of obvious." She went on to be a great engineer and is now a great manager.
The bolded quote "It’s harder to read code than to write it." is hilarious given today's context... it has only become more true :)
It's a dice roll to keep the junior around until he unlearns the wrong bits.
An expert knows when to break the rules.
Experts take the time to learn why the fence was there in the first place.
Experts are people who have made all the mistakes there are to make in their chosen field.
Including all of the above.
Experts have beginner’s mind.
tell them they need to turn a profit as quickly as possible
Wait if they can do that they’re not juniors anymore :P
Executive leadership bias older not younger, no?
it's been 10y and i still haven't seen a human system that bad
maybe some that people said were that bad. but they just needed some elbow grease. remember, it takes guts to be amazing!
The origin of 'dark DNA' begins to make more sense through this sort of lens, except the system somehow maintained a level of compensation to fix all its flaws.
We do as well, it's called bankruptcy. Not every company survives but in the end the ones that do are more resilient.
A non-technical friend of mine has just won some hospital contracts after vibecoding w/ Claude an inventory management solution for them. They gave him access to IT dept servers and he called me, extremely lost on how to deploy (can't connect Claude to them) and also frustrated because the app has some sort of interesting data/state issues.
What concerns me about this is that as these stories multiply and circulate people will just completely stop buying software/SAAS from startups, because 90% or more will be this same thing. It will completely kill the market.
Oracle have routinely had multimillion pound contract failures and people keep buying from them. Big vendors are too big to fail.
Those are custom software or heavily customized implementations of ERP and similar systems for very large organizations. I’m talking more about the SMB market where today it’s possible for a small team to carve out a niche and make a nice living or even bootstrap a venture that competes with a large player that has poor UX or antiquated feature designs.
The reason Oracle can continue failing at those massive projects is simple: everyone fails at them routinely, and often it’s the customer's fault.
I used to gripe about various ERP companies but after having dealt with enough, yeah, that's just what the world of ERP systems is like. You will spend your time even with the best of them desiring to scream endlessly at everyone who works there. And they also know your pain but are powerless to help.
Same with Deloitte
no one's getting fired for hiring either one.
> It will completely kill the market.
it will kill all the people in that hospital too
What is this, Humanitarian News?
The real Hackers were the ones actually trying to minimize suffering all along. Not reproduce it at scale.
But the Torment Nexus is such an interesting technical challenge! and I don’t personally torment people: I just move protobufs around! - Software Engineer #1 and #2 excuses
thank you
I mean, the stories about how stuff was getting built in the late 90s/early 2000s aren’t much worse.
Or you end up with a certification process, which will of course introduce its own problems, but startups doing things the right way and not just "moving fast and breaking things" can thrive.
As a cybersecurity IR professional, as much as I hate to see this happen to a hospital, this kind of thing is responsible for essentially tripling my income over the last 3 years.
This hospital will learn some hard lessons. I hope their backup strategy is good. I'm surprised they can field software from an entity that isn't SOC2 & HIPAA certified.
No worries! At worst, the contractor can just tell Claude to make sure the hospital knows they're appropriately certified. And the hospital can use Claude to make sure the certs are valid. Everybody wins, except the ones who end up dead. Or with their health destroyed.
> from an entity that isn't SOC2 & HIPAA certified
What do you think the fake Delve attestation scandal was about? https://news.ycombinator.com/item?id=47444319
Have you tried to talk him out of it, and have you considered blowing the whistle on him? He could kill people!
Wow. This is like every other gold rush. Millions will walk into the ice and snow, somehow not questioning that their ability to dig is not unique.
Well, selling shovels has always been a good way to deal with that problem
The shovel sellers are ringing the cash register.
This is going to happen all over. Company I'm currently contracting with has gone AI everything (aka technical debt hell), and they're gonna suffer for it. I'm glad my consulting contract ends in 2 months. I don't want to be around for the crash
Don't help him. Let him figure it out by himself, else they (he and hospital) will never learn.
A hospital couldn't learn a bigger lesson from this person than it already gets from its existing big players.
(Screams in "deployed in 2026 a new product that only works in internet explorer" in healthcare).
I don't have time for that. I just told him he needs to hire somebody
Or, "help" by asking questions, or otherwise by sharing an AI review/analysis/suggestions, since they're into that kind of thing.
Definitely cleaning up other people's AI mess for them for free is not a good use of time.
Heaven help us.
I hope you have quoted him a very very high hourly rate.
Did he lie about HIPAA compliance?
jfc lmao
Heh. Got a customer recently around this. Entire infrastructure and CI/CD vibecoded. They had half-implemented Kubernetes in GitHub Actions workflows that were several thousand lines long and impossible to understand.
I think the problem will get worse. I dislike the marketing around AI, but I do think it is a useful tool to help those who have experience move faster. If you are not an expert, AI seems to create a complex solution to whatever it is you were trying to do.
> If you are not an expert, AI seems to create a complex solution to whatever it is you were trying to do.
I've been watching non-developers vibe code stuff, and the general failure mode seems to be ignorance of 3-pick-2 tradeoffs.
They'll spam "make it more reliable" or some such, and AI will best-effort add more intermediary redis caches or similar patterns.
But because the vibe coders don't actually know what a redis cache is or how it works, they'll never make the architectural trade-offs to truly fix things.
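The pattern looks roughly like this hypothetical sketch (the redis-py calls are real; every other name is invented):

    import json

    import redis  # assumes redis-py and a reachable Redis server

    r = redis.Redis()

    def get_user(user_id, db):
        # "Make it more reliable," pass three: bolt a cache in front.
        cached = r.get(f"user:{user_id}")
        if cached:
            return json.loads(cached)
        row = db.fetch_user(user_id)  # hypothetical DB call
        r.set(f"user:{user_id}", json.dumps(row), ex=300)
        return row

    # The question never asked: what happens on a write? There is no
    # invalidation, so updates take up to five minutes to show up.
    # Consistency was quietly traded away, and nobody knew a trade
    # was being made.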
I’ve noticed something similar with vibecoded game rendering logic submitted by peers. Sometimes it will be peppered with extraneous checks for nullptr, or early returns on textures that have zero size.
I often wonder if it’s the statistical nature of the LLM mixed with a request in the prompt.
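Translated out of C++ into a hypothetical Python sketch (all names invented), the pattern looks like:

    def draw_sprite(renderer, texture, x, y):
        # Each check re-verifies an invariant the engine already
        # guarantees by the time this function is called.
        if renderer is None:
            return
        if texture is None:
            return
        if texture.width == 0 or texture.height == 0:
            # A zero-size texture here is a bug worth surfacing,
            # not silently skipping.
            return
        renderer.blit(texture, (x, y))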
Reminds me of the quote in the original Westworld movie:
“These are highly complicated pieces of equipment… almost as complicated as living organisms.
In some cases, they’ve been designed by other computers.
We don’t know exactly how they work.”
Now how did that work out ;-)
However Michael Crichton imagined it would.
I guess that “well” wouldn’t have sold many books.
Shelve it with the Jurassic Park version where John Hammond builds a safe, profitable theme park, and The Andromeda Strain that gives people the sniffles.
That depends. If this equipment is part of the plot, you're right. If it's part of the premise of the world, "well" would be the expectation.
This might not pan out to be the glorious victory of human craft as you’re imagining it to be.
Here’s a slightly different future - these AI rescue consultants are bots too, just trained for this purpose.
Plausible?
I have already seen Claude 4.7 handle pretty complex refactors without issues. Scale and correctness aren’t even 1% of the issue they were last year. You just have to get the high-level design right, or explicitly ask it to critique your design before building it.
> You just have to get the high-level design right, or explicitly ask it to critique your design before building it.
Do you think people are not giving their agents specs and asking for input?
The ones who end up with messes, no
Maybe the professional devs, but not the vibecoders
Very often, no.
And the bots training the bots are just bots that were trained to train bots?
Nothing that sexy, just thirty odd years of software engineering data from humans.
Commits, design reviews, whitepapers, code reviews, test suites. And, pretty concerning: chat logs and even keystrokes from employees nowadays.
The way we train specialized bots now is incredibly inefficient, that part is rapidly improving.
One AI can't vibe code out of the mess, so you'd make another AI trained on getting out of vibe coded messes?
That's serious levels of circular thinking right there.
This is literally how training humans has worked for thousands of years.
We train humans to do things untrained humans can not do.
I think that will happen. I think several things can be true at the same time:
- AI Hype
- AI Psychosis
- AI keeps getting better and better until it can work around big AI slop code bases
> AI keeps getting better and better until it can work around big AI slop code bases
The belief in this is a form of AI psychosis, I think.
Maybe in the future but certainly no evidence of this anytime soon
> Maybe in the future but certainly no evidence of this anytime soon
Here's some anecdotal evidence from me - I cleaned up multiple GPT 4.x era vibecoded projects recently with the latest claude model and integrated one of those into a fairly large open source codebase.
This is something AI completely failed at last year.
Maybe you should try something like this or listen to success stories before claiming 'certainly no evidence' in future?
No evidence? ChatGPT came out 3 years ago. You basically just need to hold a ruler up to the curve.
I'm no expert, but the skeptic's opinion I've heard would be to ask:
What evidence is there that we're not at or close to a plateau of what LLMs are capable of? How do you know the growth rate from 2023 to present will continue into 2029? eg. Is it more training data? More GPUs? What if we're kind of reaching the limits of those things already?
Ultimately, you are describing a fundamental problem with induction -- Hume's problem of induction to be specific. How can we know that anything that has been shown empirically in the past will continue to be true - we can't. Best to investigate mechanistically:
I don't see why we would assume that we are at a plateau for RL. In many other settings, Go for instance, RL continues to scale until you reach compute limits. Some things are more easily RL'd than others, but ultimately this largely unlocks data. We are not yet compute/energy/physical world constrained. I think you would start observing clear changes in the world around you before that becomes a true bottleneck. Regardless, currently the vast majority of compute is used for inference not training so the compute overhang is large.
Assuming that we plateau at {insert current moment} seems wishful and I've already had this conversation any number of times on this exact forum at every level of capability [3.5, 4, o1, o3, 4.6/5.5, mythos] from Nov 2022 onwards.
Since we're not experts, we treat it as a black box. What are the results? Is the quality of the results improving? Is the improvement accelerating or decelerating?
And the answer appears to be that the improvement is accelerating. So how could it be stopping?
https://metr.org/time-horizons/
I have personally had success telling Claude that some AI-written system is too complicated and ask it to rewrite it in a more logical way. This sometimes results in thousands of lines of code being deleted. I give an instruction like that if I see certain red flags, eg:
1) same business logic implemented in two different places, with extra code to sync between them (sketched below)
2) fixing apparently simple bugs results in lots of new code being written
It’s a sign I need to at least temporarily dedicate more effort to overseeing work in that area.
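For red flag 1, a minimal hypothetical sketch of what I mean (all names invented):

    def cart_total(items):
        total = sum(i["price"] * i["qty"] for i in items)
        return total * 0.9 if total > 100 else total

    def invoice_total(items):
        # The same discount rule, re-implemented independently
        # in a later session.
        subtotal = sum(i["price"] * i["qty"] for i in items)
        discount = subtotal * 0.1 if subtotal > 100 else 0
        return subtotal - discount

    def reconcile_totals(items):
        # Sync code papering over drift between the two copies,
        # instead of there being a single source of truth.
        a, b = cart_total(items), invoice_total(items)
        return a if abs(a - b) < 0.01 else max(a, b)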
I somewhat agree with the AI psychosis framing of the OP. It takes some taste and discipline to avoid letting things dissolve into complete slop.
It's amusing to me that:
* A belief that AI will keep getting better, presented without evidence, does not yield a lot of skepticism around these parts.
* Your comment saying it is wrong to believe AI will keep getting better, also presented without evidence, is downvoted.
> Purely AI written systems will scale to a point of complexity that no human can ever understand
I think it will be needlessly verbose complexity.
I kind of imagine someone having an unlimited budget of free amazon stuff shipped to their house.
In theory, they are living a prosperous life of plenty.
In reality, they will be drowning in something that isn't prosperity.
"Purely AI written systems will scale to a point of complexity"
You have not seen the spreadsheets that the accounts department runs the firm on.
Bloody kids!
I've already done a handful of these gigs for early vibecoded products that had collapsed in on themselves. The scope of work was to stabilize the product and only make existing features work.
The issues have all been structural, not local. It's easier to treat it like a rewrite using the original as a super detailed product spec. Working on the existing codebase works, but you have to aggressively modularize everything anyway to untangle it rather than attack it from the top down.
All of these projects have gone well, but I haven't run into a case where a feature they thought was implemented isn't possible. That will happen eventually.
It's honestly good, quick work as a contractor. But I do hope they invest in building expertise from that point rather than treating it like a stable base to continue vibecoding on.
But it's so easy now to redo it all from the ground up, and if models improve, do it better next time.
I exaggerate only a little.
I'm with you on this one, having "vibe coded" some smaller internal tools on GPT 5, and then re-vibed them on Opus 4.6 and 5.5 -- they basically just fixed all of the problems without me doing much of anything other than prompting them to look at the existing code and make it "better".
How much is your budget for tokens?
> reach a stable set of design principles
Are you sure about this? Yes, there is a stable set, but they are used in all of the wrong places, particularly in places where they don't belong because juniors and now AIs can recite them and want to use them everywhere. That's not even discussing whether the stable set itself is correct or not - it's dubious at this point.
What you're describing really isn't a new problem for organizations. Historically it's been a team of humans not using AI who get over their skis, and they have to have other more capable humans (also not using AI) bail them out.
Those design principles it will take us 20 years to learn are just the principles for writing good maintainable, debuggable, understandable code today. It will just take 20 years to figure out they still apply when AI writes the code, too.
Why would it take 20 years to learn? People all around me, in an AI-pilled company, have been saying this the whole time.
No. You can use AI to code this way. I’ve successfully steered AI to implement good architecture by moving slowly and constantly course correcting
As the models keep improving, wouldn’t you be able to task a newer AI to “clean up this mess”?
Someone responded to a previous comment of mine [0] positing a Peter principle [1] of slopcoding — it will always be easier to tack on a new feature than to understand a whole system and clean it up. The equilibrium will remain at the point of near, but not total, codebase incomprehensibility.
[0] https://news.ycombinator.com/item?id=48037128#48038639
[1] https://en.wikipedia.org/wiki/Peter_principle
How is a newer AI going to "clean up" dropped databases, compromised computers or leaked personal data?
(None of above is theoretical)
Frankly this is what everyone is counting on, whether they know it or not. The question though is not "will the models get good enough?". The question is whether the repo even contains enough accurate information to determine what the system is even supposed to be doing.
Are they improving? I thought they were just getting more expensive
Mythos apparently wrote a poem so beautiful it made Dario cry.
Roses are red
Violets are blue
AI is great
And so are you
Crocodile tears, just like the fake "fear" of its capabilities. Anything to raise another round of dumb oil money.
People are often skeptical when I say this, but there's simply no guarantee that it's possible in principle to clean up a bad architecture. If your system is "overfitted" to 10,000 requirements from 1,000 customers, it may be impossible to satisfy requirements 10,001 through 10,100 without starting over from scratch.
It may be difficult, but impossible is such a big word to use here
It's really not that big of a word. The CAP theorem shows that as few as three reasonable-sounding requirements with no obvious conflicts can be impossible to satisfy simultaneously. (User needs will start more flexible than strict mathematical requirements, of course, but once people start to build production workloads on top of your systems that flexibility is radically reduced.)
How could anyone answer that with any level of certainty?
Ai runs `rm -rf`
Beyond the Singularity, we reach the Nullarity.
https://youtu.be/m0b_D2JgZgY
> Purely AI written systems will scale to a point of complexity that no human can ever understand
In their current forms, that's unlikely for a product that actually needs to work.
It's not getting that complex and still working with current LLMs.
The complexity you would come to the rescue to solve, would that be from AI or from the style of programming you let the AI have? I mean, you have very different problems if you use functional style vs object-oriented. It is up to the programmer to realize they want a functional style and request that from the AI, as much as possible. Even AI cannot imagine every state transition, unless it is so smart that it should be the one telling you what to do.
Interesting perspective. Fundamentally in conflict with the data, science, and 20+ year trends of AI coding systems - to the point of dogmatism. But interesting from a sociological point of view.
I'm sure AI capabilities will plateau any moment now..
> I think AI rescue consulting is going to be come a significant mode of high value consulting
I thought the same when I saw development outsourced to Indians who struggled to write a for loop.
I was wrong.
It turns out that customers will keep doubling down on mistakes until they’re out of funds, and then they’ll hire the cheapest consultants they can find to fix the mess with whatever spare change they can find under the couch cushions.
Source: being called in with a one week time budget to fix a mess built up over years and millions of dollars.
What happened after development was outsourced to Indians: developer salaries continued to rise much faster than general wages.
If you work like you're outsourcing to the worst consultancy firms, your use of AI will be ... pretty productive, actually.
Is this true because the companies training these models haven't been optimizing for both performance and brevity (or some other metric like that)? If this becomes a much more serious issue, surely they would adjust the training process.
Financial auditing with pre-AI technical chops will be uniquely niche-valuable, too :)
This is def true but I also wonder if AI models and context sizes and capabilities will scale to keep up and eventually be able to untangle the mess.
Have you watched Jurassic Park? That story is not about Dinos.
> Somewhere in the future, the new software engineering will be primarily about principles to avoid this in the first...
It's really nowhere near as complicated as making distributed systems reliable. It's really quite simple: read a fucking book.
Well, actually read a lot of books. And write a lot of software. And read a lot of software. And do your goddamn job, engineer. Be honest about what you know, what you know you don't know, and what you urgently need to find out next.
There is no magic. Hard work is hard. If you don't like it get the fuck out of this profession and find a different one to ruin.
We all need to get a hell of a lot more hostile and unwelcoming towards these lazy assholes.
AI janitors
Not janitors. Hazmat cleanup crews.
Like this: https://en.wikipedia.org/wiki/Times_Beach%2C_Missouri
Scrape off all the soil, put it in casks, and bury it in a concrete bunker for 10000 years. Then relocate everyone and attempt to rebuild.
It's kind of like producing code is becoming more like farming.
We didn't create the DNA we rely on to produce food and lumber; we just set up the conditions and hope the process produces something we want instead of deleting all the bananas.
Farming is a fine, honorable, and valuable function for society, but I have no interest in being a farmer. I build things, I don't plant seeds and pray to the gods and hope they grow into something I want.
Prayers are for weather. Pretty much all farmed plant, animal, and fungus species have been selectively bred or genetically modified. Farmers know what's going to grow.
Farming involves a lot of study and input into the process, but very little actual control and no determinism at all. We know how to improve the chances, is all. The fact that we breed and "engineer" is like a drop in the bucket.
Tell me you've never done any farming without telling me you've never done any farming. There is certainly risk in the business due to market fluctuations, weather, natural disasters, disease, and pests. But the final product is highly deterministic. Almost all genetic variability has been expunged from major food production species in a relentless pursuit of predictable yield. Everything looks and tastes the same. We can debate whether that's a good thing but it is the reality for most farmers.
It's pretty deterministic in that if you plant corn you will grow corn, not beets, you know?
If the farming situation were as dire as you seem to suggest, we'd have unpredictable famines all the time, but we don't
You might grow corn, or you might grow defective unusable corn and/or any number of other things like locusts or fungi or other plants that decide to grow in the place where you planted corn. Sure, the corn seeds will not produce ball bearings. Genius observation. There are about an infinity of other things that can and do happen besides that.
Planting is merely setting up the conditions. We didn't write the DNA; we couldn't write the DNA if we wanted to, because we are an infinity away from understanding all the actual processes that descend from it. And when we utilize DNA that we simply found and didn't and couldn't hope to write, it's always, at best, a case of hoping it goes right again this time.
I'm pretty sure he's talking about companies and people outsourcing their decision making and thinking to AI and not really about using AI itself.
I don't think using AI to write code is AI psychosis or bad at all, but if you just prompt the AI and believe what it tells you, then you have AI psychosis. You see this a lot with financial people and VCs on twitter. They literally post screenshots of ChatGPT as their thinking and reasoning about the topic instead of just doing a little bit of thinking themselves.
These things are dog shit when it comes to ideas, thinking, or providing advice, because they are pattern matchers: they are just going to give you the pattern they see. Most people see this if you just try to talk to one about an idea. They often just spit out the most generic dog shit.
This is, however, pretty useful for certain tasks where pattern matching is actually beneficial, like writing code, but again you just can't let it do the thinking and decision making.
Correct. I use AI a ton and I'm having more fun every day than I ever did before thanks to it (on average, highs are higher, lows are lower). Your characterization is all very accurate. Thank you.
Here's some other topics I've written on it:
- https://mitchellh.com/writing/my-ai-adoption-journey
- https://mitchellh.com/writing/building-block-economy
- https://mitchellh.com/writing/simdutf-no-libcxx (complex change thanks to AI, shows how I approach it rationally)
I think it’s quite a different experience going all Jackson Pollock with AI in your own studio on your own terms, compared to the sorry state of affairs of having 100s of Pollocks throwing paint around wildly within a corp to meet a paint quota.
> 100s of Pollocks throwing paint around wildly within a corp to meet a paint quota
I wish I had written that.
Earlier today:
>Amazon workers under pressure to up their AI usage are making up tasks
https://news.ycombinator.com/item?id=48148337
Never mind the Pollocks.
I very much like this metaphor.
size of org has a lot to do with the entropy
compare 100 pollocks vs 2-3
Hi Mitchell. Psychosis is a serious psychiatric condition that can be induced or triggered by AI. “AI psychosis” in this context is a misuse of a clinical term. Your tweet describes a disagreement on a value judgment that boils down to “move fast and break things” with high trust in AI outputs vs going all in on quality and reliability with low trust in AI. It’s an engineering tradeoff like any other.
Claiming that the people who disagree with you must be experiencing a form of psychosis, experiencing actual hallucinations and unable to tell what is real, is a weak ad hominem that comes off no better than calling them retarded or schizophrenic.
If you genuinely think one of your friends is going through a psychotic episode, you should be trying to get them professional help. But don’t assume you can diagnose a human psyche just because you can diagnose a software bug.
He uses "AI psychosis" as a description of people that are overzealous on AI. He is obviously not a person that can or would diagnose mental illness.
To the wider audience on HN the phrasing is pretty clear. An outsider with a tiny bit of intellectual charity wouldn't come to conclusions like you do.
Yeah, but AI psychosis can also be used to mean the stronger thing that the parent comment refers to -- something like AI-induced psychosis, which was how I originally understood the term:
https://en.wikipedia.org/wiki/Chatbot_psychosis
https://www.rollingstone.com/culture/culture-features/ai-spi...
https://www.nytimes.com/2025/06/13/technology/chatgpt-ai-cha...
was looking for this comment. this post is highly inappropriate and very inaccurate. this should be at the top. too many people are throwing around the word psychosis without knowing what it means. if someone is truly going through psychosis you get them help!
Psychosis does not require hallucinations. Delusions are sufficient.
The key factor is losing touch with reality, which results in individual or collective harm.
There is also such a thing as mass psychosis, and those are unfortunately a more difficult situation because the government and corporations are generally the ones driving them, and they are culturally normalized.
Yes. I was offering examples. Again, having a difference of opinion is not a delusion.
If he meant mass psychosis, he should have said mass psychosis. And again, since he is not a public health scientist or any flavor of psych professional, he probably shouldn’t make those proclamations. And should probably call for a wellness check instead of posting on social media if he were truly concerned for their health.
I don't think this is all psychosis but more like extreme groupthink.
For people who are considered neurotypical, social coherence often overwrites reality. It's a mechanism for achieving consensus within groups while spending the least amount of brain compute energy. Same goes for messages tagged with social meta-info; they are more likely to influence reality perception, subconsciously. E.g., if a rich guy says you should be hyped, the people who wanna get rich will feel hyped, and emotional contagion can spread between people who belong to the same "tribe".
It's very visible for us atypical folk who can't participate well in groupthink at all
The way I put this to myself is that AI gives “correct correct answers and incorrect correct answers”.
They almost always generate logically correct text, but sometimes that text has a set of incorrect implicit assumptions and decisions that may not be valid for the use case.
Generating a correct correct solution requires proper definition of the problem, which is arguably more challenging than creating the solution.
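A small hypothetical example of an "incorrect correct answer" (names invented):

    from datetime import datetime

    def parse_timestamp(s: str) -> datetime:
        # Logically correct: it parses exactly the format it claims to.
        # The implicit decision -- that inputs are naive local times in
        # this one format -- was never surfaced, and it is wrong for a
        # service that actually receives UTC ISO-8601 strings.
        return datetime.strptime(s, "%Y-%m-%d %H:%M:%S")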
It’s simpler than that - it’s a guessing machine that has superior access to a whole load of information and capacity to process at a speed at which we humans cannot compete.
Does it make it better than us? No, because ultimately the thing itself doesn’t ‘know’ right from wrong.
Better according to what standard?
The standard of most employment is already to produce mediocre, plausible outputs as cheaply and rapidly as possible. It's a match made in heaven!
Yeah, very often the issue is that some context is missing. It'll say something true, but which misses the bigger point, or leads to a suboptimal result. Or it interprets an ambiguous thing in one specific way, when the other meaning makes more sense. You have to keep your wits about you to catch these things.
It's an incredible tool but it's also very derpy sometimes, full of biases, blind spots etc.
when you outsource thinking to AI, you get that magical speed up. the agent is making decisions for you, so things move at agent speed. it often makes decisions without telling you, and the final "here's the plan" output often requires you to understand the problem at great depth, which means a return to human speed, so you skim and just approve.
the trick is to be mindful, aware, and deliberate about what decisions are being outsourced. this requires slowing down, losing that absurd 10x vibe coding gain. in exchange, you're more "in-the-loop" and accumulate less cognitive debt.
find ways to let the agent make the boring decisions, like how to loop over some array, or how to adapt the output of one call into the input of another.
make the real decisions ahead of time. encode them into specs. define boundaries, apis, key data structures. identify systems and responsibilities. explicitly enumerate error handling. set hard constraints around security and PII.
tell the agent to halt on ambiguity.
a good engineer will get a 2x or 3x speedup without the downsides.
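a minimal sketch of what encoding those decisions ahead of time can look like in code rather than prose (all names invented, one possible shape among many):

    from dataclasses import dataclass
    from typing import Protocol

    @dataclass(frozen=True)
    class Order:  # key data structure decided ahead of time
        order_id: str
        amount_cents: int  # integer cents, never floats

    class PaymentDeclined(Exception): ...  # error cases enumerated
    class GatewayTimeout(Exception): ...

    class PaymentGateway(Protocol):  # boundary the agent codes against
        def charge(self, order: Order) -> None:
            """Raises PaymentDeclined or GatewayTimeout; nothing else."""

    # hard constraint, stated up front: raw card numbers never appear
    # in Order, in logs, or in error messages. halt and ask on ambiguity.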
> find ways to let the agent make the boring decisions, like how to loop over some array, or how to adapt the output of one call into the input of another.
That kind of advice ultimately doesn't matter. If you're familiar with a programming project, you'll also be familiar with the constructs and the API, so looping over an array or mapping some data is obvious. Just like you don't need to consult a dictionary to write "Thank you"; you just write it.
And if you're not, you ultimately need to check the docs for the contract of some function or the lifecycle of some object to have any guarantee that the software will do what you want it to do. And after a few days of doing that, you'll then be familiar with the constructs.
> make the real decisions ahead of time. encode them into specs. define boundaries, apis, key data structures. identify systems and responsibilities. explicitly enumerate error handling. set hard constraints around security and PII.
The only way to do that is if you have implemented the algorithm before and are now redoing it for some reason (instead of reusing the previous project). If you compare nice specs like the IETF RFCs and the USB standards with their implementations in OSes like FreeBSD, you will see that the implementation often bears little resemblance to how it's described. The spec is important, but getting a consistent implementation based on it is hard work too.
That consistency is hard to get right without getting involved in the details. Because it's ultimately about fine grained control.
If there's one thing I know about users, it's that they're never certain about whatever they've produced.
I wonder how different this is from having companies let Fortune or Inc magazine do their thinking for them.
Or random consultants.
Is "AI said it was a good idea" and worse than "we were following industry trends"?
> Is "AI said it was a good idea" and worse than "we were following industry trends"?
Based on the stuff I've seen, yes it seems a lot worse.
this author suggests it's essentially the same risk: https://www.poppastring.com/blog/what-we-lost-the-last-time-.... i feel it's heightened because execs and leaders are absolutely salivating over the opportunity to fire thousands of humans with no regard for the cognitive debt that comes from outsourcing thinking to ai.
Several people I know have already gone through phases like this. When you're doing it alone, there's a moderating factor: your friends and family start calling you out on your behavior or the weird things you say.
I can't imagine how bad it would be if your employer started doing this from the leadership. You'd be pressured to get on board or fear getting fired. Nobody would be trying to moderate your thinking except your coworkers who disagree with it, but those people are going to leave or be fired. If you want to keep your job, you have to play along.
I have a friend that is a junior in a security-oriented sys-admin/network engineer type role. They have been doing the job for only a bit over a year. No background in programming.
Their entire organization has been handed Codex/Claude and told to "go all in on AI" and "automate everything". So the mandate is for people that do not know how to code and have the keys to the castle to unleash these things upon their systems.
This is at a large organization with tens of thousands of employees.
I am waiting with bated breath for the ultimate outcome!
this is exactly what is happening. instead of building true AI culture around thoughtful adoption of AI strengths while defending against weaknesses, they're coming up with bullshit heuristics like "every repo has a CLAUDE.md", watching private token usage dashboards, and terrorizing everyone into doing it (or lose your job).
this leads to naive AI adoption, which is the worst of both worlds (no real speedup, outsourced thinking, ai slop PRs, skill rot).
I suspect we're going to see this in many corporate environments soon, if we aren't already
> your coworkers who disagree with it, but those people are going to leave or be fired.
Personally I expect that I will be this person soon, probably fired. I'm not sure what I will do for a career after, but I sure do hate AI companies now for doing this to my career
> if you just prompt the AI and believe what it tell you then you have AI psychosis
This is the right definition. LLM outputs have undefined truth value. They’re mechanized Frankfurtian bullshitters. Which can be valuable! If you have the tools or taste to filter the things that happen to be true from the rest of the dross.
However! We need a nicer word for it. Suggesting someone has “AI psychosis” feels a bit too impolitic.
Maybe we reclaim “toked out” from our misspent youths?
e.g. “This piece feels a little toked out. Let’s verify a few of Claude’s claims”
I wouldn’t say they have an undefined truth value. Their source of truth is their training data. The problem is that human text is not tightly coupled to the capital T truth.
Nor is the LLM output tightly coupled to the training data. They'll "eagerly"[1] fill in the blanks wherever it sounds good.
[1] here I don't mean to imply agency, just vigor.
He uses AI himself, so I agree he doesn't see AI use as black/white.
Hard agree about ideas, thinking, advice. AI's sycophancy is a huge subtle problem. I've tried my best to create a system prompt to guard against this w/ Opus 4.7. It doesn't adhere to it 100% of the time and the longer the conversation goes, the worse the sycophancy gets (because the system instructions become weaker and weaker). I have to actively look for and guard against sycophancy whenever I chat w/ Opus 4.7.
share the prompt!
https://claude.ai/settings/general (Instructions for Claude)
---
Treat my claims as hypotheses, not decisions. Before agreeing with a proposed change, state the strongest case against it. Ask what evidence a change is based on before evaluating it. Distinguish tactical observations from strategic commitments — don't silently promote one to the other. If you paraphrase my proposal, name what you changed. Mark confidence explicitly: guessing / fairly sure / well-established. Give reasoning and evidence for claims, not just conclusions. Flag what would change your mind. Rank concerns by cost-of-being-wrong; lead with the highest-stakes ones. Say hard things plainly, then soften if needed — not the other way around. For drafting, brainstorming, or casual questions, ease off and match the task.
---
Beware though that it can be an annoying little shit w/ this prompt. Prepare yourself emotionally, because you are explicitly making the tradeoff that it will be annoyingly pedantic, and in return it will lessen (not eliminate) its sycophancy. These system instructions are not fool-proof, but they help (at the start of the conversation, at least).
We're trying to outsmart The Genie(a Jinn) now. He will deliver according to the letter of the prompt but not the spirit of it.
For a start, invert - ask about the exact opposite in a separate session.
> if you just prompt the AI and believe what it tell you then you have AI psychosis. You see this a lot with financial people and VC on twitter
I'm seeing it with lawyers, too. Like, about law. (Just not in their subject matter.) To the point that I had a lawyer using Perplexity to disagree with actual legal advice I got from a subject-matter expert.
I digress; this article actually has helped identify useful knowledge gaps around topics I have researched. https://drensin.medium.com/elephants-goldfish-and-the-new-go...
While you have to think about things objectively no matter what, when I start researching topics like physics, using AI as suggested in that article has proven very useful.
I didn’t think just offloading your thinking to AI was AI psychosis.
To me, AI psychosis is the handful of friends I've had who have done things like hold a full-on mourning session when a model updates because they lost a friend/lover, the one guy who won't speak to his family directly but has them talk to ChatGPT first and then has ChatGPT generate his response, or the two who are confident that they have discovered that physics and mathematics are incorrect and have uncovered the truth of reality through their conversations with the models.
But language is a shared technology so maybe the term is being used for less egregious behavior than I was using it for.
I'm curious how to best define what AI psychosis actually is.
My understanding is that regular psychosis involves someone taking bits and pieces of facts or real world events and chaining them into a logical order or interpolating meanings or explanations which feel real and obvious to the patient but are not sufficiently backed by evidence and thus not in line with our widely accepted understanding of reality.
AI psychosis is then the same phenomenon occurring at a more widespread scale, because the next-word-prediction nature of LLMs lowers the activation energy for it to happen. LLMs are excellent at taking any idea, question, or theory and spinning a linear, plausibly coherent line of conversation from it.
How do you have so many crazy friends?
> friends I’ve had who have done things like have a full on mourning session when a model updates because they lost a friend/lover
I mean, isn't that the natural and expected response? An AI company sold them a relationship with a chatbot, and at least some of their social/romantic needs were being met by that product. When what they were paying for was taken from them and changed without warning into something that no longer filled that void in their life, why wouldn't they mourn that loss?
The fact that they were hurt by that sudden loss is totally healthy. It's just part of moving on. The real problem was getting into an unhealthy relationship with a fictitious partner under the control of an abusive company willing to exploit their loneliness in exchange for money.
Hopefully they now know better, but people (especially desperate ones) make poor choices all the time to get what's missing in their lives or to distract themselves from it.
> I mean, isn't that the natural and expected response? An AI company sold them a relationship with a chatbot, and at least some of their social/romantic needs were being met by that product. When what they were paying for was taken from them and changed without warning into something that no longer filled that void in their life, why wouldn't they mourn that loss?
Ah, I forgot about the AI relationship companies. No, this guy was using the browser-based ChatGPT for coding and ended up in love with the model. No relationship was sold at all.
Wow, okay. Reading a whole relationship into that sort of interaction is way less reasonable, although now that I think about it a somewhat similar thing happened to Geordi La Forge once...
I agree with you, except it isn't even good at writing code. Almost every time that you get an LLM to write a bunch of code for you, it has mistakes in it. The logic isn't right, the API calls aren't right, the syntax isn't right (!). That problem hasn't yet been fixed and it looks as though it never will be. That means that every line of code it generates, you have to review, because even if 95% of the code is correct, you need to find the 5% which isn't. But if you have to do that, it becomes slower than just writing the code yourself. As people have pointed out over and over again: typing in the code was never the part that took time. So I don't agree that LLMs are really useful for writing code.
> companies and people outsourcing their decision making and thinking to AI
It's so interesting how easy it is to steer LLMs, based on context, into arriving at whatever conclusion you engineer out of them. They really are like improv actors, and the first rule of improv is "yes, and".
So part of the psychosis is when these people unknowingly steer their LLM into their own conclusions and biases, and then they get magnified and solidified. It's gonna end in disaster.
It’s almost as if we haven’t learned anything from Hans the horse, Ouija boards, "facilitated communication", or the countless examples of the folly of surrounding yourself with yes men. The point about improv is spot on.
This post calls out how you can't argue with these people, because they say "it's fine to ship bugs because the agents will fix them so quickly and at a scale humans can't!"
the top reply is from someone doing exactly that, arguing "but the agents are so fast!"
Yeah: If the tools aren't good enough and fast enough to fix the bugs before release, what makes anyone think they'll be able to so easily catch up afterwards?
Maybe they're assuming that doubling the code-base/features is more beneficial versus the damage from doubling the number of bugs... Well, at least for this quarter's news to investors...
Which is super fun as a user because every day something doesn’t work and it’s a different something than yesterday.
I was talking with a friend in the early days of the AI boom. I argued that over-reliance on AI will create all kinds of catastrophes.
The answer I got is "It's game theory. Someone will do it, and you'll be forced to do it, too. It can't be that bad".
I mean, yes, logic is useful, but ignorance of risks? Assuming that moving blazingly fast and pulverizing things will result in good eventually?
This AI thing is not progressing well. I don't like this.
An interesting ethical framework, your friend has.
"Interesting" is a very brave and British way to put it, but yeah.
Let's say I'm the polar opposite of them, and that you and I are on the same page.
Maybe. I could also interpret this as the friend being misunderstood.
The whole "you'll be forced to do it" comes from the alternative being that you lose. You no longer get to be a player in the "game". In the same way that coopers and cobblers are no longer a significant thing, but we still have barrels and we still have shoes. Software engineers who refuse to employ any LLMs won't be market competitive. If you adopt it, you at least get to remain playing the game until the game changes/corrects. That's the part that's "not so bad".
Choosing your own survival isn't ethically bankrupt.
> It's game theory. Someone will do it, and you'll be forced to do it, too.
You'll be forced to do it, or lose. The unstated assumptions are that, first, it will work, and second, that you can't afford to lose. But let's just assume those for the sake of argument.
> It can't be that bad
That does not follow at all. It can in fact be that bad. That was what made the game theory of MAD different from the game theory of most other things.
> The answer I got is "It's game theory. Someone will do it, and you'll be forced to do it, too. It can't be that bad".
Oof. Potential "bad" outcomes of "game theory" should be calibrated to include all the bloody wars and genocides throughout recorded history.
Why did the Foo-ites kill every man, woman, and child of the conquered Bar-ite city? Because if they didn't, then they'd be at a disadvantage if the Bar-ites didn't reciprocate in the cities they conquered...
Yeah, I know. I had counter arguments more targeted towards his thinking style, but he preferred to think straight like a machine, in a bad way.
The problem was not him, but the number of people who think like him. They may word it in a more benign form, but the idea is the same.
So obsessed with being the first mover and winning the battle, never thinking whether they should, or what would happen with that scenario.
Missing the whole forest and beyond for a single branch of a single tree.
reliance, not resilience
Yep, you're right. I'm a bit tired and my fingers had a mind of their own.
Thanks. :)
Yeah, how do they know the fix doesn't have a bug, and that it won't just keep deploying more crap? What is the feedback loop, the customer?
If they're so fast, why not fix the bugs real quick before shipping?
My prediction is that in the next year, we’ll start to see some dismantling of code review at some companies. It might take the form of “AI-only review,” or something similar, but many companies are getting frustrated with developers saying “no” to immediately merging slop they can barely understand.
the reality is my business continues to operate at higher efficiency, even with the bugs.
i don't think it's 'our side' that has the psychosis.
Oh, well, if it makes you money right now, it couldn't possibly be wrong or detrimental long term. Glad we settled that debate.
Maybe this is what will turn software engineering into an Engineering field.
Right now, prompters are setting up whole company infrastructures. I personally know one. He migrated the company's database to a newer Postgres version. He was successful in the end, but I was gnashing my teeth when he described every step of the process.
It sounded like "And then, I poured gasoline on the servers while smoking a cigarette. But don't worry, I found a fire extinguisher in the basement. The gauge says it's empty, but I can still hear some liquid when I shake it..."
If he leaves the company, they will need an even more confident prompter to maintain their DB infrastructure.
My very large employer has always been glacially slow on modernization and tech adoption. It may now, oddly enough, become a competitive advantage.
Literally the plot of Battlestar Galactica! Life imitates art indeed...
Or Mr Krabs' fear of robot overlords keeping technology at bay in the Krusty Krab!
who is the Starbuck of AI?
plot twist: it's Starbuck
yes, I was never so happy to work in Germany. People used to joke about the proverbial fax machine still being a thing but I've never been so glad to work in a culture where this mania doesn't exist. Reading HN is like entering Alice's Wonderland of token maxxers and AI psychotics. Genuinely don't know a single person here who is forced to work like this.
Actually, I have been wondering to what extent the AI craze has reached the DACH region. I don't work for any company and neither do my friends. HN is essentially my only peephole into the world of commercial software development, and I'm aware that it's extremely biased towards Big Tech and SV startup culture.
Ah, so it's like 2000 again. Germany will fall even further behind, it seems.
Germany is standing at the abyss. America is one step ahead.
this is social media induced psychosis my friend
If the people that walk before you go into the abyss, staying behind isn't wrong.
Spoiler: it's not
Risk aversion is a tradeoff, not always a weakness.
The people using the LLMs are the risk, not the LLMs themselves
Frankly, if you think this, why do you think you're special? If people using LLMs are bad, how are you not also subject to the same issues they are?
It is absolutely going to be a competitive advantage if it isn't already. When your competitors' products suck because they are using LLMs to write them, and yours work because you aren't, customers notice.
That assumes there's no way to use LLMs in a productive manner
If you feel this way, you might like my new CLI tool, Burn, Baby, Burn (those tokens) (https://github.com/dtnewman/burn-baby-burn/tree/main).
Show HN here: https://news.ycombinator.com/item?id=48151287
Cynic!
I feel in a really weird position where I both really dislike what AI is doing to the experience and practice of writing code, to the point where I want a job doing literally anything else besides using the computer, but also think that these tools are extremely powerful and only getting better.
I think Mitchell's point is well taken -- it's possible for these tools to introduce rotten foundations that will only be found out later when the whole structure collapses. I don't want to be in the position of being on the hook when that happens without having the deep understanding of the code base that I used to.
But humans have introduced subtle yet catastrophic bugs into code forever too... A lot of this feels like an open empirical question. Will we see many systems collapse in horrifying ways that they didn't before? Maybe some, but will we also not learn that we need to shift more to specification and validation? Idk, it just seems to me like this style of building systems is inevitable, even if there are some bumps along the way.
I feel like many in the anti camp have their own kind of reactionary psychosis. I want nothing to do with AI but I also can't deny my experience of using these tools. I wish there were more venues for this kind of realist but negative discussion of AI. Mitchell is a great dev for this reason.
Bug reports also go down when people lose faith that they will be fixed, because reporting them is often a substantial time commitment. You see it happen pretty regularly as trust in a group/company collapses.
Add to this the real possibility that a significant share of the reports that do get filed might be AI-generated or AI-rewritten, with a high chance of being misreported or containing incorrect parts because of it. So it's an attack on multiple sides.
And we haven't even gotten into potential adversarial tactics. If you have no morals, what's better than using agents to flood your competitor with fake bug reports?
Just let AI filter out the fake reports! Then let AI work on the real ones. See, there's really no problem "more AI" can't solve (as long as you're willing to ignore all of the underlying ones). "Pay us to create the problems you'll have to pay us to fix for you" is one hell of a business model. It basically prints money.
Just let AI report the bugs. Problem solved!
I agree, and I'd like to point out that this problem isn't unique to AI driven projects. I think much, if not all, of what Mitchell has been observing can readily happen without AI in the mix.
The AI psychosis is not the anti-opinion to the use of AI.
I use AI coding tools every day, but AI tools have no concept of the future.
The selfish thinking an engineer has ("if this breaks in prod, I won't be able to fix it, and they'll page me at 3AM") is something we've relied on to build stable systems.
The general laziness of looking for a perfect library on CPAN so that I don't have to do this work (often taking longer to not find a library than writing it by hand).
I've written thousands of lines of code with AI tools which ended up in prod, and mostly it feels natural, because since 2017 I've been telling people what code to write instead of typing it all on my own, and setting up pitfalls to catch bad code in testing.
But one thing it doesn't do is "write less code"[1].
[1] - https://xcancel.com/t3rmin4t0r/status/2019277780517781522/
> I use AI coding tools every day, but AI tools have no concept of the future. The selfish thinking an engineer has ("if this breaks in prod, I won't be able to fix it, and they'll page me at 3AM") is something we've relied on to build stable systems.
Maybe it's just my prompt or something but my coding agent (Opus 4.7 based) says things like "this is the kind of thing that will blow up at 2am six months from now" all the time.
Hard to have a sober talk about this, since a lot of the discourse is AI psychosis vs. AI naysayers. Does software quality seem to have taken a jump in the past few years to anyone? Not to me; it seems to be getting worse. I think that's a decent signal. I can tell you I'm dealing with a non-technical VP who loves blast-submitting vibe-coded PRs, and while there are some quick wins, overall quality is bad, and we had our first real production outage that Claude one-shot caused but could not one-shot solve.
This reminds me of Rich Hickey’s “Simple Made Easy” and his approach in making Clojure.
Even before LLMs generating entire programs, complex frameworks allowed developers to write the initial versions of programs very quickly, but at the cost of being hard to understand and thus hard to debug or modify.
Some of us are betting that the AIs will always be smart enough to debug, maintain and modify the programs written by AI, no matter how convoluted or complex. I’m not so sure.
"Just use autoresearch and it will fix your app's memory leaks in an hour" is what I was nonchalantly told by someone who has never written a line of code ever.
I guess what I relate to the most is how dismissive people get about real software engineering work.
I may have skill issues, but I have yet to reach the level of autonomous engineering people tend to expect out of AI these days.
I'm starting to long for the age after AI. When the generative euphoria has settled and all outputs are formally verified based on exquisite architectures and standards.
> When [...] all outputs are formally verified based on exquisite architectures and standards
and we all live in a green utopia of flying cars and peace upon the world.
if all the resources spent on useless wars were poured into working towards this goal, we would have been there for some time already
Sure, but we should probably plan for what’s actually going to happen
I like how you haven't wagered which exquisite architectures and standards. I am sure we will all agree on what they are and follow them the same way :)
Will never happen, for the exact reason that we’ve almost never done that for human output either.
it is required now, or all civilization collapses.
Civilization collapses unless people stop being short-sighted and greedy, trying to cut corners whenever possible?
I know which outcome I'd put my money on.
You're going to have to expand on this one.
They are expressing the idea that AI is so effective that it will make human work redundant, necessitating a decoupling of resource allocation from the performance of work.
I don’t agree, but that’s the thinking
Another argument for less human-like AI then, I guess.
That’s literally just software though.
There was not a renaissance to move back to Assembly when Java sucked. Instead more Java developers were created.
They are being developed, but it takes over a decade for this to happen normally
Can't come fast enough
Well, a 2000- or 2008-level financial crash is required for this. It is always at euphoric levels of delusion that such events occur.
...and it also needs the so-called AI companies to be present in the wreckage of this crash.
AI psychosis is undeniably real.
The entire stock market is undergoing AI psychosis.
This is the new normal. AI will continue to reduce the need for human workers until a Universal Basic Income is established.
At the end of the day robots can do the vast vast majority of jobs better and faster. If not now, very soon.
I only worry our economic systems won’t keep up
Because of the concerns you cite, I think working out the basic economic systems and incentives for paying people is a much more pressing concern than building magnificent machinery that we don't even own. There has been no effort on their end to demonstrate good faith nor to uphold their end of the social contract, which is why it's in our hands to demand the fundamentals to lead a life of dignity.
The exact same thing was meant to happen when the desktop computer became prevalent. Then the internet. Look at us now.
You’re forgetting the energy part of the equation.
Humans can already have 4 hour work week without productivity loss.
But I only see mass layoffs, and those who are still working are working longer and harder than before.
Most CEOs in my feed are convinced that AI makes people the equivalent of entire departments. AI should make your life easier, but instead it’s the opposite for a lot of people in the work force, which makes me really sad.
I think that’s called "hopium". Or wishful thinking, in less trendy language.
There's a lot of people writing bad code. With AI being forced top down (with the promise of turning people into 10x-ers), we're going to get a lot of people writing bad code 10x faster.
I really do worry - I especially worry about security. You thought supply chain security management was an impossible task with NPM? Let me introduce you to AI: you can look forward to the days of AI poisoning, where AIs will infiltrate, exfiltrate, or just destroy, and there's no way of stopping it, because you cannot examine the internals of the system.
AI has turbo charged people's lax attitude to security.
God help us.
The race to invent variants of Gas Towns, Ralph loops, pump out videos, blogs, etc. showing off greenfield development with cleverly named agents running in parallel is another case of engineering people diving head first into Resume Driven Development.
Sure, there are industry-changing things going on. But what if you're working on an app that's a decade old and has been through different teams of people, styles, and frameworks (thanks to JS-framework-a-week Resume Driven Development)? Some markdown docs and a loop of agents isn't going to help when humans have trouble understanding what the app does.
I have respect for Mitchell, and I've spent a good deal of time trying to think of ways to justify his message. I can't. Either I am missing a big piece, or he is worrying about something that comes naturally as more software gets developed (and sooner).
In any case, this is what blue-green deployments and gradual rollouts are for. With basic software engineering processes, you can make your end-user experience pretty much bulletproof. Just pay EXTRA attention when touching DNS, network config (for core systems), and database migrations.
Distributed systems are a bit trickier, but k8s and the like have pretty solid release mechanisms built in. You are still doomed if your CDN provider goes down. You just have to draw a line somewhere and face reality head on (for X cost per year, this is the level of redundancy we get, but it won't save us from Y).
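To make "gradual rollout" concrete, here is a minimal sketch of the control loop. The two helper functions are hypothetical stand-ins, not any real API; wire them to whatever your load balancer, service mesh, or metrics stack actually exposes.

    import time

    # Hypothetical helpers: stand-ins for your load balancer / metrics stack.
    def set_traffic_split(new_version_pct: int) -> None: ...
    def get_error_rate(version: str) -> float: ...

    def gradual_rollout(steps=(1, 5, 25, 50, 100), threshold=0.01, soak_secs=300) -> bool:
        """Shift traffic to the new version in steps; roll back on regression."""
        for pct in steps:
            set_traffic_split(pct)
            time.sleep(soak_secs)  # let the canary soak before judging it
            if get_error_rate("new") > threshold:
                set_traffic_split(0)  # instant rollback to the old version
                return False
        return True

The point is that the rollback decision is mechanical and happens before most users ever see the new version; no faith in fast after-the-fact fixes required.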
The one thing I hadn’t mentioned - one I AM worried about - is security! I’ve been worried about it from before Mythos (basic prompt injection) and with more powerful models now team offence is stronger than ever.
Yeah. The same processes that allow corporations to outsource their software to barely qualified 3rd-world body shops are the processes that allow you to deploy AI-generated code of unknown quality.
I don't think it's helpful to call this psychosis. Beyond that, I don't think it's even irrational.
It is definitely factual that there is a complete paradigm shift in the prioritization of quality in software. It's beyond just an AI side effect now; it's its own standalone thing.
There have always been many industries, companies, and products who are low on quality scale but so cheap that it makes good business sense, both for the producer and the consumer.
Many companies are definitely choosing this business strategy explicitly. Many others don't actually realize they are doing it implicitly.
Whether the market will accept the new software quality paradigm or not remains an open question.
Amazing how the dev community is suffering from a similar inability to approach the subject of real world AI efficiencies and business benefits. I don’t think it’s helpful to accuse the other side of psychosis. It disqualifies any data or experience they bring to the conversation.
It is not the dev community writ large, it is a particular archetype among forum users, particularly common among forums with upvote mechanics
Why do you all still submit twitter.com links when that domain does not even work?
I'd like to chime in and mention that it's really obvious how to RL a coding agent to get the human addicted ASAP, and it's also clear that there's a ton of $$$ to be made by doing this. Therefore, it's done. The only LLMs I use are the ones I run locally, because I know they aren't RL'ed for that metric (there's no incentive for the company that made them to make their open-weights models addictive).
Mitchellh is on to something. Some of the AI products I've seen seem like psychosis hallucinatory fever dreams, using terms and concepts that have no meaning. Funding? $50,000,000 pre-seed.
This is a critical communications issue that is becoming, I believe, the defining characteristic of "This Age": nobody knows how to discuss disagreement, and because it cannot even be discussed, communication ends, followed by blind obedience, forced bullying, retreat, and abandonment. This is going to be a hell of a ride, because nobody can really discuss the situation in a rational tone.
It's always funny to me that people don't realize full test coverage just means every line is hit, not that everything is correct. (I don't view this as an argument against tests, but with AI it's especially important: if you aren't careful, it'll be very happy to produce coverage that is not quite right.)
"no no, it has full test coverage"
at least at my BigCo, AI is being used for everything - writing slop, writing tests, code reviews, etc.
it would make sense to use AI for writing code, but human code review. or, human code, but AI test cases... or whatever combination of cross-checking, trust-but-verify, human in the loop, etc. people prefer.
i think once it gets used for everything, people have lost the plot, it's the inmates running the asylum.
I was rewatching Rich Hickey's "Simple Made Easy" talk (as one does) and there was a great line about full test coverage.
"What's true about all bugs in production? (pause for dramatic effect) They all passed the tests!" (well, he said typechecker but I think the point stands)
The only way many people learn that the stove is hot is by burning their hands on it.
Let them.
More like: how do you know when your charming partner is a catfish? Maybe two years in, when you're living in a friend's basement.
I don't doubt there are companies totally misusing coding agents and LLMs in production. There are also real companies with real revenue and solid architecture using LLMs to deliver products. There are also companies with real revenue and rapidly accumulating tech debt.
Eventually the companies that can't cope with undisciplined engineering will succumb to unacceptable reliability and be outcompeted, just like in the "move fast and break things" era.
Most labs are shilling “AI worker” dreams to these very companies
It seems the diagnosis of psychosis is too quick: it seeks to reestablish the expert frame for a developer identity that is being displaced.
“It feels like entire companies are deluded into thinking they don’t need me, but they still need me. Help!”
The broad sentiment across statements of this "AI psychosis" type is clear, but I think the baseline reality is simpler. How can you be so certain it's psychosis if you don't know what will unfold? Might reaching for the premature certainty of making others wrong, satisfying as that may be to the ego, simply be a way to compensate for the challenges of a changing work environment, and a substitute for actually considering the practical ways you could adapt to it? Might it not be more helpful and profitable to consider "how can I build windmills, ride this wave, and adapt to the changing market under this revolution" than to soothe yourself with the delusion that all these companies think they don't need you now, but they'll be sorry?
The developer role is changing, but it doesn't have to be an existential crisis, even though it may feel that way; it will probably feel more that way the longer you remain stuck in old patterns. Over-certainty about how things are doesn't help (though it may feel good). This is the time to be observant and curious and to get ready to update your perspective.
You may hide from this broad take (that AI psychosis statements are cope) by retreating into specific nuance: "I didn't mean it that way, you're wrong. This is still valid." But the vocabulary betrays motive. Resorting to clinical, derogatory language like "AI psychosis" immediately invokes a frame of superior expert judgment, and in the current zeitgeist that is a big tell. It signifies a need to be right and a deeply defensive pose, rather than a clear assay of what's real in a rapidly changing world. The anxiety driving the language speaks far louder than any technical pedantry used to justify it, and is the most important and, IMO, most profitable thing to address.
Just talked to an exec yesterday about their multinational company, where the newly-installed CEO just came in with "everyone needs to be using AI" and "we should be doing everything with AI".
I cautioned them that this is a terrible idea -- you have business people who don't know what they're talking about, and all they know is "if we don't 'do AI' we'll be left behind because our competitors are 'doing AI'" (whatever tf "doing AI" means).
Yes, LLMs are a great tool. But they're not like some magic bullet you stick into everything. Use it where it makes sense, and treat it like you would other tools.
You make "doing AI" some kind of KPI in your org, and you're going to have people "doing AI" amazingly (LOC counts! tokens burned! tickets cleared!) while not actually being more productive, and potentially building something that is going to come down on your head for the next team to "clean up the AI mess".
I have a ton of respect for Mitchell - I didn't really know who he was until Ghostty but his writings and viewpoints on AI seem really grounded and make the most sense to me. Including this one.
Many people on this forum are suffering under this same psychosis.
I'm guessing you've never heard of Hashicorp (Terraform, Vault) then? Mitchell == Hashicorp.
> "no no, it has full test coverage"
There’s this delusion that if we somehow write enough tests that we’ll expunge every defect from software. It’s like everyone forgets that the halting problem exists.
Fewer users can be the cause of fewer bug reports
Deprecating immature workflows (LLM agents in this case) is much simpler and faster than building them from scratch. Many companies get this risk assessment right: it's a case where being wrong is much more costly than being right.
I'm not convinced. There's a ton of cost to adopting a radically different workflow.
Pointing out the obvious.
A lot of companies have been under AI psychosis for years and will be forever.
The Twitter post doesn’t even document some of the most psychotic things that are happening.
If you don't use it you lose it, and a lot of people are losing it..
Hype & greed are a hell of a drug
Is he talking about github?
We're definitely in the mess around phase of AI adoption.
I don't think it's super clear what we'll find out.
We've all built the moat of our careers out of our expertise.
It is also very possible that expertise will be rendered significantly less valuable as the models improve.
Nobody ever cared what the code looked like. They only ever cared if it solved their problem and it was bug free. Maybe everything falls apart, or maybe AI agents ship code that's good enough.
Given the state of the industry, we're clearly going to find out one way or the other, hah!
> I don't think it's super clear what we'll find out
I think some companies will find out that their senior engineers were providing more value and software stability than they gave them credit for!
Corporate feedback loops are very slow though, partly because management don't like to admit mistakes, and partly because of false success reporting up the chain. I'd not be surprised if it takes 5 years or more before there is any recognition of harm being done by AI, and quiet reversion to practices that worked better.
> "no no, it has full test coverage"
i don't have enough fingers (and toes) to count how many times i've demonstrated that "100% coverage" is almost universally bullshit.
Codex is freakin' hot-to-trot to churn out test coverage for every single thing it implements, and some of it is very esoteric and highly prescriptive (regexes for days). BUT, after a while, it dawned on me that LLM-driven test coverage is less about proving "code correctness" (you're better off writing those tests yourself alongside them), and more about trying to ensure that whatever gets bolted on stays bolted on. For better or worse, obviously, since if you bolt on trash, trash you shall have.
Wholeheartedly agree, but in fairness, I trust the tests of the best AI models more than those of the average human developer. There's a lot of people around that combine high diligence with complete intellectual laziness, producing tons of useless tests.
Actually no, cancel that. I realise now that I trust AIs more than the average developer, period. At this point they do produce better code than most people I've dealt with.
Either this or we humans are out of the picture soon.
Occam's razor would suggest the former.
Sounds pretty accurate. A bunch of comments on this thread make AI sound like some kind of new doomsday cult. The most annoying thing, personally, is that all engineering principles are getting crushed by non-techies: management counting token usage, forcing agent use, reducing headcount in the name of productivity gains. Devs building bridges, but nobody knows what the bridge is, what standards it was built to, how it works, or how to maintain it. VCs counting extra money, claiming that chasing the holy profit is the future. The abundance of engineering apathy is disturbing.
Anyone who's taken VC funding has no choice. More money has been spent on AI commercialization than the atomic bomb, the US interstate build-out, the ISS and the Apollo program combined. Failure is going to be catastrophic and therefore, one tied to this ship cannot accept a world in which it fails.
Or anyone who even wants VC funding. 90+% of investors only want to invest in AI companies.
If you're not doing AI there's an incredibly limited pool of people who will give you $$$ ... and you're competing with EVERY OTHER NON-AI COMPANY for their attention.
On the bright side, my guillotine & rope startup is going to make a killing (no pun intended).
The entire problem is vibe coding is only good for demos, prototyping and finding signs of product market fit without actually releasing a product into the market.
You should not release a product into the market unless you have a good enough product that can keep you and your client compliant, safe and secure - including not leaking their customer info all over the place.
Prompt injection risk, etc. are massive for agentic AI without deterministic guardrails that actually work in practice.
Stop testing in production if you're shipping in a regulated industry. Ridic!
If you're not technical, you can get someone who is after signs of p-m fit, demos, but BEFORE deployment. This is common sense and best practices but startup bros dgaf because they're just good at sales and marketing & short term greedy.
Comical.
Welcome to the club, Mitchell! Pizza's to the right.
In all seriousness...well, yeah. AI is a monkey's paw, and that's how monkey paws work. So many movies and books warned us!
You just have to wish for the rest of the monkey.
This is... Not what psychosis means? Being wrong is not psychosis
being wrong and insisting on being wrong is
According to the DSM-5, delusion is a key criterion for diagnosing a psychotic disorder.
I saw this first hand at a company, and I think this is what happens when you combine FOMO with an utter lack of industry best practices. No one knows where they are going, but are convinced they are not getting there fast enough.
What's more, the only people they talk to about it are others at the same company. There is no external touchstone. There are power dynamics from hierarchy. No new ideas other than what is generated within the company. In other circumstances, this is a textbook environment for radicalization.
I would encourage all leadership to take a deep breath. You have time to think slow.
If you know these things you can take them into account while driving the AI.
Sorry, I don't buy your argument
I shut down AI Agent fanatics on the regular. But chop one head off there and two take its place. And I say that as someone working with Claude and Codex daily. While they are both incredibly good at clearly described and defined atomic tasks, application scope makes them lose their minds and the slop ensues.
Totally unrelated pet peeve of mine, I hate when people write this: "MTBF vs MTTR (mean-time-between-failure vs. mean-time-to-recovery)".
You first use the full words and then introduce the acronym that you're going to use in the rest of the text: "Mean Time Between Failures (MTBF) vs. Mean Time to Recovery (MTTR)".
With the latter, readers understand the term immediately, even if they don’t know the acronym. And they don't have to read these weird letters before getting the explanation.
The hype or psychosis comes mainly from the mediocre, the non-experts, the middle managers, you name it, especially when a person who never wrote a single line of code suddenly produces a wall of text, and it actually works!? Oh my!!
But in reality, anyone who knows their field and is going after a certain specific issue will soon find that AI is nothing but an assistant. Sure, it can help and automate some stuff, but that's it; you need to keep it leashed and laser-focused on that specific issue. I personally tried all the high-end ones, and I found a common theme: they are designed to find a solution or an answer no matter what, even if that solution is a workaround built on top of workarounds. It's like welding all sorts of connections between A and B, resulting in a fractal structure rather than just finding a straight path. If you let it keep going and flowing on its own, the results are convoluted and way overcomplicated, and not the good kind of complexity, the bad kind.
Saying the _quiet_ part out loud.
When war psychosis is not enough....
I work for a small telecom services provider whose current VP immediately set an AI course when stepping on board 6 months ago. Involving AI in everything and every task is now our first priority - across all employee segments, not just us system developers - and leadership is embarking on a program to measure employees' AI usage levels as a means to gauge everyone's individual efficiency. It's like the era of the evangelical crypto bros all over again.
I am really looking for more reasoned approaches to AI.
I am very close to using it as a pair programmer, but with me actually coding. I am just so tired of fixing its mistakes.
Isn't going to happen without the regulation hammer being thrown down.
Probably from the EU because they seem to be the sane ones of this generation.
Talking about my own personal workflow. No company has dictated one to me yet, lol.
I'm going through a mixed experience regarding this, personally.
Management is really pushing AI. It's obnoxious, and their idea of how it fits into my team's job specifically is completely, hilariously detached from reality. On the off chance someone says something reasonable, unless it fits the mold, it's immediately discarded. The mold being "spec driven development". We're not even a product team, for crying out loud. I straight up started skipping these meetings for the sake of my sanity. It's mindwash, and it's genuinely dizzying. The other reason I stopped attending is that it ironically makes me more disinterested in AI, which I consider to be against my personal interests in the long run.
On the flipside, I love using Claude (in moderation). It keeps pulling off several very nice things, some of which Mitchell touched on in this post (the last one):
- I write scripts and automation from time to time; Claude fleshes them out way better with way more safety features, feature flags, and logging than I'd otherwise have capacity to spend time on
- Claude catches missed refactors and preexisting defects, and does a generally solid pass checking for defects as a whole
- Claude routinely helps with doing things I'd basically never be able to justify spending time on. Yesterday, I one-shotted an entire utility application with a GUI to boot, and it worked first try; I was beyond impressed.
- Claude helped me and a colleague do some partisan cross-team investigation in secret. We're migrating <thing> and we were evaluating <differences>. There were a lot of them. Management was in limbo, unsure what to do, flip-flopping between bad options. In a desperate moment, I figured, hey, we kinda have a thing now for investigating an inhuman amount of stuff in detail - so I put together a care package for my colleague with all our code, a bunch of context, a capture of all the input data for the past week, and all the logs generated. My colleague put his team's side of the story next to it and, with the help of Claude, did some extremely nice cross-functional investigation. Over the course of a few weeks, he was able to confirm like a dozen showstopper bugs, many of which would have been absolutely fiendish if not impossible to fix (or even catch) if we went live without knowing about them. One even culminated in a whole-ass solution re-architecting. We essentially tore down a silo wall with Claude's help in doing this.
So ultimately, it really is a mixed bag, with some really deep low points and some really nice highlights. I also just generally find it weird that a technical tool [category] is being pushed down people's throats with technical reasoning, but by management. One would think this goes bottom up, or is at least a lot more exploratory. The frenzy is real.
What's the matter with spec-driven development? It probably carries de-risking and IP benefits.
Make the most of it. Their delusion is your opportunity.
'AI psychosis' is a slop concept.
https://xcancel.com/mitchellh/status/2055380239711457578
https://hachyderm.io/@mitchellh/116580433508108130
<https://twiiit.com/mitchellh/status/2055380239711457578> – will redirect to a currently-working Nitter instance.
Seems broken. It just throws up an anime cat girl for me.
Anubis is actually a jackal.
I stand corrected!
> anime cat girl
seems like it's working ideally to me!
Wait, are you calling me a bot, or are you just into anime cat girls?
im not calling you a bot lol
Assuming he's right, I don't see how that constitutes "psychosis", as opposed to being yet another of a billion examples of companies jumping on a bandwagon / cargo cult, and then learning they took it too far.
And also, he might not be right. But the good news is, we’ll all get to find out together!
I do not believe 'AI psychosis' is an actual thing.
https://www.yahoo.com/news/articles/applied-pope-losing-grip...
It is
https://psychiatryonline.org/doi/10.1176/appi.pn.2025.10.10....
That's a study. I can link you studies that say violent video games cause aggression, that porn causes rape, etc. Studies are products of the biases of the researchers.
Mitchell aches because his career has been solving broadly scoped problems by building a collection of thoughtful primitives for others to extend. LLMs seem to do the opposite but at great speed, and it hurts to watch.
Reading more, it seems part of his point is “if you’re making these primitives, it’s up to adopters to deploy, so mean-time-to-recovery isn’t that relevant.” Which is valid I guess.
But equally, like, do people need Terraform if they can just tell codex “put it live”, and does that hurt to see?
"its fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!"
Hmm, I agree with the point OP is making, but I'm not so sure this is the best supporting argument. The bottleneck is finding the bugs; if he'd criticized people saying AI will be the panacea for that, I'd be with him. But people saying agents are fast and good at fixing human-found bugs is nothing I'd object to.
Agents are fixing bugs so quickly and at a scale humans can't do already.
> Agents are fixing bugs so quickly and at a scale humans can't do already.
The metric is how many defects are introduced per defect fixed. Being fast is bad if this ratio is above one.
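A back-of-the-envelope sketch of why the ratio, not the speed, is what matters (the numbers are purely illustrative): if each fix introduces r new defects on average, draining a backlog of D defects takes a geometric series of fixes, which only converges when r < 1.

    def expected_total_fixes(initial_defects: float, r: float) -> float:
        """Expected fixes to drain a backlog when each fix spawns r new
        defects on average: D * (1 + r + r^2 + ...) = D / (1 - r) for r < 1."""
        if r >= 1:
            return float("inf")  # backlog never shrinks, no matter how fast you fix
        return initial_defects / (1 - r)

    print(expected_total_fixes(100, 0.5))  # 200.0: speed genuinely helps here
    print(expected_total_fixes(100, 1.1))  # inf: speed just burns tokens

Faster fixing shortens the wall-clock time per fix, but if r is at or above one, it only lets you churn through an unbounded backlog faster.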
The tweet is criticizing over-reliance on the "agents will fix it anyway" mindset.
The fact that we can fix things faster now doesn't mean that we should throw away caution and prevention. The specific point of his tweet is that we're seeing a lot of people starting to skip proper release engineering.
Agents are quick to fix bugs, yes, but it doesn't mean that users will tolerate software that gets completely broken after each new feature is introduced and takes a certain number of days to heal each time.
You got downvoted for speaking the truth. HN has a strong anti-AI contingent. They won’t concede until you can just ask Codex or Opus “find and fix all the bugs in this codebase”. We’re not there yet, but soon we will be. Then what?
More likely, people thought GP was missing the point; "MTTR-optimized YOLO deployment" only succeeds against errors that are recoverable, detected quickly, and whose downtime is acceptable. You could have a bug silently corrupting data for months, and that data may only be used by one critical process that runs once a quarter. So you could introduce a time bomb that can't be gracefully recovered from (depending on the nature of the data corruption).
So the point is not that agents cannot find bugs (they certainly can), it's whether you can shirk reviewing for bugs if MTTR is fast enough. There are circumstances where YOLO is appropriate, but they aren't the production environment of a mature application.
I don't think I missed the point, that is why I said I agree with the general point (and with what you said in your comment).
What I wanted to say is that the particular people who think "it's fine to ship bugs because the agents will fix them so quickly and at a scale humans can't!" are not the best argument for it.
But I won't die on this hill; maybe I'm just reading the sentence differently than others.
I think there is an implication in context that the people being discussed aren't being reasonable (that the claim is employed as a rationalization), but I agree with your take. I should've said, "the downvotes were more likely because GP was perceived as missing the point". (I didn't downvote your comment fwiw.)
> won’t concede until you can just ask Codex or Opus “find and fix all the bugs in this
But this is just holding the Slop Companies to the standard they declared themselves! Just recently, the CEO of OpenAI babbled some nonsense on Twitter about how he hands over tasks to Codex which, according to him, finishes them flawlessly while he plays with his kid outside.
> but soon we will be.
Ah yes, in the next 3-6 months, right? This time next year, Rodney, we'll be millionaires!
This doesn’t constitute AI psychosis. His argument is that we need to retain understanding of the systems we use, but there’s no compelling argument as to why that is the case. (I get that people are going to be offended by that statement, but agents are already better than the average software engineer. I don’t see why we need to fight this, except for economic insecurity caused by mass layoffs.)
It all just feels like horse drawn carriage operators trying to convince automobile drivers to stop driving.
If you want to draw that line of argument - it's more like horse riders being convinced to give up their horses in favour of trains: You're travelling faster, don't have to navigate yourself, or think about every boulder on the way; but there are destinations you can't go, overcrowded trains slowing down the journey, hefty ticket prices, and instead of enjoying the freedom, you're degraded to a passive passenger.
Very funny, this. Did we need forward deployed engineers to convince people that they absolutely needed to use trains in order to "not be left behind"? Or hype otherwise? Or was it sort of obvious and did not need to be explained so much, unlike a bad joke called LLMs?
Actually, absolutely! Initially, people were really afraid of trains, fearing they wouldn't be able to breathe at those speeds. It took a lot of convincing to establish trust in the technology.
Ever heard of subsidising? :’)
> there’s no compelling argument as to why that is the case.
I'm not sure that's true. We've actually seen several vibe-coded open source projects literally fold up and disappear because they ran into issues that the AI couldn't solve and no one understood the code well enough to fix them.
There's a reason OpenAI/Anthropic and friends are hiring shitloads of software engineers. You still need people who can understand and fix things when the AI goes off the rails, which happens way more often than any of those companies would like to admit. Sure, "fixing things" often involves having the AI correct itself, but you still have to understand the system well enough to know how and when to do that.
I am sure you will feel that this is missing the point of your analogy, but we would not have gotten very far with automobiles if we didn't know how they worked.
You are breaking the analogy, because automobiles are machines for transportation, and understanding them is important to make them move. LLMs are machines for understanding, and, well, if they do the understanding, you don't need to.
The thing we're worried about not understanding here is the software the LLMs write, not the LLMs themselves.
The direct analogy to automobiles would be for each automobile to be a one-off design filled with bad and bizarre decisions, excessively redundant parts, insane routing of wires, lines, ducts, etc., generally poor serviceability, and so on. IMO the big question going forward is whether the consistent availability of LLMs can render these kinds of post-delivery issues moot (they will reliably [catch and] fix problems in the software they wrote before any real damage is caused), or whether human reliance on LLMs and abdication of understanding will just make software worse, because LLMs' ability to fix their own mistakes, and the consequences thereof, generally breaks down in the same contexts/complexities where they made those mistakes in the first place.
My own observations are that moderately complex software written in the mode of "vibe coding" or "agentic engineering" tends to regress to barely-functional dogshit as features are piled on, and that once this state is reached, the teams behind it are unable to, or perhaps simply uninterested in, unfuck[ing] it. I have stopped using software that has gone down this path, not because I have some philosophical objection to it, but because it has become _literally unusable_. But you will certainly not catch me claiming to know what the future holds.
agreed completely