Ya, agreed. This makes me think that (long term) the AI race won't be won on the merits of individual models, but on pricing. I think Google has some strong advantages here because they know how to provide cheap compute, and they already have a ton of engineers doing similar things, so it's a marginal cost for them instead of having to hire and maintain whole devoted teams.
AI consumes entire data centers of compute. You aren’t tucking a few racks into a corner of a data center, you are building entirely new ones. There will be whole devoted teams.
But Google already builds data centers. Will there really be devoted AI-datacenter teams? Or will they just expand the normal datacenter teams, and ask them to use GPUs/TPUs instead of CPUs?
> As such it’s unclear what the prize at the end of the present race to the bottom is.
It's a market worth many billions, so the prize is a slice of that market. Perhaps it is just a commodity, but you can build a big company if you can take a big slice of that commodity, e.g., by building a good product (Claude Code) on top of your commodity model.
The revenue slice is there; the problem is that in a race to the bottom like the one we're in now, there isn't much profit at the bottom. And these companies desperately need profit to justify the gigantic capital spend and the depreciation tidal wave on the horizon. There's no clear path now where things don't just get really ugly pretty quickly.
I have curated my YouTube recommendations over the years. It knows my likes and dislikes very well. It knows a lot about me.
The same moat exists in interactions with Claude. Claude remembers so many of my preferences. It knows that I work in Python and pandas and starts writing code for that combination. It knows what type of person I am and what kind of toys I want my nephews and nieces to play with. These "facts" about the person are the moat now. Stack Overflow was a repository of "facts" about what worked and what didn't. Those facts, or user chat sessions, are now Anthropic's moat.
You are missing the correlations that Claude can derive across all these user sessions, across all users. In Google Analytics, when I visit a page and navigate around until I find what I was looking for (or don't find it), that session data tells website owners how to optimize. Even in Google search results, when I click on the 6th link and not the first, it sends a signal about how to rearrange the results next time, or even personalize them. That same paradigm will be applicable here. This is network effects, personalization, and ranking coming together beautifully. Once Anthropic builds that moat, it will be irreplaceable. If not, ask all users to jump from WhatsApp to Telegram or Signal and see how difficult it is. When Anthropic gives you the best answer without asking too much, the experience is 100x better.
The underlying technology is a thin layer of queryable knowledge/"memories" between you and the LLM, which in turn gets added to the context of your message. Likely RAG. It can be as simple as an agents.md that you give it permission to modify as needed. I really don't think they are correlating your "memories" with other people's conversations. There is no way for the LLM to know what is or isn't appropriate to share between sessions, at the moment. That functionality may exist in the future, but if you just export your preferences, it still works.
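A minimal sketch of that layer, assuming a flat local JSON store (the file name and helper functions are invented for illustration, not anyone's actual implementation):

    import json
    from pathlib import Path

    MEMORY_FILE = Path("memories.json")  # hypothetical local store

    def load_memories() -> list[str]:
        # The queryable "memories" layer: here, just a flat list of facts.
        return json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else []

    def remember(fact: str) -> None:
        # The assistant (or you) appends facts as they come up in conversation.
        facts = load_memories()
        if fact not in facts:
            facts.append(fact)
            MEMORY_FILE.write_text(json.dumps(facts, indent=2))

    def build_messages(user_message: str) -> list[dict]:
        # Prepend stored preferences to the context of every message.
        preamble = "Known user preferences:\n" + "\n".join(
            f"- {fact}" for fact in load_memories()
        )
        return [
            {"role": "system", "content": preamble},
            {"role": "user", "content": user_message},
        ]

    remember("Works mostly in Python and pandas")
    print(build_messages("Show me a groupby example"))

Export that one file and the "moat" moves with you.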
The moat - at this point in time - is really not as deep and wide as you are making it out to be. What you are imagining doesn’t exist yet. Indexing prior conversations is trivially easy at this point, you can do it locally using an api client right this moment.
Besides all that, you will be shocked at how quickly a new service can reconstruct your preferences. I started a new YouTube account, and it was basically the same feed within a few days.
In any case, my feeling is that we should have learned at this point not to keep our data in someone else’s walled garden.
> Besides all that, you will be shocked at how quickly a new service can reconstruct your preferences. I started a new YouTube account, and it was basically the same feed within a few days.
Because your location data, wifi name, etc. hone in on the fact that this is the same person as before. You are actually supporting my point rather than denying it.
Just having strict control over context management in a session is a nice differentiator. Shared tooling between desktop and CLI is nice too. They've differentiated enough.
> OpenAI, meanwhile, has been attempting to quell the backlash against its deal with the U.S. government, putting out a blog post claiming that “our tools will not be used to conduct domestic surveillance of U.S. persons,”
As a non-US person, that sounds far more concerning than no statement at all. Because if their tools weren't used for surveillance against Europeans they would have said so as a marketing message...
It's also meaningless because we know governments get around these "agreements" by buying data from third party companies that bought the data from OpenAI. The only way to stop this is to legislate it out of existence.
I wouldn't give them any free pass and just give up; it's highly amoral and inhuman behavior. A modern form of racism, but based on passport.
You have this one? You are subhuman, treated as such, and you have very limited rights on our soil; we can do nasty things to you without any court, defense, or hope for fairness. You have that one? Welcome back.
Sociopathic behavior. Then don't wonder why most of the world is again starting to hate the US with a passion. I don't mean the countries where you already killed hundreds of thousands of civilians, I mean the whole world. There isn't a single country out there currently even OK with the US; that's more than 95% of mankind. Why the fuck do you guys allow this? It's not even the current government, rather a long-term US tradition going back at least to 9/11.
> "We have these two red lines... Not allowing Anthropic's AI to perform mass surveillance of Americans, and prohibiting its AI from powering fully-autonomous weapons..."
Anthropic literally said the same, but seem to be getting positive PR.
The difference is that Anthropic actually dotted the i's and crossed the t's, whereas OpenAI fell for the weasel words and is now desperately trying to renegotiate.
OpenAI didn't fall for anything, they knew exactly what they were signing and went ahead anyway, then started gaslighting people about what they had signed.
For a lot of people (me included) the lack of integrity and the gaslighting is what has soured them on OpenAI, rather than them signing up to build surveillance and weaponry.
To non-US citizens, all AI companies are as dangerous as each other, OpenAI just really botched the optics here.
Executives are certainly capable of understanding moral/ethical concerns.
Around 2005, a Yale psychology PhD candidate asked me to write a web-based survey instrument with various questions, some on complex but straightforward business matters (the controls) and others with moral/ethical aspects. Senior executives participated, and they answered similarly to rank & file, often completing the entire survey much faster. What they didn't know: we were tracking how long they spent on each question. Questions with moral/ethical concerns took senior executives relatively longer than the rank & file.
Late Addendum: Sorry that I don't recall the author/paper. The survey population spanned multiple industries representing many Fortune 500s, including huge tech companies. The survey was the same for everyone. The questions were story problems from business and law school case reports. The participating companies were anonymized on our end. We provided HR departments with the survey link; only subject rank (not identity) was collected. The survey was voluntary, with informed consent per IRB approval.
You would also need to control for the degree to which people had a stake in the outcome (i.e., virtue signalling).
Since executives have to make decisions where choosing the moral option may impose an economic (or operational) cost, this requires thinking through the actual choice.
Morality for the "rank and file" is just a signalling issue: there's nothing to think through; the answer they are "supposed to choose" is the one they do choose, at no cost to them.
"Rank and file" employees choosing to prioritize morality very, very frequently pay real costs for doing so - with a much larger personal impact than executives feel.
Only in very rare circumstances where the obvious answer and their procedural work don't align.
When making an operational decision that affects the direction of the business, morality is almost always a concern -- even at the level of "do our customers benefit from this vs., do we?" etc.
Where do you get the idea that those circumstances are "very rare"? Workers are being asked to break rules and do unethical things all the time, and you're pretty much guaranteed to pay a personal cost if you refuse.
Meanwhile, morality is almost always one of the least important factors when making operational decisions.
This study showed executives spent relatively more time on questions with moral/ethical concerns. Perhaps the control questions were more similar to daily work and hence familiar, while questions with moral/ethical concerns were encountered less often. Perhaps executives decided more care was required on those questions to ensure people were not hurt.
Getting back to the grandparent post, executives are certainly aware of situations with moral/ethical concerns and need not consult their barber to answer them.
It helps a lot that Claude is just better. Codex isn't BAD, and in some narrow technical ways might even be more capable, but I find Claude to be hands-down the best collaborator of all the AI models and it has never been close.
Interesting to hear! I've had the completely opposite experience, with Claude having 5 minutes of peerless lucidity, followed by panicking, existential crisis, attempts to sabotage its own tests and code, and psyops targeted at making the user doubt their computer, OS, memory... Plus it prompts every 15 seconds, the alternative being YOLO mode.
Meanwhile codex is... boring. It keeps chugging on, asking for "please proceed" once in a while. No drama. Which is in complete contrast with ChatGPT the chatbot, which is completely unusable: arrogant, unhelpful, and confrontational. How they made both from the same loaf I dunno.
I wish I could get Claude to stop every 15 seconds. There's a persistent bug in the state machine that causes it to miss esc/stop/ctrl-f and continue spending tokens when there's a long running background task or subagent. There's a lot of wasted tokens when it runs for 10, 15, 20 minutes and I can't stop it from running down the wrong rabbit hole.
The following is a dramatic reenactment of an observed behaviour /discl.
You are making tool X. It currently processes test dataset in 15 seconds. You ask claude code to implement some change. It modifies code, compiles, runs the test - the tool sits in a 100% CPU busyloop. Possible reactions on being told there is a busy loop:
"the program is processing large amount of data. This is normal operation. I will wait until it finishes. [sets wait timeout in 30 minutes]."
"this is certainly the result of using zig toolchain, musl libc malloc has known performance issues. Let me instead continue working on the plan."
"[checks env] There are performance issues when running in a virtual machine. This is a known problem."
"[kills the program]. Let me check if the issue existed previously. [git stash/checkout/build/run/stash pop]. Previous version did not have the issue. Maybe user has changed something in the code."
Bonus episode: since the Claude Code "search" gadget is buggy, the LLM often gets empty search results.
"The changes are gone! Maybe user delete the code? Let me restore last commited version [git checkout]. The function is still missing! Must be an issue with the system git. Let me read the repository directly."
(Unrelated, but I'm really curious.) The above comment got downvoted within a few seconds of me pressing "reply". Is there some advanced Hacker News reader software that allows such an immediate reaction (via some in-notification controls)? Or is that a built-in site reaction? Or a sign of a bot? Because the speed was uncanny.
There's this bug in Claude Desktop where a response will disappear on you. When you're busy doing many things at once, you'll go back to the chat, and you'll be all "wait, didn't I already do this?" It's maddening and makes you question your own sanity.
I switched from ChatGPT Plus to Gemini Pro instead of Claude, since I'm a hobbyist and appreciate having more than just text chat and coding assist with my subscription (image gen, video gen, etc are all nice to have).
At first I found Gemini Code Assist to be absolutely terrible, bordering on unusable. It would mess up parameter order for function calls in simple 200-line Python. But then I found out about the "model router", a layer on top that dynamically routes requests between the flash and pro models. Disabling it and always using the pro model did wonders for my results.
There are however some pretty aggressive rate limits that reset every 24 hours. For me it's okay though. As a hobbyist I only use it about 2-3 hours per day at most anyway.
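To illustrate what that router seems to be doing (a guess for illustration, not Google's actual logic): something upstream scores each request and sends the "easy" ones to the cheaper flash model, along these lines:

    # Toy sketch of a flash/pro router; thresholds and signals are
    # invented for illustration, not Google's actual routing logic.
    def route_model(prompt: str, context_lines: int) -> str:
        looks_hard = (
            context_lines > 200
            or "refactor" in prompt.lower()
            or "debug" in prompt.lower()
        )
        return "pro" if looks_hard else "flash"

My simple Python presumably kept scoring as "easy" and landing on flash; pinning the pro model bypasses the classifier entirely.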
With Claude you just tell it to set up whatever it needs and you have smooth access to everything. Mine uses Nanobanana for image generation, Sora for video, Gemini for supplementary image processing, and so on. Setting up each one was 5-10 min of Claude's work.
With Gemini Pro on Antigravity you get a quota reset every 5 hours and access to Claude Opus 4.6. That's what I use at home and don't need anything else.
I've generally thought that, but lately I've been finding that the main difference is Claude wants a lot more attention than codex (I only use the CLI for either). codex isn't great at guessing what you want, but once you get used to its conversation style it's pretty good at just finishing things quietly, and its context management seems to handle itself very well; I rarely even think about it in codex. To me they're just... different. Claude is a little easier to communicate with.
codex often speaks in very dense technical terms that I'm not familiar with and tends to use acronyms I've not encountered so there's a learning curve. It also often thinks I'm providing feedback when I'm just trying to understand what it just said. But it does give nice explanations once it understands that I'm just confused.
Can you expand on that? I've been wanting to try Claude for a while, but their payment processing wouldn't take any of my credit cards (they work everywhere else, so it's not the cards). I've heard I can work around this by installing their mobile app or something, but it was extra hurdles, so I didn't try very hard.
And I've been absolutely amazed with Codex. I started using it with ChatGPT 5.3-Codex, and it was so much better than online ChatGPT 5.2, even sticking to single-page apps, which both can do. I don't have any way to measure the "smarts" of the new 5.4, but it seems similar.
Anyways, I'll try to get Claude running if it's better in some significant way. I'm happy enough with the Codex GUI on macOS, but that's just one of several things that could be different between them.
Codex is not bad, I think it is still useful. But I find that it takes things far too literally, and is generally less collaborative. It is a bit like working with a robot that makes no effort to understand why a user is asking for something.
Claude, IMO, is much better at empathizing with me as a user: It asks better questions, tries harder to understand WHY I'm trying to do something, and is more likely to tell me if there's a better way.
Both have plenty of flaws. Codex might be better if you want to set it loose on a well-defined problem and let it churn overnight. But if you want a back-and-forth collaboration, I find Claude far better.
> I've been wanting to try Claude for a while, but their payment processing wouldn't take any of my credit cards (they work everywhere else, so it's not the cards). I've heard I can work around this by installing their mobile app or something, but it was extra hurdles, so I didn't try very hard.
Not Claude Code specifically, but you can try the Claude Opus and Sonnet 4.6 models for free using Google Antigravity.
I’ve been juggling between ChatGPT, Claude and Gemini for the last couple of years, but ChatGPT has always been my main driver.
Recently did the full transition to Claude. The model is great, but what I really love is how they seem to have landed on a clear path for their GUI/ecosystem. The cowork feature fits my workflows really well, and connecting enterprise apps, skills, and plugins works smoothly.
Haven’t been this excited about AI since GPT 4o launched.
There’s a surge of demand for sure, but I’m not at all convinced that it’s at OpenAI’s expense. My bet is the non-swe folks caught wind the things got seriously good at a lot of boring office work, i.e. we’re seeing diffusion of AI into the wider economy.
Many people I know initially used ChatGPT for a while. Then after a while they went to Gemini, and again stuck with it for a while. And now they are dabbling with Claude.
Yep there really is no switching cost it seems.
People generally want something from a model and then leave. I think people are subconsciously forming relationships with tech firms such that they do not care about them; it's all about what the users themselves get. Generally there is no attachment. There are some examples of psychotic stuff, but that's thankfully the exception, not the norm.
That's why Apple cares deeply about its brand - it doesn't want to fall into that group of firms.
I've largely found codex and claude code to be about the same; however, codex tends to "think" harder and for longer, which, depending on the task, yields better results without too much steering.
On an unrelated note, UI is such a personal preference that it's impossible, beyond core pillars that have been studied for decades, to say one is better than the other. That being said, I like OpenAI's design system much better than Anthropic's. OpenAI's products (CLI and chat UI) "feel" nice and consumer focused, whereas Anthropic's products feel utilitarian and "designed for business".
I wonder if this is actually good for Anthropic. 2.5 million new customers sounds like good news for them, except these are mostly not paying customers. It seemed like they were positioning themselves to make money by selling coding agents with a subscription fee. If that free tier mostly exists to advertise their paid tiers, then this would be kind of a drag.
We are in this fascinating stage where tokens are nominally entirely fungible at a roughly equivalent intelligence level, yet at the same time there is huge market segmentation and differentiation in the non-tangible aspects of those tokens.
It's a fairly ridiculous conclusion to draw that these people are leaving ChatGPT because of their stance. I doubt OpenAI's actions play much role in the influx at all.
A couple of weeks ago, to huge numbers of people, ChatGPT was AI. The biggest public-perception shift from the DoD/DoW spat will be how many people now know that Claude exists at all, and the sense that they are being unreasonably punished by the government for taking a principled stance will only benefit them.
People have been made aware of a product, made aware that it's good enough that the government wants to use it. They have then been shown an archetypal underdog-against-the-government narrative. That makes almost a perfect storm for gaining customers.
When they actually use the thing and discover that it really is good, they will stay, and they will tell their friends.
At this rate they should be sending Hegseth a thank you card.
Well, the C-suite lied to everyone and was dealing in bad faith. When that came to light, they immediately lost support and interest from highly skilled researchers in the area. From that point onward, their only additional offerings would be whatever tech evangelists can rustle up: nice UI, some cool features, etc. But really groundbreaking stuff that takes clever engineering, or the kind of thinking that cannot be taught/approximated? Gone. So OpenAI, despite its massive head start, will just continue to fall behind. When you're smart enough, money stops mattering beyond keeping a roof over your head and food in your mouth. At that point your world view and personal beliefs become far more valuable, and smart people always come to the conclusion that violence is never worth it, be it physical, informational, social, emotional, whatever. OpenAI is an incredibly violent company, so it inherently scares off talent.
I was paying both of them $200+/mo, and I went down to only paying Anthropic $200/mo.
My experience has, for a few months, been that OpenAI's models are consistently quite noticeably better for me, and so my Codex CLI usage had been probably 5x as much as my Claude Code usage. So it's a major bummer to have cancelled, but I don't have it in me to keep giving them money.
I'd love to get off Anthropic too, despite the admirable stance they took, the whole deal made me extra uncomfortable that they were ever a defense contractor (war contractor?) to begin with.
I left the OpenAI platform long before this, because I expected things like this. A few called me alarmist but are now also jumping ship because of this. OpenAI has zero moral or ethical substance, and people _do_ care about that. I'm extreme enough that joining OpenAI after a specific date works against you and your CV, not with/for you, while leaving at a specific date speaks volumes in your favour. People are the sum of their actions, not their words, and siding with / continuing to use OpenAI speaks volumes about who you are.
Is there any news about how Gemini fares in this debate? I suppose they're fine with total mass surveillance ("we already do that anyway") and creating kill bots, but is there any official stance? I find it hard to believe Alphabet would not pursue US government contracts.
Didn't Anthropic hire the infrastructure head from Stripe and give him a CTO title? I would've thought that would help bring stability, but if anything, things have become worse.
It's funny how the false choice of American politics (Red vs Blue) also makes it into its consumerist corporatist life. That Anthropic's threadbare "limits" on government usage are seen as a heroic stand is a testament to just how far the goalposts on "ethical" deployment of AI have moved to the (fascist) right. As ever, politics precedes technology. We have Reagan's internet, we will have Trump's AI. God help us.
I'm not sure what message this comment is trying to convey beyond throwing in the "corporatist" and "consumerist" signalling buzzwords, followed by calling the right fascists.
I've literally never heard anybody call the Internet "Reagan's internet", the best I can do is the Al Gore quote and who's calling anything Trump's AI?
These are the points I'm making, which I think are fairly one-to-one with my original comment:
- American politics presents a false choice between Democrats and Republicans.
- America is both a consumerist and corporatist society.
- Anthropic asked for minimal limits on AI usage.
- People view Anthropic's stand as heroic, while viewing OpenAI as villainous.
- The false choice between Anthropic and OpenAI mirrors the false choice in American politics.
- People at OpenAI, Anthropic, and elsewhere used to view ethical deployment of AI as paramount, but those goalposts have shifted as financial and political incentives changed.
- Specifically, the ethics of AI have become conveniently synonymous with the current financial and political moment.
- The current political moment is fascist.
- Technology is broadly neutral; it is politics that primarily dictates how technology is actually used and deployed, and therefore its broad impacts.
- The internet was developed in the neoliberal era, which began with the election of Ronald Reagan and extended through the Obama presidency.
- The structure and dynamics of the internet over the last 30 years are more reflective of neoliberal politics than of anything inherent in the technology: extreme privatization and the refusal to use public institutions to provision or regulate public goods.
- AI is being developed in a new political era, begun with the first Trump presidency and taking fuller shape under the second.
- We are likely to find that AI's trajectory is similarly dictated largely by politics rather than anything inherent to the technology.
- With this political era being fascist and explicitly neo-imperial/neo-colonial, I fear for the technology's impact on humanity.
I really enjoyed using Claude, but the ever-changing limits and weird policies (limited to Claude Code, you can't run OpenClaw, etc.) made switching a very easy choice.
OpenAI simply provides more value for the money at the moment.
You're totally allowed to use Claude for OpenClaw and you're totally able to use Claude Code with non-Anthropic models. You must be referring to the fact that you have to use an API key and cannot use the auth intended for Claude-only products, which AFAIK is the same at every AI company (with Google destroying whole Google accounts for offenders most recently).
Used codex CLI (5.4) for the first time (had never used codex or GPT for coding before; was using Opus 4.5 for everything), and it seems quite good. One thing I like is that it's very focused on tests: it will just start setting up unit tests for specs without you asking (whereas Opus would never do that unless you asked). I like that and think it's generally good. One thing I don't like about GPT, though, is that it pauses too much between tasks, even when the immediate plan and the broader plan are already extremely well defined in agents.md. It says "the next logical task is X", and I say "yeah, go ahead", when I'd rather it just proceed. I suppose that is a preference that should be put in some document? (agents.md?)
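Something like the following is what I'd try in agents.md (the wording is just a guess; how strictly codex honors it will vary):

    ## Workflow preferences
    - When a task is complete and the next task is already defined in this
      file or the current plan, proceed without pausing for confirmation.
    - Only stop to ask when requirements are ambiguous or the action is
      destructive (deleting files, rewriting history, etc.).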
well, I have a running model (ha!) in my head about the frontier providers that's roughly like this:
- chatgpt is kinda autistic and must follow procedures no matter what, and writes in some bland, soulless but kinda correct style. great at research, horrible at creativity, slow at getting things done but at least getting there. good architect, mid builder, horrible designer/writer.
- claude is the sensitive diva that can produce really elegant code but has to be reminded of correctness checks and quality gates repeatedly, so it arrives at something good very fast (sometimes in one shot) but then loses time in correction loops and "those details". great overall balance, but permanent helicoptering needed or else it derails into weird loops.
- grok is the maker, super fast and on target, but doesn't think as deeply as the others; it's entirely goal/achievement focused and does just enough to get there. uniquely, it doesn't argue or self-monologue constantly about doubts or safety or ethics, but drives forward where others struggle, and faster than they do. it cannot concentrate for too long, but delivers fast. tons of quick edits? grok it is. "experimental" stuff that is not safe to talk about... definitely grok.
- gemini is whatever you quickly need in your GSuite, plus looking at what others are doing and helping out with a sometimes different perspective, but beyond that it's worse than all the others.
- kimi: currently using it on the side, not bad at all so far, but nothing distinct has crystallized in my head yet.
Tried using 5.4 xhigh/codex yesterday with very narrow direction to write Bazel rules for something. This is a pretty boilerplate-y task with specific requirements. All it had to do was produce a normal rule set s.t. one could write declarative statements to use them, just like any other language integration. It gave back a dumpster fire, just shoehorning specific imperative build scripts into Starlark. Asked Opus 4.6 and got a normal, sane ruleset.
5.4 seems terrible at anything that's even somewhat out-of-distribution.
I got it to build a stereoscopic Metal raytracing renderer of a tesseract for the Vision Pro in less than half a day.
It surprisingly went at it progressively, starting with a basic CPU renderer, all the way to a basic special-purpose Metal shader. Now it's cutting its teeth on adding passthrough support. YMMV.
The limits are what did it for me. They kept boasting about Opus performance and improvements, practically begging me to try it out, and when I did, it totally obliterated my usage. I'm sure its good, but I stick to Sonnet because I've been burned bad. Never had that problem with ChatGPT, but it turns out they're just unprincipled and evil, which is a shame.
I tend to use LLMs more for research than actual coding, so I ended up going with GPT over Claude because its chat interface just seems to work better for me. That balances out Claude being slightly better at software tasks.
Google seems to be on a hot streak with their models, and, since they're playing from behind, I'd expect favorable pricing and terms. But, I don't know anyone who is using or talking about Gemini. All the chatter seems to be Anthropic vs. OpenAI.
because gemini, despite what the stats say, still produces garbage once the problem gets harder. it nails it under lab conditions, but in messy reality its creativity and even code quality are a far cry from opus or the latest gpt5.4, by a long shot. and always have been. it's pretty good inside GSuite because of the integrations, but standalone it's near worthless compared to even grok-code-fast, which doesn't think much at all (but damn, it is fast). at this point google keeps throwing noodlepots of AI against every wall in reach to see what sticks, which is more a kind of desperation that still works to boost wall street highscores, but not exactly a streak or a breakthrough. just rapid-fire shotgun launches to see if anything sticks. no one serious talks gemini because it's still not even worth considering for real things outside shiny presentations and artificial benchmarks.
Gemini schools the other two when doing code reviews.
I used to think tokens were a commodity, but it's becoming clear that the jagged frontier differs enough, even for the easiest use case of SWE, that there's room for two if not three providers of different foundation models. It isn't winner-takes-all; they're all winning together. Cursor isn't properly taking advantage of the situation yet.
My experience exactly. The more "real" the problems become, the more the other models become unsuitable compared to claude, with the sole exceptions being deepseek/kimi, which, while strictly speaking not better w.r.t. metrics and basic tasks, are more interesting and handle odd, totally out-of-domain stuff better than the US models. An example: code I wrote for a hypercomplex sedenion-based artificial neural network broke claude so badly it started saying it is chatgpt and can't evaluate/run code. Similar experience with all US models, which are characterized by being extremely brittle at the fringes, though claude least among them. Meanwhile the Chinese models are less capable at cookie-cutter stuff but keep swinging when things get really weird and unusual. It's like the US models optimize for the lowest minima achievable, and god help you if the distribution changes. Chinese models, on the other hand, seem to optimize for the flattest minima, giving poorer quality across the board but far more robust behaviour.
What a baffling comment. Aren’t you aware of why this exodus is happening? (It’s not related to “value for the money”!) What are your feelings on that part?
Whatever Anthropic might or might not do with the Department of War interests me in proportion to how much I can influence it. Rounded, speaking as a European citizen, that appears to be exactly 0 to me.
Ever tried living while simultaneously deciding to only patronize groups that strictly align morally and ethically with your own personal beliefs?
I would love to, but a practical look at that concept shows it's practically impossible.
My $0.02: Claude was already involved in underhanded shit I don't want a part of[0], and that generated little ethical response from Anthropic. I've had better luck as a $200/mo-tier customer with ChatGPT, and I don't really think that Dario claiming their newest LLM is conscious[1] on a market schedule is all that ethical, either.
Why paint the choice as black and white? Most people are doing the best they can morally, even if they don't get it 100% right. Even living 60% in accordance with your values is better than 50%. Likewise, bucketing organizations as good or bad misses the same nuance. Choosing something that is slightly better has positive consequences despite it not being 100% good.
Not the poster, but I guess that's kinda American thinking that actually believes voting with your wallet will make any difference in this late-stage crony capitalism in a post-facts world.
Realistically: AI WILL get used in the military and for killing autonomously, like it or not, believe it or not. I am also against that in principle, but I accept the fact that my opinion just doesn't matter, and practice radical acceptance of reality as-is. twitter/X is also alive and kicking, despite musk and the anti-musk hate. xAI/Grok is genuinely really good too compared to OAI/Claude, a bit different but very good. At this point all the "outcries" feel like noise I just skip on principle. But it could turn up the fire under the OAI team to go aggressive feature/pricing-wise in order to retain/increase their userbase again, which is... good, after all.
If anyone thinks Anthropic or OpenAI are the "good guys," they've already lost the plot. If you look at additional reporting on the topic, not just the Anthropic PR spin, the disagreements were much more nuanced than Anthropic portrayed. They aren't exactly a reliable narrator on the topic either. In fact, it seems like Amodei fumbled the deal and crashed out a bit. He's already walked back his internal memo and is reportedly still seeking a deal with the Pentagon. I don't trust either CEO. I use their products, but if you're even leaning 51-49 on who is "less evil," I think you're giving too much slack.
All this demonstrates how non-sticky all this tech really is. When your product is basically just an API call it’s trivial to just swap you out for someone else. As such it’s unclear what the prize at the end of the present race to the bottom is.
We swapped OpenAI out for Claude and it required updating about 15 lines of code. All these guys are just commodity to us. If next week there’s a better supplier of commodity AI we’ll spend an hour and swap to something else again. There’s zero loyalty here.
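For the curious, those ~15 lines are roughly a config change behind a thin wrapper. A sketch of the shape of it (base URLs, model names, and the OpenAI-compatible endpoint are assumptions; check each provider's docs):

    import os
    from openai import OpenAI  # openai>=1.0 style client

    # Swapping providers is mostly editing this dict. Values here are
    # illustrative; verify the compatibility endpoint and current model IDs.
    PROVIDERS = {
        "openai": {
            "base_url": "https://api.openai.com/v1",
            "api_key": os.environ["OPENAI_API_KEY"],
            "model": "gpt-4o",
        },
        "anthropic": {
            "base_url": "https://api.anthropic.com/v1/",  # OpenAI-compatible layer
            "api_key": os.environ["ANTHROPIC_API_KEY"],
            "model": "claude-sonnet-4-5",
        },
    }

    def complete(provider: str, prompt: str) -> str:
        cfg = PROVIDERS[provider]
        client = OpenAI(base_url=cfg["base_url"], api_key=cfg["api_key"])
        resp = client.chat.completions.create(
            model=cfg["model"],
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    print(complete("anthropic", "Say hi"))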
It's an ironic situation; logically, the moat should be the models, which cost hundreds of millions in investment to train and operate, so it would make sense if we saw different providers focusing in different directions.
But right now we have 3-5 top contenders that are so evenly matched that the de facto sticking point is mostly the harness, i.e. the collection of proven plugins/commands/tools/agent features that are tuned to the user's personal workflow.
> are so evenly matched
It's because the real value of the models is in what we (humanity) fed them, and all of them have eaten the same thing for free.
To be more precise, they all stole the same stuff. I have no empathy for these crooks.
That's why the frontier LLM companies are now spending a lot more to license exclusive proprietary training data from private sources in order to gain a quality edge in certain business domains.
But those holding said proprietary data have figured out they’re holding the cards now and have gotten a lot smarter recently. Companies are being very careful about what gets used for inference vs what they allow to be used for training.
I don’t see the core models getting dramatically better from where they are now. We’ve clearly hit a plateau.
Really? I mean, I regularly see as I'm coding how much better it could be, simply by running the obvious prompts for me.
When I use planning mode and then code, the success rate is much higher. When I ask it to work on specific, isolated chunks of code with clear success/failure modes, the success rate is again much higher.
Now imagine a world where it recognizes that from my simple, throwaway, non-specific prompt. If it were able to fire off 20 different prompts in quick succession, it could easily cut my time spent in front of the screen by a third.
The patterns are obvious but they don't do that right now because it's a lot of compute.
We'll look back at this time, when there's a progress bar showing context space, the way we look at the Turbo button.
Because the truth is, getting to the baseline I'm talking about takes a finite amount of compute at a certain point.
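The mechanics of that fan-out are already trivial; what's missing is the model deciding to do it on its own, and someone paying for the tokens. A toy sketch, with ask_model standing in for whatever real API you'd call (hypothetical helper):

    import asyncio

    async def ask_model(prompt: str) -> str:
        # Stand-in for a real API call; swap in your client of choice.
        await asyncio.sleep(0.1)
        return f"result for: {prompt}"

    async def fan_out(task: str) -> list[str]:
        # Expand one vague prompt into several sharper ones, then run them
        # concurrently. The expansion step is where the intelligence (and
        # the compute bill) would live.
        variants = [
            f"Write a short plan for: {task}",
            f"List edge cases for: {task}",
            f"Implement the smallest isolated chunk of: {task}",
            f"Write tests with clear pass/fail criteria for: {task}",
        ]
        return list(await asyncio.gather(*(ask_model(v) for v in variants)))

    print(asyncio.run(fan_out("add retry logic to the uploader")))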
So can it be the one that gets ahead on having people go find things for them? https://news.ycombinator.com/item?id=47285283
Interesting
That sounds like spin to me. If there were a clear "quality edge" in "certain business domains" stemming from "exclusive proprietary data", someone would have been exploiting it already using meat computers.
But no, businesses are dumb. They always have been. Existing businesses get disrupted by new ideas and new technology all the time. This very site is a temple to disruption!
Proprietary advantage is, 99.999% of the time, just structural advantage. You can't compete with Procter & Gamble because they already built their brands and factories and supply chains and you'd have to do all that from scratch while selling cheaper products as upstart value options. And there's not enough money in consumer junk to make that worth it.
But if you did have funding and wanted to beat them on first principles? Would you really start by training an LLM on what they're already doing? No, you'd throw money at a bunch of hackers from YC. Duh.
Frontier labs are paying the same constellation of firms offering proprietary data and access to experts in their fields to train LLMs.
They are neck-and-neck only because they are all participating in the arms race. The only other way to keep up is mass distillation, which could prove to be fragile (though so far it seems sustainable).
Meh. I think there's basically no benefit shown so far to careful curation. That's where we've been in machine learning for three decades, after all. Also recognize that the Great Leap Forward of LLMs was when they got big enough to abandon that strategy and just slurp in the Library of All The Junk.
I think one needs to at least recognize the possibility that... there just isn't any more data for training. We've done it all. The models we have today have already distilled all of the output of human cleverness throughout history. If there's more data to be had, we need to make it the hard way.
Ok, maybe pretraining is now complete and solved. Next up: post-training, reinforcement learning, engineering RL environments for realistic problem solving, recording data online during use, then offline simulation of how it could have gone better and faster, distilling that into the next model etc. etc. There's still decades worth of progress to be made this way.
" There's still decades worth of progress to be made this way."
That's not true. Moreover, progress can slow to a crawl where it's barely noticeable. And in that world humans continue to stay ahead; that's the magic of humans: being aware of our surroundings and adapting sufficiently while taking advantage of tools and leveraging them.
This is an interesting theoretical statement that does not survive a collision with reality. The long-tail expert RLHF training is effective. We have seen significant employment impact on call center employees. This does not mean progress will be cheap or immediate.
I think this is where we are at, too.
But if you say stuff like this on here you get downvoted. Why?
The quality edge hasn't shown up yet. If this strategy actually works then the quality improvements will only become apparent in the next round of major LLM updates. There's a lot of valuable training data locked up behind corporate firewalls. But this is all somewhat speculative for now.
To stop this, today I put most of my Amazon Redshift research website behind a basic-auth username/password wall.
It all remains free, but you need to email me for a username and password.
If I put in time and effort to make content and OpenAI et al copy it and sell it through their LLM such that no one comes to me any more, then plainly it makes no sense for me to create that content; and then it would not exist for OpenAI to take, or for anyone else. We all lose.
It seems parasitic.
An AI is more likely than me to take the time to send you an email requesting access - I'm too lazy.
I think a better approach would be to have a login form and just say "the password is 1234" or whatever.
Virtually no scraper has logic to handle that sort of situation, but it's trivial for humans. Way easier than an LLM
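A minimal sketch of that gate, here using Flask (purely illustrative; any framework would do):

    from flask import Flask, redirect, request, session

    app = Flask(__name__)
    app.secret_key = "change-me"  # needed for the session cookie

    GATE = """
    <form method="post">
      <p>The password is 1234.</p>
      <input name="pw"><button>Enter</button>
    </form>
    """

    @app.route("/gate", methods=["GET", "POST"])
    def gate():
        # Trivial for a person reading the page; a generic crawler has no
        # reason to parse the instruction and submit the form.
        if request.method == "POST" and request.form.get("pw") == "1234":
            session["human"] = True
            return redirect("/")
        return GATE

    @app.route("/")
    def content():
        if not session.get("human"):
            return redirect("/gate")
        return "The actual content."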
Not true, even Windows Defender is capable of extracting "the password is 1234" from context like emails or webpages.
Please add Internet Archive's bot to your auto-allows, at least. Their bot is presumably well behaved, and for public benefit.
I'm about to ask IA to remove my content!
The reason is that I expect LLM bots to be crawling IA.
Ironic indeed. The Great Replacers of white collar jobs are finding themselves easily replaceable. Delicious.
Cost is never a good moat.
The companies migrating off VMware due to Broadcom shittiness would disagree with you.
https://arstechnica.com/information-technology/2026/02/most-...
CloudBolt’s survey also examined how respondents are migrating workloads off of VMware. Currently, 36 percent of participants said they migrated 1–24 percent of their environment off of VMware. Another 32 percent said that they have migrated 25–49 percent; 10 percent said that they’ve migrated 50–74 percent of workloads; and 2 percent have migrated 75 percent or more of workloads. Five percent of respondents said that they have not migrated from VMware at all.
Among migrated workloads, 72 percent moved to public cloud infrastructure as a service, followed by Microsoft’s Hyper-V/Azure stack (43 percent of respondents).
Overall, 86 percent of respondents “are actively reducing their VMware footprint,” CloudBolt’s report said.
It is easier to do in the cloud than with actual hardware, though, because you'll need enough hardware to do the migration. There is a capital moat around that.
I feel like the company that can figure out how to 100% safely live-migrate any VMware workload to another "cheaper" solution will do quite well.
The moat is compute.
In my case, I always use Opus 4.6 in my work, but quite often I get a 504 error, and that's quite annoying. I get errors like that with Gemini too. I can't estimate if I'd get a similar number of errors with ChatGPT, since I use it very infrequently.
But imagine that at some point one of the big 3 (OpenAI, Anthropic, Google) gets very high availability, while the others have very poor availability. Then people would switch to them, even if their models were a bit worse.
Now, OpenAI has been building like crazy, and contracting for future builds like crazy too. Google has very deep pockets, so they'll probably have enough compute to stay in the game. But I fear that Anthropic will not be able to match OpenAI and Google in terms of datacenter build, so it's only a matter of time (and not a lot of time) until they'll be in a pretty tight spot.
> All these guys are just commodity to us.
Just want to note something there:
Okay, premise that AI really is 'intelligent' up to the point of business decisions.
So, this all then implies that 'intelligence' is then a commodity too?
Like, I'm trying to drive at the idea that yours, mine, all of our 'intelligence' is now no longer a trait that we hold, but a thing to be used, at least as far as the economy is concerned.
We did this with muscles and memory previously. We invented writing, and so those with really good memories became just like everyone else. Then we did it with muscles in the industrial revolution, and so really strong or high-endurance people became just like everyone else. Yes, there are many exceptions here, but they mostly prove the rule, I think.
Now it seems that, for really smart people, we've made AI, and so they're going to be like everyone else?
Well as of right now, mathematically and scientifically, the way an LLM works has nothing to do with how the human brain works.
Neither does a pneumatic piston operate at all like a bicep, nor does an accounting book operate at all like a hippocampus. But both have taken enough of the load off those tissues that you'd be crazy to use the biological specimen for 99% of commercial applications.
A bicep and a piston both push and pull things, but an AI cannot do what a smart brain can, so I don't think being smart will stop being an advantage. I mean, someone has to prompt the AI after all. The mental ability to understand and direct them will be more important if anything.
Have you worked with the Claude agents a lot? They essentially prompt themselves! It's crazy.
My meaning is not so much that intelligence will go away as a useful trait for individuals, but more that its utility to the economy will be a commodity, with grades and costs and functions. But again, I'm speculating out of my ass here.
In that, if you want cheap enough intelligence or expensive and good intelligence, you can just trade and sell and buy whatever you want. Really good stuff will be really expensive of course.
Like, you still need to learn to write and have that discipline to use writing in lieu of memory. And you still need to repair and build machines in lieu of muscles and have those skills. Similarly I think that you'll still need the skills to use AI and commoditized intelligence, whatever those are. Empathy maybe?
The way this thing "looks like a duck, swims like a duck, and quacks like a duck" has nothing to do with the way a real duck "looks like a duck, swims like a duck, and quacks like a duck".
Who cares, as long as the end results are close (or close enough for the uses they are put to)?
Besides, "has nothing to do with how the human brain works" is an overstatement.
"The term “predictive brain” depicts one of the most relevant concepts in cognitive neuroscience which emphasizes the importance of “looking into the future”, namely prediction, preparation, anticipation, prospection or expectations in various cognitive domains. Analogously, it has been suggested that predictive processing represents one of the fundamental principles of neural computations and that errors of prediction may be crucial for driving neural and cognitive processes as well as behavior."
https://pmc.ncbi.nlm.nih.gov/articles/PMC2904053/
https://maxplanckneuroscience.org/our-brain-is-a-prediction-...
But the end results aren’t actually close. That is why frontier LLMs don’t know you need to drive your car to the car wash (until they are inevitably fine-tuned on this specific failure mode). I don’t think there is much true generalization happening with these models - more a game of whack-a-mole all the way down.
The human doesn't just predict. It predicts based upon simulations that it runs. These LLMs do not work like this.
If you're able to predict, you're able to simulate.
So? Does a submarine swim?
> So, this all then implies that 'intelligence' is then a commodity too? Like, I'm trying to drive at the idea that yours, mine, all of our 'intelligence' is now no longer a trait that I hold, but a thing to be used, at least as far as the economy is concerned.
This is obviously already the case for the level of intelligence required to produce blog-post and article slop, generate coding-agent-quality code, do mid-level translations, and things like that...
> someone else
We have basically 4 companies in the world one can seriously consider, and they all seem to heavily subsidise usage, so under normal market conditions not all of them are going to survive.
OpenRouter shows that commodity API providers have figured out how to do this unsubsidized.
The training runs aren’t priced in, but the cost of inference is clearly pretty cheap.
Ya, agreed. This makes me think that (long term) the AI race won't be won on the merits of individual models, but on pricing. I think Google has some strong advantages here because they know how to provide cheap compute, and they already have a ton of engineers doing similar things, so it's a marginal cost for them instead of having to hire and maintain whole devoted teams.
AI consumes entire data centers of compute. You aren’t tucking a few racks into a corner of a data center, you are building entirely new ones. There will be whole devoted teams.
But Google already builds data centers. Will there really be devoted AI-datacenter teams? Or will they just expand the normal datacenter teams, and ask them to use GPUs/TPUs instead of CPUs?
> As such it’s unclear what the prize at the end of the present race to the bottom is.
It's a market worth many billions so the prize is a slice of that market. Perhaps it is just a commodity, but you can build a big company if you can take a big slice of that commodity e.g. by building a good product (claude code) on top of your commodity model.
The revenue slice is there; the problem is that in a race to the bottom like the one we're in now, there isn't much profit at the bottom. And these companies desperately need profit to justify the gigantic capital spend and the depreciation tidal wave that's on the horizon. There's no clear way now that things don't just get really ugly pretty quickly.
The entire point of a race to the bottom is that your competitors keep reducing their prices until those billions disappear
Unfortunately this is why Anthropic is so aggressive about preventing Claude subscriptions from being used with other tools.
According to this article, they can't even service the amount of paying customers that they have.
They should put their prices up then.
Let me explain a possible moat with an example.
I have curated my youtube recommendations over the years. It knows my likes and dislikes very well. It knows a lot about me.
The same moat exists in interactions with Claude. Claude remembers so many of my preferences. It knows that I work in Python and Pandas and starts writing code for that combination. It knows what type of person I am and what kind of toys I want my nephews and nieces to play with. These "facts" about the person are the moat now. Stack Overflow was a repository of "facts" about what worked and what didn't. Those facts, or user chat sessions, are now Anthropic's moat.
It takes about 30 seconds to export all that into a file and take your history elsewhere. There’s no moat there.
“Hey Claude, write out a markdown file of all of my preferences so any AI agent can pick up where you left off”
In fact, here, I'll do it myself.
You are missing the correlations that Claude can derive across all these sessions, across all users. In Google Analytics, when I visit a page and navigate around till I find what I was looking for (or don't), that session data is important for showing website owners how to optimize. Even in Google search results, when I click on the 6th link and not the first, it sends a signal about how to rearrange the results next time, or even personalize them. That same paradigm will be applicable here. This is network effects, personalization, and ranking coming together beautifully. Once Anthropic builds that moat, it will be irreplaceable. If you doubt it, ask all users to jump from WhatsApp to Telegram or Signal and see how difficult it is. When Anthropic gives you the best answer without asking too much, the experience is 100x better.
The underlying technology is a thin layer of queryable knowledge/"memories" in between you and the LLM, which in turn gets added to the context of your message to the LLM. Likely RAG. It can be as simple as an agents.md that you give it permission to modify as needed. I really don't think they are correlating your "memories" with other people's conversations. There is no way for the LLM to know what is or isn't appropriate to share between sessions, at the moment. That functionality may exist in the future, but if you just export your preferences, it still works.
The moat - at this point in time - is really not as deep and wide as you are making it out to be. What you are imagining doesn’t exist yet. Indexing prior conversations is trivially easy at this point, you can do it locally using an api client right this moment.
Besides all that, you will be shocked at how quickly a new service can reconstruct your preferences. I started a new YouTube account, and it was basically the same feed within a few days.
In any case, my feeling is that we should have learned at this point not to keep our data in someone else’s walled garden.
> Besides all that, you will be shocked at how quickly a new service can reconstruct your preferences. I started a new YouTube account, and it was basically the same feed within a few days.
Because your location data, wifi name, etc. home in on the fact that this is the same person as before. You are actually supporting my point rather than denying it.
You can have Claude write all these out to a file.
Then you can feed them into another service.
> As such it’s unclear what the prize at the end of the present race to the bottom is.
is it ever clear? pretty much everything seems to be a senseless race to the bottom.
This is the new web hosting. All the valuations are absurd
Doesn't "web hosting" print money for Amazon?
just having strict control over context management in a session is a nice differentiator. shared tooling between desktop and CLI is nice too. they've differentiated enough.
> OpenAI, meanwhile, has been attempting to quell the backlash against its deal with the U.S. government, putting out a blog post claiming that “our tools will not be used to conduct domestic surveillance of U.S. persons,”
As a non-US person, that sounds far more concerning than no statement at all. Because if their tools weren't used for surveillance against Europeans they would have said so as a marketing message...
With n-eyes agreements it’s quite meaningless anyway. Whatever passport you have, somebody spies on you and sells the information to your government.
It's also meaningless because we know governments get around these "agreements" by buying data from third party companies that bought the data from OpenAI. The only way to stop this is to legislate it out of existence.
Yep. “You spy on mine, and I’ll spy on yours, and then we’ll share info.”
I wouldn't give them any free pass and just give up; it's highly amoral and inhuman behavior. A modern form of racism, but based on passport.
You have this one? You are subhuman, treated as such, and you have very limited rights on our soil; we can do nasty things to you without any court, defense, or hope for fairness. You have that one? Then please, welcome back.
Sociopathic behavior. Then don't wonder why most of the world is again starting to hate the US with a passion. I don't mean the countries where you already killed hundreds of thousands of civilians, I mean the whole world. There isn't a single country out there currently even OK with the US; that's more than 95% of mankind. Why the fuck do you guys allow this? It's not even the current government, rather a long-term US tradition going back at least to 9/11.
> "We have these two red lines... Not allowing Anthropic's AI to perform mass surveillance of Americans, and prohibiting its AI from powering fully-autonomous weapons..."
Anthropic literally said the same, but seem to be getting positive PR.
https://www.cbsnews.com/news/ai-executive-dario-amodei-on-th...
It's not "literally the same".
https://www.lesswrong.com/posts/FSGfzDLFdFtRDADF4/openai-s-s...
The difference is that Anthropic actually dotted the i's and crossed the t's whereas OpenAI fell for the weaselwords and is now desperately trying to renegotiate.
OpenAI didn't fall for anything, they knew exactly what they were signing and went ahead anyway, then started gaslighting people about what they had signed.
For a lot of people (me included) the lack of integrity and the gaslighting is what has soured them on OpenAI, rather than them signing up to build surveillance and weaponry.
To non-US citizens, all AI companies are as dangerous as each other, OpenAI just really botched the optics here.
It's amazing how bad FANG executives are at even knowing what a normal moral thought would be for average people ...
Plus, you know, you'd think they'd ask their cleaner or baker or something. Or hire someone.
Executives are certainly capable of understanding moral/ethical concerns.
Around 2005, a Yale Psychology PhD candidate asked me to write a web-based survey instrument with various questions, some on complex but straightforward business questions (the controls) and others with moral/ethical aspects. Senior executives participated, and they answered similarly to rank & file, often completing the entire survey much faster. What they didn't know -- we were tracking how long they spent on each question. Questions with moral/ethical concerns took senior executives relatively longer than the rank & file.
Late Addendum: Sorry that I don't recall the author/paper. The survey population spanned multiple industries representing many Fortune 500s, including huge tech companies. The survey was the same for everyone. The questions were story problems from business and law school case reports. The participating companies were anonymized on our end. We provided HR departments with survey link; only subject rank (not identity) was collected. Survey was voluntary, with informed consent according to IRB approval.
You would also need to control for the degree to which people had a stake in the outcome (i.e., virtue signalling).
Since executives have to make decisions where choosing the moral option may impose an economic (or operational) cost, this requires thinking through the actual choice.
Morality for the "rank and file" is just a signalling issue: there's nothing to think through; the answer they are "supposed to choose" is the one they pick, at no cost to them.
"Rank and file" employees choosing to prioritize morality very, very frequently pay real costs for doing so - with a much larger personal impact than executives feel.
Only in very rare circumstances where the obvious answer and their procedural work don't align.
When making an operational decision that affects the direction of the business, morality is almost always a concern -- even at the level of "do our customers benefit from this vs., do we?" etc.
Where do you get the idea that those circumstances are "very rare"? Workers are being asked to break rules and do unethical things all the time, and you're pretty much guaranteed to pay a personal cost if you refuse.
Meanwhile, morality is almost always one of the least important factors when making operational decisions.
I hope the addendum helps clarify.
This study showed executives spent relatively more time on questions with moral/ethical concerns. Perhaps the control questions were more similar to daily work and hence familiar, while there were fewer encounters with questions having moral/ethical concerns. Perhaps executives decided more care was required for these questions to ensure people were not hurt.
Getting back to the grandparent post, executives are certainly aware of situations with moral/ethical concerns and need not consult their barber to answer them.
It helps a lot that Claude is just better. Codex isn't BAD, and in some narrow technical ways might even be more capable, but I find Claude to be hands-down the best collaborator of all the AI models and it has never been close.
Interesting to hear! I've had the completely opposite experience, with Claude having 5 minutes of peerless lucidity, followed by panicking, existential crisis, attempts to sabotage its own tests and code, and psyops targeted at making the user doubt their computer, OS, memory... Plus it prompts every 15 seconds, with the alternative being YOLO.
Meanwhile codex is ... boring. It keeps chugging on, asking for "please proceed" once in a while. No drama. Which is in complete contrast with ChatGPT the chatbot, which is completely unusable, arrogant, unhelpful, and confrontational. How they made both from the same loaf I dunno.
I wish I could get Claude to stop every 15 seconds. There's a persistent bug in the state machine that causes it to miss esc/stop/ctrl-f and continue spending tokens when there's a long running background task or subagent. There's a lot of wasted tokens when it runs for 10, 15, 20 minutes and I can't stop it from running down the wrong rabbit hole.
> psyops targeted at making user doubt their computer
IDEK what that means, specific examples?
The following is a dramatic reenactment of an observed behaviour /discl.
You are making tool X. It currently processes test dataset in 15 seconds. You ask claude code to implement some change. It modifies code, compiles, runs the test - the tool sits in a 100% CPU busyloop. Possible reactions on being told there is a busy loop:
"the program is processing large amount of data. This is normal operation. I will wait until it finishes. [sets wait timeout in 30 minutes]."
"this is certainly the result of using zig toolchain, musl libc malloc has known performance issues. Let me instead continue working on the plan."
"[checks env] There are performance issues when running in a virtual machine. This is a known problem."
"[kills the program]. Let me check if the issue existed previously. [git stash/checkout/build/run/stash pop]. Previous version did not have the issue. Maybe user has changed something in the code."
Bonus episode: since claude code's "search" gadget is buggy, the LLM often gets empty search results.
"The changes are gone! Maybe user delete the code? Let me restore last commited version [git checkout]. The function is still missing! Must be an issue with the system git. Let me read the repository directly."
(Unrelated, but I'm really curious.) The above comment got downvoted within a few seconds of me pressing "reply". Is there some advanced Hacker News reader software that allows such an immediate reaction (via some in-notification controls)? Or is that a built-in site reaction? Or a sign of a bot? Because the speed was uncanny.
That may actually have been me accidentally downvoting as I was scrolling through with my clumsy thumb.
Double whammy, I guess, because I also always downvote comments asking (or complaining) about a parent comment getting downvoted.
There's this bug in Claude Desktop where a response will disappear on you. When you're busy doing many things at once, you'll go back to the chat, and you'll be all "wait, didn't I already do this?" It's maddening and makes you question your own sanity.
I switched from ChatGPT Plus to Gemini Pro instead of Claude, since I'm a hobbyist and appreciate having more than just text chat and coding assist with my subscription (image gen, video gen, etc are all nice to have).
At first I found Gemini Code Assist to be absolutely terrible, bordering on unusable. It would mess up parameter order for function calls in simple 200-line Python. But then I found out about the "model router", a layer on top that dynamically routes requests between the flash and pro models. Disabling it and always using the pro model did wonders for my results.
There are however some pretty aggressive rate limits that reset every 24 hours. For me it's okay though. As a hobbyist I only use it about 2-3 hours per day at most anyway.
With Claude you just tell it to set up whatever it needs and you have a smooth access to everything. Mine uses Nanobanana for image generation, Sora for video, Gemini for supplementary image processing and so on. Setting up each one was 5-10 min of Claude’s work
With Gemini Pro on Antigravity you get a quota reset every 5 hours and access to Claude Opus 4.6. That's what I use at home and don't need anything else.
Didn't they tighten that quota WAY down though since everyone caught on to the AG/Opus game?
Did you leave OpenAI because of the current backlash? If so, is Google even better?
I've generally thought that, but lately I've been finding that the main difference is Claude wants a lot more attention than codex (I only use the CLI for either). codex isn't great at guessing what you want, but once you get used to its conversation style it's pretty good at just quietly finishing things, and context management seems to handle itself very well; I rarely even think about it in codex. To me they're just... different. Claude is a little easier to communicate with.
codex often speaks in very dense technical terms that I'm not familiar with and tends to use acronyms I've not encountered so there's a learning curve. It also often thinks I'm providing feedback when I'm just trying to understand what it just said. But it does give nice explanations once it understands that I'm just confused.
Can you expand on that? I've been wanting to try Claude for a while, but their payment processing wouldn't take any of my credit cards (they work everywhere else, so it's not the cards). I've heard I can work around this by installing their mobile app or something, but it was an extra hurdle, so I didn't try very hard.
And I've been absolutely amazed with Codex. I started using it with ChatGPT 5.3-Codex, and it was so much better than online ChatGPT 5.2, even sticking to single-page apps, which both can do. I don't have any way to measure the "smarts" of the new 5.4, but it seems similar.
Anyways, I'll try to get Claude running if it's better in some significant way. I'm happy enough with the Codex GUI on macOS, but that's just one of several things that could be different between them.
Codex is not bad, I think it is still useful. But I find that it takes things far too literally, and is generally less collaborative. It is a bit like working with a robot that makes no effort to understand why a user is asking for something.
Claude, IMO, is much better at empathizing with me as a user: It asks better questions, tries harder to understand WHY I'm trying to do something, and is more likely to tell me if there's a better way.
Both have plenty of flaws. Codex might be better if you want to set it loose on a well-defined problem and let it churn overnight. But if you want a back-and-forth collaboration, I find Claude far better.
> I've been wanting to try Claude for a while, but their payment processing wouldn't take any of my credit cards (they work everywhere else, so it's not the cards). I've heard I can work around this by installing their mobile app or something, but it was extra hurdles, so I didn't try very hard.
Not Claude Code specifically, but you can try the Claude Opus and Sonnet 4.6 models for free using Google Antigravity.
I’ve been juggling between ChatGPT, Claude and Gemini for the last couple of years, but ChatGPT has always been my main driver.
Recently did the full transition to Claude. The model is great, but what I really love is how they seem to have landed on a clear path for their GUI/ecosystem. The cowork feature fits my workflows really well, and connecting enterprise apps, skills, and plugins works smoothly.
Haven’t been this excited about AI since GPT 4o launched.
yeah, OpenAI has its strengths but code generation is not one of them.
Still 8x less downtime than GitHub
https://mrshu.github.io/github-statuses/
There’s a surge of demand for sure, but I’m not at all convinced that it’s at OpenAI’s expense. My bet is the non-SWE folks caught wind that these things got seriously good at a lot of boring office work, i.e. we’re seeing diffusion of AI into the wider economy.
Many people I know initially used ChatGPT for a while. Then after a while they went to Gemini. Again stuck with it for a while. And now they are dabbling with Claude.
Yep, there really is no switching cost, it seems.
People generally want something from a model and then leave. I think people are subconsciously forming relationships with tech firms such that they do not care about them; it's all about what the users themselves get. Generally there is no attachment. There are some examples of psychotic stuff, but that's thankfully the exception, not the norm.
That's why Apple cares deeply about its brand - it doesn't want to fall into that group of firms.
I've largely found codex and claude code to be about the same; however, codex tends to "think" harder and for longer, which, depending on the task, yields better results without too much steering.
On an unrelated note, UI is such a personal preference that it's impossible, beyond core pillars that have been studied for decades, to say one is better than the other. That being said, I like OpenAI's design system much better than Anthropic's. OpenAI products (CLI and chat UI) "feel" nice and consumer focused, whereas Anthropic's products feel utilitarian and "designed for business".
I wonder if this is actually good for Anthropic. 2.5 million new customers sounds like good news for them, except these are mostly not paying customers. It seemed like they were positioning themselves to make money by selling coding agents with a subscription fee. If that free tier mostly exists to advertise their paid tiers, then this would be kind of a drag.
It's like reddit when Digg v4 happened
We are in this fascinating stage where tokens are nominally entirely fungible at a roughly equivalent intelligence level, yet at the same time there is huge market segmentation and differentiation in the non-tangible aspects of those tokens.
It's a fairly ridiculous conclusion to draw that these people are leaving ChatGPT because of their stance. I doubt OpenAI's actions play much role in the influx at all.
A couple of weeks ago, to huge numbers of people, ChatGPT was AI. The biggest public-perception shift to come out of the DoD/DoW spat will be how many people now know that Claude exists at all; the sense that they are being unreasonably punished by the government for taking a principled stance will only benefit them.
People have been made aware of a product, and made aware that it's good enough that the government wants to use it. They have then been shown an archetypal underdog-versus-the-government narrative. That makes almost a perfect storm for gaining customers.
When they actually use the thing and discover that it really is good, they will stay, and they will tell their friends.
At this rate they should be sending Hegseth a thank you card.
I guess you can call it «struggles», but this is that kind of struggle which brings a smile to your face :)
Suffering from success in a good way. ChatGPT truly lost all of its edge.
Well, the C-suite lied to everyone and was dealing in bad faith. When that came to light, they immediately lost the support and interest of highly skilled researchers in the area; from that point onward, their only additional offerings would be whatever tech evangelists can rustle up: nice UI, some cool features, etc. But really groundbreaking stuff that takes clever engineering or the kind of thinking that cannot be taught/approximated? Gone. So OpenAI, despite its massive head start, will just continue to fall behind. When you're smart enough, money stops mattering beyond keeping a roof over your head and food in your mouth. At that point your world view and personal beliefs become far more valuable, and smart people always come to the conclusion that violence is never worth it, be it physical, informational, social, emotional, whatever. OpenAI is an incredibly violent company, so it inherently scares off talent.
Codex has been feeling a bit faster recently, not sure if placebo.
They claim it’s faster and it seems to be so for me.
No one left ChatGPT over that deal: they decided to try Anthropic's Claude because the Department of War gave them free marketing.
I was paying both $200+/mo and I went down to only paying Anthropic $200/mo.
My experience has, for a few months, been that OpenAI's models are consistently quite noticeably better for me, and so my Codex CLI usage had been probably 5x as much as my Claude Code usage. So it's a major bummer to have cancelled, but I don't have it in me to keep giving them money.
I'd love to get off Anthropic too; despite the admirable stance they took, the whole deal made me extra uncomfortable that they were ever a defense contractor (war contractor?) to begin with.
I left the OpenAI platform long before this because I expected moves like this. A few people called me alarmist, but they are now also jumping ship over it. OpenAI has zero moral or ethical substance, and people _do_ care about that. I'm extreme enough that joining OpenAI after a specific date works against you and your CV, not for you, while leaving at a specific date speaks volumes in your favour. People are the sum of their actions, not their words, and siding with / continuing to use OpenAI speaks volumes about who you are.
The DoW or the CEO of Anthropic and his telenovela?
Is there any news about how Gemini fares in this debate? I suppose they're fine with total mass surveillance ("we already do that anyway") and creating kill bots but is there any official stance? I find it hard to believe Alphabet would not make US government contracts.
Didn’t Anthropic hire the infrastructure head from Stripe and give him a CTO title? I would’ve thought that would help bring stability, but if anything, things have become worse.
It's funny how the false choice of American politics (Red vs Blue) also makes it into its consumerist corporatist life. That Anthropic's threadbare "limits" on government usage are seen as a heroic stand is a testament to just how far the goalposts on "ethical" deployment of AI have moved to the (fascist) right. As ever, politics precedes technology. We have Reagan's internet, we will have Trump's AI. God help us.
> We have Reagan's internet
I've been on the internet since 93 or 94 and I've never once heard it called that. If anything, "Al Gore".
From the prior thread, are there even "limits"? I thought the Anthropic statements were pointed out to be mostly toothless PR, e.g. "we don't agree", etc.
Not just PR. I mean... they got fired over those limits.
I'm not sure what message this comment is trying to convey beyond throwing in the "corporatist" and "consumerist" signalling buzzwords, followed by calling the right fascists.
I've literally never heard anybody call the Internet "Reagan's internet", the best I can do is the Al Gore quote and who's calling anything Trump's AI?
What ideas are you trying to express here?
These are the points I'm making, which I think are fairly one-to-one with my original comment:
- American politics presents a false choice between Democrats and Republicans
- America is both a consumerist and corporatist society
- Anthropic asked for minimal limits on AI usage
- People view Anthropic's stand as heroic, while viewing OpenAI as villainous
- The false choice between Anthropic and OpenAI mirrors the false choice in American politics.
- People at OpenAI, Anthropic, and elsewhere used to view ethical deployment of AI as paramount, but those goalposts have shifted as financial and political incentives changed.
- Specifically, the ethics of AI have become conveniently synonymous with the current financial and political moment.
- The current political moment is fascist.
- Technology is broadly neutral and it is politics that primarily dictates how technology is actually used and deployed, and therefore its broad impacts.
- The internet was developed in the neoliberal era, which began with the election of Ronald Reagan and extended through the Obama presidency.
- The structure and dynamics of the internet over the last 30 years are more reflective of neoliberal politics than of anything inherent in the technology: extreme privatization and the refusal to use public institutions to provision or regulate public goods.
- AI is being developed in a new political era, begun with the first Trump presidency, and taking more full shape under the second Trump presidency.
- We are likely to find that AI's trajectory is similarly dictated largely by politics rather than anything inherent to the technology.
- With this political era being fascist and explicitly neo-imperial/neo-colonial, I fear for the technology's impact on humanity.
- God help us.
I really enjoyed using Claude, but the ever-changing limits and weird policies (limited to Claude Code, you can't run OpenClaw, etc.) made switching a very easy choice.
OpenAI simply provides more value for the money at the moment.
You're totally allowed to use Claude for OpenClaw and you're totally able to use Claude Code with non-Anthropic models. You must be referring to the fact that you have to use an API key and cannot use the auth intended for Claude-only products, which AFAIK is the same at every AI company (with Google destroying whole Google accounts for offenders most recently).
OpenAI and Github Copilot do explicitly allow this, so do many of the new/Chinese providers such as Synthetic, Z.ai, etc.
Anthropic is the outlier here, obviously they can limit their subscriptions as they want but it's a major disadvantage compared to their competitors.
How can I retrieve an API key from ChatGPT to use my subscription in other tools then? This seems like it could be useful.
> OpenAI ... explicitly allow this
Explicit means it's stated in OpenAI docs somewhere, but I can't find it. Link?
https://x.com/thsottiaux/status/2009742187484065881
There's probably a better source somewhere but this is the one I had at hand.
That's weird, I switched away from ChatGPT because I mostly got superior results from Gemini and Claude.
give 5.4 a shot - it's strange but surprisingly good for once. speaking as a daily opus user.
Used codex cli (5.4) for the first time (had never used codex or gpt for coding before - was using Opus 4.5 for everything), and it seems quite good. One thing I like is that it's very focused on tests. Like it will just start setting up unit tests for specs without you asking (whereas Opus would never do that unless you asked) - I like that and think it's generally good. One thing I don't like about GPT, though, is that it pauses too much throughout tasks where both the immediate plan and the broader plan are already extremely well defined in agents.md: it still pauses between tasks saying "the next logical task is X", and I say "yeah, go ahead", instead of it just proceeding to the next task, which I'd rather it do. I suppose that is a preference that should be put in some document? (agents.md?)
well, I have a running model (ha!) in my head of the frontier providers that's roughly like this:
- chatgpt is kinda autistic and must follow procedures no matter what, and writes in some bland, soulless, but kinda correct style. great at research, horrible at creativity, slow at getting things done but at least getting there. good architect, mid builder, horrible designer/writer.
- claude is the sensitive diva that can produce really elegant code but has to be reminded of correctness checks and quality gates repeatedly, so it arrives at something good very fast (sometimes one-shot) but then loses time in correction loops and "those details". great overall balance, but permanent helicoptering needed, or else it derails into weird loops.
- grok is the maker, super fast and on target, but doesn't think as deeply as the others; it's entirely goal/achievement focused and does just enough to get there. uniquely, it doesn't argue or self-monologue constantly about doubts or safety or ethics, but drives forward where others struggle, and faster than them. cannot concentrate for too long, but delivers fast. tons of quick edits? grok it is. "experimental" stuff that is not safe to talk about... definitely grok.
- gemini is whatever you quickly need in your GSuite, plus looking at what others are doing and helping out with a sometimes different perspective, but beyond that, worse than all the others.
- kimi: currently using it on the side; not bad at all so far, but also nothing distinct has crystallized in my head yet.
Tried using 5.4 xhigh/codex yesterday with very narrow direction to write bazel rules for something. This is a pretty boilerplate-y task with specific requirements. All it had to do was produce a normal rule set s.t. one could write declarative statements to use them just like any other language integration. It gave back a dumpster fire, just shoehorning specific imperative build scripts into starlark. Asked opus 4.6 and got a normal, sane ruleset.
5.4 seems terrible at anything that's even somewhat out-of-distribution.
I got it to build a stereoscopic Metal raytracing renderer of a tesseract for the Vision Pro in less than half a day.
It surprisingly went at it progressively, starting with a basic CPU renderer, all the way to a basic special-purpose Metal shader. Now it's cutting its teeth on adding passthrough support. YMMV.
The limits are what did it for me. They kept boasting about Opus performance and improvements, practically begging me to try it out, and when I did, it totally obliterated my usage. I'm sure it's good, but I stick to Sonnet because I've been burned badly. Never had that problem with ChatGPT, but it turns out they're just unprincipled and evil, which is a shame.
I tend to use LLMs more for research than actual coding, so I ended up going with GPT over Claude because its chat interface just seems to work better for me. That balances out Claude being slightly better at software tasks.
Have you considered using Gemini?
Google seems to be on a hot streak with their models, and, since they're playing from behind, I'd expect favorable pricing and terms. But, I don't know anyone who is using or talking about Gemini. All the chatter seems to be Anthropic vs. OpenAI.
because gemini, despite what the stats say, still produces garbage once the problem gets harder. it nails lab conditions, but on messy reality or creativity or even code quality it's a far cry from opus or the latest gpt5.4, and always has been. it's pretty good inside GSuite because of the integrations, but standalone it's near worthless compared to even grok-code-fast, which doesn't think much at all (but damn it is fast). at this point google keeps throwing noodle pots of AI against every wall in reach, which is more a kind of desperation that still works to inflate wall street high scores than a hot streak or a breakthrough: just rapid-fire shotgun launches to see if anything sticks. no one serious talks about gemini because it's still not even worth considering for real things outside shiny presentations and artificial benchmarks.
Gemini schools the other two when doing code reviews.
I used to think tokens are a commodity, but it's becoming clear that the jagged frontier is different enough, even for the easiest use case of SWE, that there's room for two if not three providers of different foundational models. It isn't winner-takes-all; they're all winning together. Cursor isn't properly taking advantage of the situation yet.
My experience exactly. The more "real" the problems become, the more other models become unsuitable compared to claude, with the sole exceptions being deepseek/kimi, which, while strictly speaking not better w.r.t. metrics and basic tasks, are more interesting and handle odd, totally out-of-domain stuff better than the US models. An example: code I wrote for a hypercomplex sedenion-based artificial neural network broke claude so badly it started saying it is chatgpt and can't evaluate/run code. Similar experience with all US models, which are characterized by being extremely brittle at the fringes, though claude least among them. Meanwhile, chinese models are less capable for cookie-cutter stuff but keep swinging when things get really weird and unusual. It's like US models optimize for the lowest minima achievable, and god help you if the distribution changes. Chinese models, on the other hand, seem to optimize for the flattest minima, giving poorer quality across the board but far more robust behaviour.
I've tried. It's just not very good compared to either mentioned alternative.
I can't even use 3.1 with Gemini CLI, not sure why.
What a baffling comment. Aren’t you aware of why this exodus is happening? (It’s not related to “value for the money”!) What are your feelings on that part?
It is entirely okay to weigh the Department of War thing against other criteria when choosing a service.
Agreed, but the comment should mention it. Nobody is talking about value for money right now.
I didn't mean to advocate for Anthropic, apologies.
Whatever Anthropic might or might not do with the Department of War interests me in proportion to how much I can influence it. Rounded, speaking as a European citizen, that appears to be exactly 0.
ever tried living while only patronizing groups that strictly align morally and ethically with your own personal beliefs?
I would love to, but a practical look at that concept shows it's practically impossible.
My $0.02: Claude was already involved in underhanded shit I don't want a part of[0], and that generated little ethical response from Anthropic. I've had better luck as a $200/mo-tier customer with ChatGPT, and I don't really think that Dario claiming that their newest LLM is conscious[1] on a market schedule is all that ethical, either.
[0]: https://en.wikipedia.org/wiki/Project_Maven [1]: https://tech.yahoo.com/ai/claude/articles/anthropic-ceo-admi...
Why paint the choice as black and white? Most people are doing the best they can morally, even if they don't get it 100% right. Even living 60% in accordance with your values is better than 50%. Likewise, bucketing organizations as good or bad misses the same nuance. Choosing something that is slightly better has positive consequences despite it not being 100% good.
not the poster, but I guess that's kinda american thinking that actually believes voting with your wallet will make any difference in this late-stage crony capitalism in a post-facts world.
realistically: AI WILL get used in the military and for killing autonomously, like it or not, believe it or not. I am also against that in principle, but I accept the fact that my opinion just doesn't matter, and practice radical acceptance of reality as-is. twitter/X is also alive and kicking, despite musk and anti-musk hate. xAI/Grok is genuinely really good too compared to OAI/Claude, a bit different but very good. At this point all the "outcries" feel like noise I just skip on principle. But it could turn up the fire under the OAI team to go aggressive feature/pricing-wise in order to retain/increase their userbase again, which is ... good, after all.
If anyone thinks Anthropic or OpenAI are the "good guys," they've already lost the plot. If you look at additional reporting on the topic, not just the Anthropic PR spin, the disagreements were much more nuanced than Anthropic portrayed them, and they aren't exactly a reliable narrator on the topic either. In fact, it actually seems like Amodei fumbled the deal and crashed out a bit. He's already walked back his internal memo and is reportedly still seeking a deal with the Pentagon. I don't trust either CEO; I use their products, but if you're even leaning 51-49 on who is "less evil," I think you're giving too much slack.
Then the people mad about "mass surveillance" recommend Gemini or whatever.
They're just keeping up with the outrage news cycles.