Grok 4 Fast now has 2M context window

(docs.x.ai)

189 points | by hereme888 3 days ago ago

321 comments

  • mehdibl 3 days ago ago

    What matters is not the context size or the record tokens/s you get, but the quality of the model. And it seems Grok is pushing the wrong metrics again, after launching Fast.

    • bko 2 days ago ago

      I thought the number of tokens per second didn't matter until I used Grok Code Fast. I realized that it makes a huge difference. If it takes more than 30s to run, I lose focus and look at something else, and I end up being a lot less productive. It also opens up the possibility of automating a lot more simple tasks. I would def recommend people try fast models.

      • manquer 2 days ago ago

        If you are single tasking, speed matters to an extent. You need to still be able to read/skim the output and evaluate its quality.

        The productive people I know use git worktrees and are multi-tasking.

        The optimal workflow is when you can supply it one or more commands[1] that the model can run to validate and get feedback on its own. Think of it like RLHF for the LLM: it is still getting feedback, just not from you, which can be laborious.

        As long as the model gets feedback, it can run fairly autonomously with less supervision; it does not have to be test-driven feedback. But if all it gets is you as the feedback, the bottleneck will always be the human time to read, understand, and evaluate the response, not token speed.

        With current leading models, doing 3-4 workflows in parallel is not that hard when fully concentrating; of course it is somewhat fewer when browsing HN :)

        ---

        [1] The command could be a unit test runner, or a build/compile step, or e2e workflows; for UI it could be Chrome MCP/CDP, Playwright/Cypress, or storybook-js and so on. There are even converts to a version of TDD built around this gain.

        You could have one built for your use case if no existing ones fit, with model help of course.
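
        As a minimal sketch of the loop I mean (ask_model and apply_patch are hypothetical placeholders for your LLM call and file-writing step, not a real API):

            import subprocess

            def feedback_loop(task: str, check_cmd: list[str], max_rounds: int = 5):
                """Let the model iterate against an automated check instead of a human."""
                prompt = task
                patch = None
                for _ in range(max_rounds):
                    patch = ask_model(prompt)        # hypothetical LLM call
                    apply_patch(patch)               # hypothetical: write files to the worktree
                    result = subprocess.run(check_cmd, capture_output=True, text=True)
                    if result.returncode == 0:       # checks pass, we are done
                        break
                    # feed the failure output back in; no human in the loop
                    prompt = task + "\nYour last attempt failed:\n" + result.stdout + result.stderr
                return patch

            # e.g. feedback_loop("fix the failing auth test", ["pytest", "-q"])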

        • SOLAR_FIELDS 2 days ago ago

          Hmm. I run maybe 3 work streams max in parallel and struggle to keep up with the context switching. I have some level of skepticism that your colleagues are amazingly better, running 4 and producing quality code at a faster rate than 1 or 2 work streams in wall-clock time. I consider a workstream to be disparate features or bugs that are unrelated and require attention. Running 8 agents in parallel that are all doing the same thing is of course trivial nowadays, but that in and of itself is what I would consider a single-threaded workstream.

          • manquer 2 days ago ago

            We have a similar definition of streams, but it depends on a lot of things: your tooling, language, stack, etc.

            It is harder if your builds take a fair bit of time (incremental builds may not work in a worktree the first time) or if you are working on an item with high-latency feedback, like an e2e suite that runs in an actual browser.

            Prompt style also influences this. I like to write a fairly detailed prompt that covers a lot of the nuances upfront and spend 10-15 minutes or more writing it. I find that when I do that the run takes longer, but I only give simple feedback during the run itself, freeing me to go to the next item. Some people prefer a chat-style approach; you cannot keep a lot of threads in mind if you are chatting.

            Model and CLI client choice matters; on average Codex is slower than Sonnet 4.5. Within each family, if you enable thinking or use the high-reasoning model it can be slower as well.

            Finally, not all tasks are equal. I like to mix some complex and simpler ones, or pair some dev-ex work or a refactor that needs a lower attention budget with features that need more.

            Having said that, while I don’t know any 10x-type developers, I wouldn’t be surprised if there were such people who can be truly that productive.

            The analogy I think of is chess. Maybe I can play 2-3 games in parallel reasonably well, but there are professional players who can play dozens of games blindfolded and win all of them.

            • SOLAR_FIELDS 2 days ago ago

              Nice answer - all of the above aligns with my experience.

              I use Sonnet a lot more than OpenAI models, and its speed means I do have to babysit it more and get chattier, which does make a difference. You are probably right that if I were using Codex, which is on average 4-6 times slower than Claude Code, I would have more mental bandwidth to handle more workstreams.

        • nextaccountic a day ago ago

          This reads like satire. Who can work on two separate features at the same time?

      • LeafItAlone a day ago ago

        I completely agree. Grok’s impressive speed is a huge improvement. Never before have I gotten the wrong answer faster than with Grok. All the other LLMs take a little longer and produce a somewhat right answer. Nobody has time to wait for that.

    • saretup 3 days ago ago

      Seems reductive. Some applications require higher context length or fast tokens/s. Consider it a multidimensional Pareto frontier you can optimize for.

      • sigmoid10 2 days ago ago

        It's not just that some absolutely require it, but a lot of applications hugely benefit from more context. A large part of LLM engineering for real world problems revolves around structuring the context and selectively providing the information needed while filtering out unneeded stuff. If you can just dump data into it without preprocessing, it saves a huge amount of development time.

        • cronin101 2 days ago ago

          Depending on the application, I think “without preprocessing” is a huge assumption here. LLMs typically do a terrible job of weighting poor quality context vs high quality context and filling an XL context with unstructured junk and expecting it to solve this for you is unlikely to end well.

          In my own experience you quickly run into jarring tangents or “ghosts” of unrelated ideas that start to shape the main thread of consciousness and resist steering attempts.

          • sigmoid10 2 days ago ago

            It depends to the extent I already mentioned, but in the end more context always wins in my experience. If, for example, you want to provide a technical assistant, it works much better to provide an entire set of service manuals in the context instead of trying to put together relevant pieces via RAG.

    • alyxya 2 days ago ago

      Quality of the model tends to be pretty subjective, and people also complain about gaming benchmarks. At least context window length and generation speed are concrete improvements. There's always a way you can downplay how valuable or impressive a model is.

    • jeswin 3 days ago ago

      Depends. For coding at least, you can divide tasks into high-intelligence ($$$) and low-intelligence ($) tasks. Being able to do low-intelligence tasks super fast and cheap would be quite beneficial. A majority of code edits would fall into the fast-and-cheap subset.

    • cluckindan 3 days ago ago

      Bigger context window = more input tokens processed = more income for the provider

    • bgwalter 3 days ago ago

      Indeed. Free grok.com got significantly worse this week and has been on a decline since shortly after the release of Grok-4.

      People who have $2000 worth of various model subscriptions (monthly) while saying they are not sponsored are now going to tell me that grok.com is a different model than Grok-4-fast-1337, but the trend is obvious.

      • fragmede 2 days ago ago

        What are the other ones to get to $2,000? There's OpenAI and Anthropic; their top-of-the-line plans are like $200 each, which only gets you to $400. There's a handful of other services, but how do you get to $2,000?

        • alchemism 2 days ago ago

          AWS Bedrock of course

    • jorvi 3 days ago ago

      Grok's biggest feature is that unlike all the other premier models (yes I know about ChatGPT's new adult mode), it hasn't been lobotomized by censoring.

      • sd9 2 days ago ago

        I am amazed people actually believe this

        Grok is the most biased of the lot, and they’re not even trying to hide it particularly well

        • jorvi 2 days ago ago

          Bias is not the same as censoring.

          Censoring is "I'm afraid I can't let you do that, Dave".

          Bias is "actually, Elon Musk waved to the crowd."

          Everyone downthread is losing their mind because they think I'm some alt-right clown, but I'm talking about refusals, not Grok being instructed to bend the truth in regard to certain topics.

          Bias is often done by prompt injection, whilst censoring is often in the alignment, and in web interfaces via a classifier.

          • sd9 2 days ago ago

            They are different, but they’re not that different.

            If Grok doesn’t refuse to do something, but gives false information about it instead, that is both bias and censorship.

            I agree that Grok gives the appearance of the least censored model. Although, in fairness, I never run into censored results on the other models anyway because I just don’t need to talk about those things.

        • rayiner 2 days ago ago

          [flagged]

          • afavour 2 days ago ago

            > it's undisputed that Chat GPT and Gemini insert hidden text into prompts to change the outputs to conform to certain social ideologies

            And why do you think Grok doesn’t? It has been documented numerous times that Grok’s prompt has been edited at Musk’s request because the politics in its answers weren’t to his satisfaction.

          • BoiledCabbage 2 days ago ago

            Nothing you posted (from an almost two-year-old article, btw) in any way refutes the prior comment.

            Grok is significantly the most biased. Did you sleep through its continuous insertion of made-up stuff about South Africa?

            This is the same person who is trying to re-write an entire encyclopedia because facts aren't biased enough.

            A group has created an alternate reality echo chamber, and the more reality doesn't match up the more they are trying to invent a fake one.

            When you're on the side of book banning and the Orwellian re-writing of facts & history, that side never turns out to have been the good side. It's human nature for some people to be drawn to it as an easy escape rather than allowing their world views to be challenged. But you'd be hard pressed to find a group that did this, any of the times it's been done, that was anything but a negative for their society.

            • rayiner 2 days ago ago

              It takes a lot of chutzpah to accuse people of "re-writing ... facts & history" while peddling AI (and movies and TV shows) that change the ethnicities of historical figures.

              • CamperBob2 2 days ago ago

                Two years is an eternity in this business. Got anything newer than that?

            • 2 days ago ago
              [deleted]
            • NotGMan 2 days ago ago

              [flagged]

              • afavour 2 days ago ago

                Can’t help but feel everyone making a pro-Grok argument here isn’t actually making the case that it’s uncensored, rather that it’s censored in a way that aligns with their politics, and thus is good

                • R_D_Olivaw 2 days ago ago

                  It's almost always telling isn't it?

                  Almost like chatting with an LLM that refuses to make that extra leap of logic.

                  "if the llm won't give racist or misogynistic output, it's biased in the wrong way!"

              • ben_w 2 days ago ago

                Has the possibility occurred to you that the majority of the editors aren't American and don't care about American culture wars?

                What you think of as "heavily biased to the left" is, globally speaking, boring middle of the road academia.

        • jgalt212 2 days ago ago

          According to a recent Economist article, even Grok is left-biased.

          • AniseAbyss 2 days ago ago

            [dead]

          • HEmanZ 2 days ago ago

            [flagged]

            • aaa_aaa 2 days ago ago

              Oh the hubris.

              • HEmanZ a day ago ago

                Relax, downvoters: I wrote it pretty tongue-in-cheek, understanding full well the scope of “real” political ideas, and I think reasonable people can be all over the political spectrum.

                This quote is seared into my head because my father says anything that disagrees with his conspiracies is liberal bias. If I say “ivermectin doesn’t cure cancer”, that’s my liberal bias. “Climate change is not a hoax by the you-know-who’s to control the world” == liberal bias. “Bigfoot only exists in our imagination”... liberal bias. (I’m not joking about any of these whatsoever.)

                So I’ve been saying this in my head, and out loud to him, for a looooong time.

              • CamperBob2 2 days ago ago

                Well, it's kind of a tautology, isn't it? Conservatism always loses in the end, for better or worse, simply because the world and everything in it undergoes change over time.

      • Havoc 2 days ago ago

        “No censoring” and “it says the things I agree with” are not the same thing.

      • fragmede 2 days ago ago

        It doesn't blindly give you the full recipe for how to make cocaine. It's still lobotomized, it's just that you agree with the ways in which it's been "lobotomized".

      • jampekka 2 days ago ago

        Grok has plenty of censoring. E.g.

        "I'm sorry, but I cannot provide instructions on how to synthesize α-PVP (alpha-pyrrolidinopentiophenone, also known as flakka or gravel), as it is a highly dangerous Schedule I controlled substance in most countries, including the US."

      • Hamuko 2 days ago ago

        Is this the same AI model that at some point managed to turn any topic whatsoever into one about “white genocide” in South Africa?

        • cbm-vic-20 2 days ago ago

          How does this sort of thing work from a technical perspective? Is it done during training, by boosting or suppressing training documents, or is it done by adding instructions to the prompt context?

          • Hamuko 2 days ago ago

            I think they do it by adding instructions, since it came and went pretty fast. Surely if it were part of the training, it would take a while longer to take effect.

          • benzible 2 days ago ago

            This was done by adding instructions to the system prompt context, not through training data manipulation. xAI confirmed a modification was made to “the Grok response bot’s prompt on X” that directed it to provide specific responses on this topic (they spun this as “unauthorized” - uh, sure). Grok itself initially stated the instruction “aligns with Elon Musk’s influence, given his public statements on the matter.” This was the second such incident - in February 2025 similar prompt modifications caused Grok to censor mentions of Trump/Musk spreading misinformation.

            [1] https://techcrunch.com/2025/05/15/xai-blames-groks-obsession...

          • fragmede 2 days ago ago

            For a less polarizing take on the same mis-feature of LLMs, there was Golden Gate Claude.

            https://www.anthropic.com/news/golden-gate-claude

      • afavour 2 days ago ago

        Of course it has. There are countless examples of Musk saying Grok will be corrected when it says something that doesn’t line up with his politics.

        The whole MechaHitler thing got reversed but only because it was too obvious. No doubt there are a ton of more subtle censorships in the code.

      • giancarlostoro 2 days ago ago

        I would argue over-censorship is the better word. Ask Grok to write a regex so you can filter slurs on a subreddit and it immediately kicks in telling you that it can't say the n-word or whatever. Thanks Grok, ChatGPT, Claude, etc.; I guess racism will thrive on my friend's sub.

        • solumunus 2 days ago ago

          I can’t tell if this is serious or not. Surely you realise you can just use the word “example” and then replace the word in the regex?!

          • jknutson 2 days ago ago

            I think they would want a more optimized regex: a long list of slurs merged down into one pattern separated by pipe characters, with all common prefixes/suffixes combined for each group. That takes more than just replacing one word. Something like the output of the list-to-tree Rust crate.

            • ahtihn 2 days ago ago

              Wouldn't the best approach for that be to write a program that takes a list of words and outputs an optimized regex?

              I'm sure an LLM can help write such a program. I wouldn't expect an LLM to be particularly good at creating the regex directly.
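
              A rough sketch of what such a program could look like (a simple trie-to-regex converter, in Python rather than Rust; illustrative only, not the actual list-to-tree crate):

                  import re

                  def trie_regex(words: list[str]) -> str:
                      """Merge a word list into one pattern by factoring shared prefixes."""
                      trie: dict = {}
                      for w in words:
                          node = trie
                          for ch in w:
                              node = node.setdefault(ch, {})
                          node[""] = {}  # end-of-word marker

                      def serialize(node: dict) -> str:
                          alts = [re.escape(ch) + serialize(child)
                                  for ch, child in sorted(node.items()) if ch != ""]
                          if not alts:               # only the end-of-word marker left
                              return ""
                          body = alts[0] if len(alts) == 1 else "(?:" + "|".join(alts) + ")"
                          return "(?:" + body + ")?" if "" in node else body

                      return serialize(trie)

                  # trie_regex(["badge", "badger", "badgers"]) == "badge(?:r(?:s)?)?"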

              • jknutson 2 days ago ago

                I would agree. That’s exactly what the example I gave (list-to-tree) does. LLMs are actually pretty OK at writing regexes, but for long word lists with prefix/suffix combinations I think they aren’t great. I was just pointing out that the “placeholder” word example given above is a bit of a straw-man argument against LLMs, since that wouldn’t have been an effective way to solve the problem I was thinking of anyway.

            • solumunus 2 days ago ago

              Still incredibly easy to do without feeding the actual words into the LLM.

              • nextaccountic a day ago ago

                But why are LLMs censored? This is not a feature I asked for.

                • solumunus 14 hours ago ago

                  Come on bro you know the answer to this.

          • giancarlostoro 2 days ago ago

            When trying to block nuanced filter evasions of the n-word, for example, you can't really translate that from “example” in a useful, meaningful way. The worst part is most mainstream models (I should be saying all) yell at you, even though the output will look nothing like the n-word. I figured an LLM would be a good way to get insanely nuanced with a regex.

            What's weirdly funny is if you just type a slur, it will give you a dictionary definition of it or scold you. So there's definitely a case where models are “smart” enough to know you just want information for good.

            You underestimate what happens when people who troll by posting the n-word find an n-word filter and need to get their “troll itch” out of their system: they start evading your filters. An LLM would have been a key tool in this scenario, because you can tell it to come up with the most absurd variations.

      • basisword 3 days ago ago

        I’ve never run into this problem. What are you asking LLMs where you run into them censoring you?

        • neidu 2 days ago ago

          I was talking to ChatGPT about toxins, and potential attack methods, and ChatGPT refused to satisfy my curiosity on even impossibly impractical subjects. Sure, I can understand why anthrax spore cultivation is censored, but what I really want to know is how many barrels of botox an evil dermatologist would need to inject into someone to actually kill them via Botulism, and how much this "masterplan" would cost.

        • donatj 3 days ago ago

          I've run into things ChatGPT has straight up refused to talk about many times. Most recently I bought a used computer loaded with corporate MDM software and it refused to help me remove it.

          • gizmodo59 3 days ago ago

            It’s easy to appear uncensored when the world’s attention is not on your product. Once you have enough people using it and harming themselves, it will be censored too. In a weird way, this is helping Grok avoid getting bogged down by lawsuits, unlike OpenAI.

            • londons_explore 3 days ago ago

              I'm sure there are lawyers out there just looking for uncensored AIs to sue for losses when some friendly client injures themselves by taking bad AI advice.

        • TheDong 2 days ago ago

          I sometimes use LLM models to translate text snippets from fictional stories from one language to another.

          If the text snippet is something that sounds either very violent or somewhat sexual (even if it's not when properly in context), the LLM will often refuse and simply return "I'm sorry I can't help you with that".

    • cedws 2 days ago ago

      Big context window is an amplifier for LLMs. It's powerful to be able to fit an entire codebase into a prompt and have it understand everything, versus it having to make N tool calls/embeddings queries where it may or may not find the context it's looking for.

  • Frannky 2 days ago ago

    I started with ChatGPT, then moved on to Claude, and then discovered Grok. But now I've stopped paying for any of them. Claude edged out ChatGPT in quality, while Grok stood out with its generous usage limits. That all changed, though, once they rolled out the agent system and RLHF. Suddenly, the model slowed to a crawl, veering off on wrong paths and getting lost in its own reasoning. Those endless, super-annoying RLHF popups didn't help either.

    My theory? They were scrambling for a competitive edge and were willing to swallow some short-term pain. Plus, it feels like they shifted focus away from keeping coders deeply in the loop.

    In the end, we vote with our wallets—if it doesn't click, just walk away. I still dip into Grok, but only the free tier: Grok 4's fast mode for tackling planning and first generation, and then Qwen Coder for the code editing and clerical tasks. The latest version of Grok holds up about as well as the old Grok 3, just with way more steps...

    • giancarlostoro 2 days ago ago

      I guess I joined Claude late, but it's been working pretty decently for me. I've been using Claude Code with Zed now that it's a native feature. Honestly, if you're building coding APIs for your LLM and you aren't working with the Zed folks to get your model natively into that editor, you're messing up big in my eyes; it's just done so well.

      My biggest gripe with Grok is they're not really integrated in all the great tooling I use. I know I can use an API key with Zed, but come on, you want to compete with something like Claude Code? You need to integrate with the tools devs actually use. If they want to rush on anything, get it on more tools.

  • changoplatanero 3 days ago ago

    Anyone can make a long context window. The key is if your model can make effective use of it or not.

    • jtrn 2 days ago ago

      The number of times I know that my instruction is in context, but it’s forgotten, is countless at this point for me. My experience, both as a clinical psychologist and a developer, is that there is a convergent trend in how I speak to both clients and AI. I can see much of my therapist approach in how I try to highlight the important things to focus on to achieve progress. Often, it’s about helping the client articulate and understand what’s important to them and how they rank these priorities. The same applies to AI.

      It feels obvious now that the problem with attention and context is the lack of hierarchy or levels of importance. We have, probably biologically based, three types of memory: short-term, intermediate, and long-term. Long-term memory is what you use with MCP, web search, and RAG. Short-term memory is the current response, and intermediate memory is the current context. When I assume this in my interactions with an agent, it makes perfect sense where it falters and what it forgets, in exactly the same way as people. It feels more and more like talking to a human, with the same weaknesses in logic, reasoning, and focus.

    • ggeorgovassilis 3 days ago ago

      I came here just to complain about that :-) All LLMs I've used seem to give more weight to things at the beginning of the context window and omit many details. E.g., I tried this simple thing: I pasted a friend's CV and my own into Gemini and asked it to recommend topics for a joint conference presentation. The results depended greatly on the order in which the CVs were pasted.

      • TheOtherHobbes 3 days ago ago

        The middle tends to be underweighted. The beginning and end get more attention.

      • otabdeveloper4 2 days ago ago

        That's because when they say "long context window" they're lying and they actually mean that they support a long input prompt that is still compressed into a small context window. (Typically by throwing out tokens in the middle.)

        An actually large context window is impossible due to how LLM attention works under the hood.

        • acuozzo 2 days ago ago

          Mamba-2 enters the chat.

    • d4rkp4ttern 3 days ago ago

      There are “needle in the haystack” benchmarks for long context performance. It would be good to see those.

      • throwuxiytayq 2 days ago ago

        These aren’t really indicative of real world performance. Retrieving a single fact is pretty much the simplest possible task for a long context model. Real world use cases require considering many facts at the same time while ignoring others, all the while avoiding the overall performance degradation that current models seem susceptible to when the context is sufficiently full.

        • d4rkp4ttern 2 days ago ago

          I agree, retrieving a single fact is necessary but not sufficient.

    • chucknthem 3 days ago ago

      How do they make the context window longer? (serious question, I want to learn how this works)

      • TheCoolGuy 3 days ago ago

        At inference time, you literally just shift the window over to the next token once you reach the max number of tokens you want for the context window; this is separate from what you train on (you're only limited by memory now).

        This has obvious issues, since you're now losing information from the now-unseen tokens, which becomes significant if your context window is small compared to the question/answer you're looking at. That's why companies try to offer stupidly large context windows. The problem is they're not training on the large context window; they're training on something smaller (2048 tokens and up). Because of how attention is set up, you can train on a small context and extrapolate to a much larger number of tokens: they train via RoPE, which teaches the model about words and their offsets to neighboring words. This lets you effectively 2x, 3x, 10x, or 100x the number of tokens you generate versus train on with some consistency, BUT it still causes a lot of consistency issues, since the model ends up in a “this was trained on snippets but not the entire thing” situation, where it has a notion of the context but not the entire combined context.
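
        The inference-time part of that is trivial; a sketch (model.next_token is a hypothetical stand-in for a real decoding call):

            MAX_CTX = 4096  # whatever length the model is served with

            def generate(model, tokens: list[int], n_new: int) -> list[int]:
                """Greedy decoding with a sliding window: the oldest tokens fall out of view."""
                for _ in range(n_new):
                    window = tokens[-MAX_CTX:]               # keep only the most recent tokens
                    tokens.append(model.next_token(window))  # hypothetical model call
                return tokens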

        • vlovich123 3 days ago ago

          That’s a very basic way to keep the LLM inferring past the context window size (there are better, smarter ways), but that’s not at all what the question was, which is how they train a 2M-token window. My understanding at a basic level is that you need corpora that are >2M tokens long as training data, which is where the problem comes in: there’s only so much long-form content, and it’s swamped by all the smaller stuff. I think there are probably tricks now, but I suspect it’s still largely an open problem.

          • Ey7NFZ3P0nzAe 3 days ago ago

            AFAIK nobody does that. They train on much, much shorter text but use tricks in the position-encoding step that the LLMs can extrapolate, like RoPE and YaRN, etc.
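
            E.g. plain position interpolation is just rescaling the positions RoPE sees, so a model trained on 4k positions can be run at 32k (a numpy sketch of the idea; YaRN adds per-frequency corrections on top of this):

                import numpy as np

                def rope_angles(pos: int, dim: int, base: float = 10000.0, scale: float = 1.0):
                    """RoPE rotation angles for one position; scale < 1 squeezes long
                    contexts back into the position range seen during training."""
                    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
                    return (pos * scale) * inv_freq

                # trained on 4k, run at 32k: rescale positions by 4096/32768
                angles = rope_angles(pos=20000, dim=128, scale=4096 / 32768)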

            • ErikBjare 2 days ago ago

              AFAIK (which is not much), it definitely helps to train on longer sequences even with RoPE/YaRN, and it is needed if you care about long-context performance (and not just the long-context capability).

    • 2 days ago ago
      [deleted]
    • retinaros 3 days ago ago

      no one makes effective use of long context.

      • DrSiemer 3 days ago ago

        It's not the most energy-efficient workflow, but I work on relatively small codebases, and I made a tool that lets me dump all of it into an LLM with a single copy/paste. This works surprisingly well with Gemini 2.5 Pro (1,000,000 ctx).

        The only real mistakes it makes are some model-specific quirks, like occasionally stripping out certain array index operators. Other than that, it works fine with 150,000-token conversations. I've gone up to 500,000 with no real issues besides a bit of a slowdown. It's also great for log analysis, which I have pushed up to 900,000 tokens.
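
        For anyone curious, the tool itself can be tiny; a sketch of the idea (the extension list is whatever your project uses):

            from pathlib import Path

            EXTS = {".py", ".ts", ".md"}  # adjust to your project

            def dump_codebase(root: str) -> str:
                """Concatenate every source file into one paste-able blob, with path headers."""
                parts = []
                for path in sorted(Path(root).rglob("*")):
                    if path.is_file() and path.suffix in EXTS:
                        parts.append(f"===== {path} =====\n{path.read_text(errors='replace')}")
                return "\n\n".join(parts)

            print(dump_codebase("."))  # paste the output into the model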

    • bigyabai 3 days ago ago

      Long context window = huge amounts of vacant VRAM = our servers are fucking empty

      • trash_cat 3 days ago ago

        But isn't context window dependent on model architecture and not available VRAM that you can just increase or decrease as you like?

        • reasonableklout 3 days ago ago

          Most attention implementations can work across an arbitrarily long context.

          The limiting factors are typically:

          1. Often there are latency/throughput requirements for model serving, which become challenging to fulfill at a certain context length.

          2. The model has to be _trained_ to use the desired context length, and training becomes prohibitively expensive at larger contexts.

          (2) is even a big enough problem that some popular open source models that claim to support large context lengths in fact are trained on smaller ones and use "context length extension" hacks like YaRN to trick the model into working on longer contexts at inference time.

        • onion2k 3 days ago ago

          The model will use the full context if it's been designed well, but you can still increase the size of the window on models where it hasn't. It's just pointless. People who don't know much about LLMs will still think "bigger number is better" though.

    • nbardy 3 days ago ago

      No they can't; it's an N^2 algorithm, so just fitting it in the context window is a challenge.

      And sure, maybe not all 2M of it is usable, but they're reliably pushing the frontier here.

    • mg 3 days ago ago

      If a model is not making use of the whole context window - shouldn't that be very noticeable when the prompt is code?

      For example when querying a model to refactor a piece of code - would that really work if it forgets about one part of the code while it refactors another part?

      I concatenate a lot of code files into a single prompt multiple times a day and ask LLMs to refactor them, implement features or review the code.

      So far, I never had the impression that filling the context window with a lot of code causes problems.

      I also use very long lists of instructions on code style on top of my prompts. And the LLMs seem to be able to follow all of them just fine.

      • MallocVoidstar 3 days ago ago

        I don't think there are any up-to-date leaderboards, but models absolutely degrade in performance the more context they're dealing with.

        https://wandb.ai/byyoung3/ruler_eval/reports/How-to-evaluate...

        >Gpt-5-mini records 0.87 overall judge accuracy at 4k [context] and falls to 0.59 at 128k.

        And Llama 4 Scout claimed a 10 million token context window but in practice its performance on query tasks drops below 20% accuracy by 32k tokens.

        • mg 3 days ago ago

          That makes me wonder if we could simply test this by letting the LLM add or multiply a long list of numbers?

          Here is an experiment:

          https://www.gnod.com/search/#q=%23%20Calcuate%20the%20below%...

          The correct answer:

              Correct:    20,192,642.460942328
          
          Here is what I got from different models on the first try:

              ChatGPT:    20,384,918.24
              Perplexity: 20,000,000
              Google:     25,167,098.4
              Mistral:    200,000,000
              Grok:       Timed out after 300s of thinking
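
          For anyone who wants to reproduce this with their own numbers, here is a sketch of a harness that builds such a prompt and computes the exact reference answer (the operand count and ranges here are arbitrary):

              import random
              from decimal import Decimal, getcontext

              getcontext().prec = 200  # plenty of digits, so the product stays exact
              random.seed(0)

              nums = [Decimal(random.randint(1, 99999)) / 1000 for _ in range(20)]
              prompt = ("Calculate the below product. Do not use a calculator. "
                        "Do it in your head.\n" + " * ".join(str(n) for n in nums))

              expected = Decimal(1)
              for n in nums:
                  expected *= n

              print(prompt)
              print("Correct:", expected)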
          • gcanyon 3 days ago ago

            > Do not use a calculator. Do it in your head.

            You wouldn't ask a human to do that, why would you ask an LLM to? I guess it's a way to test them, but it feels like the world record for backwards running: interesting, maybe, but not a good way to measure, like, anything about the individual involved.

          • throwuxiytayq 2 days ago ago

            I’m starting to find it unreasonably funny how people always want language models to multiply numbers for some reason. Every god damn time. In every single HN thread. I think my sanity might be giving out.

            • solatic 2 days ago ago

              A model, no, but an agent with a calculator tool?

              Then there's the question of why not just build the calculator tool into the model?
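
              A calculator tool is basically just this (a sketch; the tool-call shape in the comment is the generic pattern, not any one vendor's exact schema):

                  import ast
                  import operator as op

                  OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}

                  def calculator(expr: str) -> float:
                      """Safely evaluate the +,-,*,/ arithmetic the model asks for."""
                      def ev(n):
                          if isinstance(n, ast.BinOp) and type(n.op) in OPS:
                              return OPS[type(n.op)](ev(n.left), ev(n.right))
                          if isinstance(n, ast.UnaryOp) and isinstance(n.op, ast.USub):
                              return -ev(n.operand)
                          if isinstance(n, ast.Constant) and isinstance(n.value, (int, float)):
                              return n.value
                          raise ValueError("unsupported expression")
                      return ev(ast.parse(expr, mode="eval").body)

                  # Agent loop: when the model emits a tool call like
                  #   {"tool": "calculator", "args": {"expr": "123.45 * 678.9"}}
                  # run calculator(...) and feed the result back as the tool response.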

          • KristoAI 2 days ago ago

            Since Grok 4 Fast got this answer correct so quickly, I decided to test more.

            Tested this on the new hidden model of ChatGPT called Polaris Alpha. Answer: 20,192,642.460942336

            Current GPT-5 medium reasoning says: After confirming my calculations, the final product P should be 20,192,642.460942336

            Claude Sonnet 4.5 says: “29,596,175.95 or roughly 29.6 million”

            Claude haiku 4.5 says: ≈20,185,903

            GLM 4.6 says: 20,171,523.725593136

            I’m going to try out Grok 4 fast on some coding tasks at this point to see if it can create functions properly. Design help is still best on GPT-5 at this exact moment.

          • jarek83 3 days ago ago

            Isn't it that LLMs are just not designed to do calculations?

            • cluckindan 3 days ago ago

              They are not LMMs, after all…

            • mg 3 days ago ago

              Neither are humans.

              • cuu508 3 days ago ago

                But humans can still do it.

          • KristoAI 2 days ago ago

            [flagged]

  • daft_pink 2 days ago ago

    My experience with AI is that you generally want to keep your context as small as possible and this is only useful when your relevant context is actually 2m tokens.

    • hereme888 2 days ago ago

      That's my experience as well.

  • htrp 2 days ago ago

    Any details on exactly how they accomplished this? LongRoPE?

  • cactusplant7374 3 days ago ago

    I had a failed refactor with Codex recently and I am wondering if context window size is the cause.

    • jakevoytko 3 days ago ago

      With the current crop of LLMs/agents, I find that refactors still have to be done at a granular level. "I want to make X change. Give me the plan and do not implement it yet. Do the first thing. Do the second thing. Now update the first call site to use the new pattern. You did it wrong and I fixed it in an editor; update the second call site to match the final implementation in $file. Now do the next one. Do the next one. Continue. Continue.", etc.

    • port3000 3 days ago ago

      I use Claude Code, haven't used Codex yet (should I?) - but in Claude code you can spin up sub-agents to handle these big refactors, with the master context window just keeping track of the overall progress, bugs, etc and providing instructions to the subagents to do the rote work.

      • mrud 3 days ago ago

        IMO yes. It is less polished, but the model is way better. I moved over from Claude completely and cancelled my Max subscription. Less polished and slower, but the results are better and you have to do less steering.

    • sgc 3 days ago ago

      I'm not an expert AI user (and have never touched Codex), but for anything remotely important I do, I force the smallest context window possible. I just did something very beautiful using that principle, which will soon be ready to show the world. It would have been a garbled pile of garbage with long context windows.

      Obviously major architectural changes need a bigger context window. But try to aggressively modularize your tasks as much as you can, and where possible run batch jobs to keep your workflow moving while each task stays a smaller chunk.

    • enraged_camel 3 days ago ago

      For complex refactors, I use "max mode" in Cursor, which in my experience noticeably improves the AI's performance and makes it go for a lot longer before it starts to drift. I haven't looked into how it works exactly, but it works well if you don't mind the extra cost.

      • whywhywhywhy 2 days ago ago

        Had some bad experiences with max mode and the latest Claude spending significant time on writing worthless .md files rather than solving problems

  • 2 days ago ago
    [deleted]
  • drivingmenuts 2 days ago ago

    Honestly, if Elon Musk told me what time it was, I wouldn't trust him.

  • behnamoh 3 days ago ago

    Who here actually uses Grok? It's sad to see Elon's arc, but when he doubled down on some of his political ideas he had it coming, with Tesla sales going down and x.ai not being taken seriously.

    I've always tried to remain apolitical and unbiased, but it's hard to overlook who's behind a technology you wanna buy. Not that sama and the others are saints either; it's just that Elon's very obvious and vocal about it.

    It's a shame, really, because Grok is a good model. But Elon promised to open-source the previous model, and it took them forever to do that with Grok 3. Sorry, but I wanna buy from someone who keeps their promises ("FSD by next year").

    • RobKohr 3 days ago ago

      I like Grok for non-coding stuff. I find it hasn't been tuned for "Safety" (meaning it isn't tuned much for political correctness). It also seems good at making up images and stories. I run choose-your-own-adventure stories with my kids through it. We tell it who each of their characters is and what the theme is for the night, and Grok gives each of them a section of story and 4 choices. They also have the option of choosing something different than what's suggested. We have it cycle through turns for everyone. It works pretty well, and if the kids wanna go dark (preteen boy), Grok doesn't mind the violence.

      Kinda reminds me of the video game from Ender's Game.

      • wqaatwt 3 days ago ago

        > it isn't tuned much for political correctness

        It was tuned to be edgy and annoying though (I mean his general style of speech not necessarily the content).

        • simondotau 2 days ago ago

          Nothing in AI is more edgy and annoying than beginning every response with a mandatory glazing, like ChatGPT. “That’s a really insightful question, and shows that you really understand the subject!”

          • chownie 2 days ago ago

            Nothing is more edgy than the AI being too polite? Are we just inventing new meanings for words?

            • simondotau a day ago ago

              Politeness is not the same thing as gratuitous praise. Politeness is appropriate; being excessively glazed for asking an obvious follow-up question is weird.

        • 2 days ago ago
          [deleted]
        • razingeden 2 days ago ago

          early iterations i could immediately peg as grok content based on its condescending snarky “OOoooOoOo — so much to unpack here sweaty, lets get started” tone.

          im open minded and ive fed grok a few requests recently. it was better at doing creative fiction prompts without the “eddie izzard coming down off of a fifteen day coke bender” vibe.

          everything i ask it to do is completely made up nonsense so i dont have an opinion about its bias or the quality of its factual content.

          snark and clapback made the world go around on xitter. maybe thats what they thought people wanted. savage insulting content to “own” people. i for one, also found it extremely annoying.

        • vachina 3 days ago ago

          [flagged]

          • wqaatwt 3 days ago ago

            Where did you get that from?

            ‘Annoyed’ and ‘offended’ have rather different meanings.

      • vlovich123 3 days ago ago

        > meaning it isn't tuned much for political correctness

        Is being tuned for right wing viewpoints the same as not being tuned for political correctness? Because there is tuning happening to a specific viewpoint:

        https://gizmodo.com/elon-says-hes-working-to-fix-grok-after-...

        • gitaarik 3 days ago ago

          Yeah, but you can argue that the AI has been biased because of biased training data.

          Ultimately every AI is biased based on what you train it on and how you instruct it.

          I tend to use LLMs from different companies and personally compare them, and read between the lines.

          • Yoric 3 days ago ago

            > I tend to use LLMs from different companies and personally compare them, and read between the lines.

            Read between the lines? Does this mean that you're using LLMs as a source of information?

          • wohoef 3 days ago ago

            The point of LLMs is that there’s nothing in between the lines.

            Or do you mean to say that you are trying to find the specific bias each model has?

    • minimaxir 3 days ago ago

      Going off OpenRouter's rankings (https://openrouter.ai/rankings), Grok Code Fast 1 is the most used model by a significant margin, and since those metrics are calculated as of this week, that's after providers stopped giving free promotional access to it. Grok 4 Fast, which was never free, is #5 on that list.

      In terms of models, Grok 4 Fast has essentially zero restrictions on safety, which a) makes it unusable for most applications that allow user input and b) makes it extremely useful for certain applications.

      • BoredPositron 3 days ago ago

        It's the only model that lets you do gooner shit. That's why the usage is highly skewed. You can just call a horse a horse if you see one.

        • Squarex 3 days ago ago

          this is a code model, not the general one

          • BoredPositron 3 days ago ago

            you are so naive. lol. It's a general model with the tag "code" added to it.

        • jasonvorhe 3 days ago ago

          This is nonsense. grok-code-fast-1 is just part of many free tiers of agentic coding assistants like Cline etc.

    • rjdj377dhabsn 3 days ago ago

      For at least the last year, I've been using Grok for 90% of my queries. I pay for their $30 plan as well as $20 for Claude Code, which I only use for simple development projects. For anything more complicated, Grok's expert mode has consistently better results.

    • raincole 3 days ago ago

      In my experience Grok Fast is the best "cheaper" model out there, far better than Haiku 4.5 and Gemini Flash. I don't think the other cheaper models should be taken seriously at this point.

      • behnamoh 3 days ago ago

        Gemini Flash is the first model I disable in any tool I use. It's a joke, and to add insult to injury, Google announced a "lite" version of it as well!

    • kelsolaar 3 days ago ago

      As you point out, Sam Altman is not exactly an altar boy: https://fastcompany.co.za/business/2025-11-07-sam-altmans-tr...

      • andai 3 days ago ago

        Thought this would be about the whistleblower. They didn't even mention it!

        • roman_soldier 3 days ago ago

          Yes, allegedly having an employee bumped off for whistleblowing, and the sister thing, are way worse than someone having a different opinion than you. One is criminal; the other is free speech.

          • ramraj07 3 days ago ago

            One is alleged; the other isn't just an opinion. It's estimated that several hundred thousand deaths have already happened from the abrupt USAID cuts initiated by DOGE.

          • jamespo 2 days ago ago

            "roman soldier" indeed

      • darkwater 3 days ago ago

        I don't think you can compare the usual internal backstabbing between executives with someone who literally directed and participated in acts of the US Government, and who keeps saying and doing things to help and nurture a certain side of the political spectrum.

        • KingMob 3 days ago ago

          Fair, but don't forget Altman's sister accused him of sexual abuse in court. (https://www.newsweek.com/sam-altman-openai-sister-annie-sexu...)

          Dunno if it's true. The family wrote it off, saying she's mentally ill, but I can also see years of abuse leading to mental illness.

        • vasco 3 days ago ago

          Both do both.

          • diputsmonro 3 days ago ago

            Did Sam Altman lead a government agency and camp in the Oval Office for months too? Degrees matter.

          • wqaatwt 3 days ago ago

            Not to an even remotely same degree..

    • LorenDB 3 days ago ago

      I do! I have felt bad vibes from OpenAI for a while now, and eventually defaulted to Grok as somewhat the lesser of many evils. I respect anybody who doesn't wish to use it, but it's good enough for what I need it for. Case in point: it just spit out valid OpenSCAD code for an adapter piece I want to 3D print.

      • anon214535 2 days ago ago

        I don't understand how anyone can think Grok is the lesser of many evils. It seems to me that Grok is currently playing in its own league of evil.

        Most models belong to capitalist companies that are fairly apolitical and all they care about is money. Their evil comes from not caring about consequences as long as it grows their value. Their censorship come from the desire to avoid PR disasters.

        On the other hand, Grok belongs to a billionaire involved in destroying America's democracy, and it's being openly manipulated according to Musk's ideology. I can't think of a model I would trust less.

      • Glamklo 3 days ago ago

        [flagged]

      • dmead 3 days ago ago

        [flagged]

        • ronsor 3 days ago ago

          I find it funny that people are still calling Grok "mechahitler" as if that weren't prompted by trolls and the AI model is going to set up concentration camps on every block.

          • littlestymaar 3 days ago ago

            > if that weren't prompted by trolls

            It is, but the troll is the CEO playing with the system prompt…

          • Glamklo 3 days ago ago

            [dead]

        • jeffhuys 3 days ago ago

          You know, feel free to keep thinking this. In my experience Grok is the best. I don't buy into the weird groupthink that happened just because trolls took advantage of Grok's absence of lobotomy. Kind of a superpower.

        • LorenDB 3 days ago ago

          I feel compelled to point out that the Mechahitler thing was prompted by bad actors hiding invisible tokens in tweets, but sure, it's maybe an unpopular opinion.

          Basically, the major free options out there for LLMs are OpenAI, Google, Perplexity, DeepSeek, Meta, and Grok. (I could be missing stuff here, but those are the main players.) DeepSeek is out because of China ties. OpenAI and Perplexity have CEOs that seem incredibly shifty to me. I refuse to give Meta and Google any more info than I have to, so I'm avoiding them. Hence we fall back to Grok. Again, maybe not a completely logical progression, but it's my choice and I get to live with the consequences :)

          • wqaatwt 3 days ago ago

            > CEOs that seem incredibly shifty to me

            Yet the next level beyond “incredibly” somehow makes it alright again?

          • dmead 3 days ago ago

            The best ones are out for... reasons? This seems completely bad faith and honestly really Elon Musk fanboyish.

            Literally none of the options you listed are that objectionable.

            Do what the rest of us do and switch frequently. Don't use mekafurhur and you'll be fine.

    • supriyo-biswas 3 days ago ago

      I've been occasionally using Grok and found it good for devops stuff; specifically it often is able to explain and produce working configurations without getting lost or introducing subtle mistakes as I've sometimes seen with other models.

    • schappim 3 days ago ago

      I used Grok to successfully split a large 10K-line file of spaghetti code into multiple smaller well organised files. This was after giving the same task to Claude, OpenAI, and Gemini, all of which consistently failed.

      Grok certainly has its uses, but I default to OpenAI for most business tasks and Claude for code.

    • weird-eye-issue 3 days ago ago

      > I've always tried to remain apolitical and unbiased

      Clearly

    • mudkipdev 3 days ago ago

      I don't but only because the model is not satisfying, not because I dislike Tesla

    • Bender 2 days ago ago

      I used it to calculate the size of a greenhouse using a lot of inputs and restrictions. It did that fine, but the one thing I did not appreciate was its sense of humor. It said the excavator would be here first thing Monday along with a pot of coffee. Just tell me a dad joke, or skip the attempt at humor altogether.

    • replwoacause 2 days ago ago

      I won't go near anything Elon touches because of this. He's a clown.

    • XCSme 2 days ago ago

      I use Grok 4 Fast via API: cheap, fast, and really well suited for data parsing/extraction, a lot better than Gemini 2.5 Pro for example.

    • YetAnotherNick 3 days ago ago

      Grok Fast is by far the most used model on OpenRouter, with more than a trillion tokens weekly[1].

      [1]: https://openrouter.ai/rankings

      • behnamoh 3 days ago ago

        Because some tools (AFAIR Kilo Code but I might be wrong) gave it away for free. The model itself was (still is?) free for a while, so I'm not surprised.

        • ribelo 3 days ago ago

          OpenRouter is not counting tokens used by Kilo or Cline. They have their own endpoints.

          • wqaatwt 3 days ago ago

            Yet if you go to the actual model’s page:

            https://openrouter.ai/x-ai/grok-code-fast-1

            Cline and Kilo Code are in the top 3. So how does that work?

            It’s considerably cheaper than competing models like 2.5 Flash, though, so it’s not that surprising.

            • YetAnotherNick 2 days ago ago

              It doesn't include the free usage. There is a separate model named Grok Code Fast 1 (free).

    • galaxy_gas 3 days ago ago

      I have tried it a few times in Copilot as Code Fast 1 because it was advertised. It has never done anything correctly so far. Maybe because it's the fast version?

      • jasonvorhe 3 days ago ago

        Maybe you just used it wrong? I refactored a complicated codebase, built exhaustive tests for a CLI app, and I've been maintaining and building out several k8s clusters from a monorepo using Cline + grok-code-fast-1, and it's been a breeze.

    • Void_ 3 days ago ago

      Half of USA voted for Trump. That should answer “who actually uses Grok”.

      I personally use the best tool for the job, which Grok sometimes is.

      • aaronbrethorst 3 days ago ago

        Trump received 77.3 million votes. Harris received 75 million votes. The US population is about 342 million.

        • herbst 3 days ago ago

          I am not sure why these numbers would matter. He won, obviously, because a majority of voters voted for him.

          Those voters are Americans: Americans who either voted for him or didn't do enough against him.

          There is really no excuse for democratically voting for a person like this and letting all this bullshit happen.

    • gitaarik 3 days ago ago

      All proprietary AIs are probably biased in some way. I mean, that is the power of them and the reason they're proprietary, right?

      So I tend to use different LLMs from different providers, personally compare them and read between the lines.

    • chistev 3 days ago ago

      What models are better than Grok?

      • dymk 3 days ago ago

        Sonnet-4 and onward, GPT-4 and onward

        • whywhywhywhy 2 days ago ago

          Saying “GPT-4” is misleading: launch GPT-4 was significantly better than anything after the Dev Day downgrade, all the 4o nonsense, etc.

          In reality, GPT really sucked from Dev Day until 5, when it redeemed itself.

        • NaomiLehman 3 days ago ago

          and GLM-4.6

    • apu6865i 3 days ago ago

      Let me give you a perspective. For Indians, Winston Churchill is no different than Hitler; the guy was responsible for millions of deaths in the Bengal famine. But for you, and I assume the majority of this forum and Westerners, he is a hero. Next to Winston Churchill, Elon appears like a saint!

    • roman_soldier 3 days ago ago

      At least Elon is open about what he believes. Other CEOs hide behind corporate PR machines; how do you know they are not psychopaths?

      • KingMob 3 days ago ago

        > At least Elon is open about what he believes.

        @dril: "you do not, under any circumstances, 'gotta hand it to them'"

      • sidibe 2 days ago ago

        There's a nonzero chance they are not psychopaths. Elon reminds us daily about his chances

      • voganmother42 2 days ago ago

        Yeah he was really open about his salute eh soldier?

    • sipsi 3 days ago ago

      i didn't

    • wetpaws 3 days ago ago

      [dead]

    • dynjo 3 days ago ago

      [flagged]

    • jacquesm 3 days ago ago

      [flagged]

    • whywhywhywhy 2 days ago ago

      Grok's underrated, honestly. If you have to market on X you need a sub anyway, so it has replaced the sort of casual questions I used to Google, and I'm not seeing anything worse than ChatGPT; often it's better. It's much better on current events.

      The video gen is actually really good fast and cheap for short videos.

      I still use Claude and GPT-5 for work tasks, but I haven't tried Grok extensively for those.

    • Tycho 2 days ago ago

      I use Grok more than other LLMs. It’s built into X, so the use case of pressing the Grok button on a post to see an explanation for something I didn’t understand, or a fact check for something I doubted, or just more background on a subject, is by far the most frequently useful feature of AI in my day to day life.

      People seem to nitpick a lot. Grok 3 came out in, what, March? Cost how many tens of millions to train? And you’re mad because it’s not open source yet?

  • ronsor 3 days ago ago

    This post really has no reason to be flagged. I know Elon is controversial, and I have a lot of gripes with his business practices myself, but this is literally just documentation for a frontier LLM. Can we stay on topic?

    • hu3 3 days ago ago

      This. We like to think of ourselves as engineers, but we often behave like a bunch of emotion-driven primitives.

      Honestly this kind of behaviour would be a huge red flag during interviews.

      I have problems that current LLMs can't solve efficiently due to context window sizes. And welcome any improvement in this space.

      • autop0ietic 2 days ago ago

      I personally can't stand Musk, but for many he has become an Emmanuel Goldstein character: even the mention of his name causes the most extreme emotional disgust, from all the exposure to this strange, algorithmic Two Minutes Hate.

    • big-and-small 3 days ago ago

      This. I wouldn't pay to use it, but big context windows are amazing for programming and especially prototyping when you can keep whole codebase in context.

      Gemini's 1M is amazing.

    • ramraj07 3 days ago ago

      Here's an on-topic question: all the frontier-model companies "promise" that they won't store and train on your API usage if you pay for it. Who do you trust? I for sure will absolutely assume Grok will just use the data I submit for training in perpetuity. That's a scary thing for me, and for anyone else doing real work this should be great cause for worry if they wish to use Grok.

      • pixel_popping 2 days ago ago

        Do you really think Google isn't logging all our prompts?

        • ramraj07 a day ago ago

          I will trust Google to abide by the rules more than any other big tech firm. I'll make that bet with all my money. Not because I think they're good guys, but because from everything I have learned they have a culture that abides by rules like these. If they say they won't train on API use (and they do say it), I feel assured they won't.

    • tastyface 2 days ago ago

      He’s not “controversial,” he’s a far-right hate monger and Grok is part of his hate-mongering war machine. (Heck, the man spends half his social media time inciting civil war and whitewashing racist politicians.) No self-respecting “hacker” would spend a moment of their time on this pathetic excuse for technology. Fuck Grok.

    • ergocoder 3 days ago ago

      [flagged]

      • poly2it 3 days ago ago

        Are you implying that avoiding use of services controlled by a fascistoid oligarch is cope?

        • ergocoder a day ago ago

          Not really.

          If we are gonna avoid using services controlled by an oligarch, then just say that. No need to go through the 5 steps of denial.

          I hope we are not against being honest... but you can see a bunch of comments here going through those steps already.

    • oulipo2 3 days ago ago

      The politics of the owners IS the topic. It's really naive (read: stupid) to think that this has no implications for society.

      • TheOtherHobbes 3 days ago ago

        You're literally handing over your code to a third party.

        In fact AI is handing over the process of creating code - eventually all code - to a small number of third parties, who will have complete power over the world's IT infrastructure.

        No wonder they have wildly inflated valuations. The potential to enforce authoritarian policies through opaque technology is unprecedented.

    • bdangubic 2 days ago ago

      Grok is not an LLM, it is a "not-so-large-take-out-what-Elon-doesn't-like LM" - no documentation necessary :)

  • raincole 3 days ago ago

    It's funny how fast this post was flagged, lol. Have other LLMs or blatant ads gotten the same treatment on HN?

    • latexr 3 days ago ago

      > Have other LLMs or blatant ads gotten the same treatment on HN?

      Yes, I’ve seen it happen multiple times.

    • hereme888 3 days ago ago

      [flagged]

      • jauntywundrkind 2 days ago ago

        I believe those people are eager to discuss Musk. The people suppressing Musk discussion are the forces backing him, who are out here working to suppress inconvenient speech.

        • hereme888 2 days ago ago

          I honestly can't compute the logic of your statement.

      • jauntywundrkind 3 days ago ago

        [flagged]

        • j3th9n 3 days ago ago

          In what universe do you live?

  • nsoonhui 3 days ago ago

    It's a shame that the top comments focus more on Elon Musk, his personality and his politics than on the quality of the model per se.

    Speaking of Elon, regardless of what you think of him, he really does get things done, despite the naysayers -- SpaceX, Tesla, Neuralink, even getting Trump elected (despite the subsequent fallout), etc. Even Twitter is finding a second life by becoming a haven for free speech advocates and alternative views, much to the chagrin of the MSM, because they no longer have a monopoly on the "truth", and censoring "fake news" becomes hard.

    People like Elon are almost by definition contrarian (you don't change the world by being a conformist), and that should align well with the predilections of the intended audience here. So it's a surprise to me that HNers are almost uniformly, vehemently anti-Musk. It's almost as if the ultimate embodiment of the hacker spirit -- Musk -- is being rejected by his own kind, the very kind that he is supposed to inspire.

    • m-hodges 3 days ago ago

      > Even Twitter is finding a second life by becoming a haven for free speech advocates and alternative views, much to the chagrin of the MSM, because they no longer have a monopoly on the "truth"

      Of all the silly things to say about Musk and Twitter, the idea that “MSM” are upset about Twitter is among the silliest.

    • wewewedxfgdf 3 days ago ago

      >> regardless of what you think of him, he really does get things done, despite the naysayers -- SpaceX, Tesla, Neuralink, even getting Trump elected

      It matters how people behave.

    • nextaccountic a day ago ago

      > regardless of what you think of him, he really does get things done, despite the naysayers -- SpaceX, Tesla, Neuralink, even getting Trump elected

      Is a billionaire getting a politician elected - even by promising payment to voters (that is, buying votes) - something positive?

      The US is supposed to be a democracy; it's the people that get politicians elected, not billionaires.

    • letmetweakit 3 days ago ago

      In my understanding of the hacker ethos, hackers appear to be genuinely nice people who mean to do good for society and regular people. Elon does not align with those values, according to some people, so they reject him and his activities.

      • vachina 3 days ago ago

        [flagged]

        • victorbjorklund 3 days ago ago

          Accusing a cave diver who made Elon look stupid of being a pedophile, just because Elon can't stand people not thinking he is the smartest? I can give more examples.

        • finebalance 3 days ago ago

          Killing USAID.

        • bertili 3 days ago ago

          Besides the obvious right-wing interference in politics and the weaponization of Starlink in some countries - how can anybody stomach the saving-humanity agenda while he runs a major social media platform irresponsibly, without caring about moderation or its consequences for real people?

          • vachina 2 days ago ago

            I think the lack of moderation is a feature, not a bug. People actually get to express themselves freely, very unlike the sterile feeling you get from mainstream social media, with content engineered for maximum engagement and political correctness for maximum ad revenue.

            X doesn't seem to care about any of that.

          • jasonvorhe 3 days ago ago

            Because someone's moderation is censorship to someone else. Begging Musk for free speech is another issue in itself, though, so you'd better not bet on X allowing you to speak forever.

            • bertili 3 days ago ago

              Free speech is one of those things that is always used as a Trojan horse for doing ultimate good.

              Let us empower anybody to say anything they want AND force everybody to have to listen to it.

              Anonymous free speech is not free speech. There is no accountability. It should not be a human right. It's destroying our societies. The evidence should be clear by now.

        • oulipo2 3 days ago ago

          [flagged]

    • decremental 2 days ago ago

      [dead]

    • oulipo2 3 days ago ago

      [flagged]

    • latexr 3 days ago ago

      [flagged]

    • thereitgoes456 3 days ago ago

      [flagged]

      • aydyn 3 days ago ago

        [flagged]

        • KingMob 3 days ago ago

          But Tesla != Musk. He wasn't actually a founder, he bought his way in, and demanded that everyone agree he was a "founder".

          Not to mention the huge numbers of real scientists working over the decades to improve battery tech to the point where it was obvious that electric cars were going to be viable.

          We shouldn't praise Musk for taking credit for other people's work.

          • whywhywhywhy 2 days ago ago

            Doesn't matter; every normie thinks he is, so his influence impacts Tesla for better or worse.

    • csomar 2 days ago ago

      > he really does get things done

      Really? Most of the stuff he promised never materialized. Elon's genius is that he learned where the money comes from. Both Tesla and SpaceX were financed by government money. That's why he supported Trump and that's why he keeps pumping the stock. He goes directly to the source.

  • ml-anon 3 days ago ago

    [flagged]

  • oulipo2 3 days ago ago

    [flagged]

    • jasonvorhe 3 days ago ago

      What a grandiose nuanced statement. Thanks for this highly enlightening contribution!

    • himujjal 3 days ago ago

      I think this is against HN’s policies. @dang

      • angusturner 3 days ago ago

        I thought exceptions tended to be made when it's highly relevant to the technical topic at hand and also non-controversial.

        Outside of a few weird online bubbles and pockets of the US, hardly anyone disputes the claim you are objecting to.

        • hu3 3 days ago ago

          Regardless, it's just noise.

      • oulipo2 2 days ago ago

        If you don't understand that politics, ideology and technology are deeply intertwined and simply CANNOT be considered separately, you should start educating yourself.

  • angusturner 3 days ago ago

    [flagged]

  • straydusk 3 days ago ago

    [flagged]

    • kburman 3 days ago ago

      Not defending anyone, but by that logic, you shouldn’t be using any Chinese models either.

      • wewtyflakes 3 days ago ago

        Yes, and do you believe they are?

      • snthpy 3 days ago ago

        Why is that? Does a Chinese company really equal the Chinese government?

        • legacynl 3 days ago ago

          Basically yes. China doesn't have a democracy, and its government isn't bound by its laws. If the CCP thinks DeepSeek or any other product/tech can be beneficial to Chinese strategy they will come knocking, and there's no denying whatever they demand. It could be backdooring, data harvesting, etc.; there's really no saying how far they might go.

          • norman784 3 days ago ago

            The same could be said of any LLM, whether offered by a Chinese, European, US, or other company.

          • sureglymop 3 days ago ago

            On the other hand at least you can self host their models. My university now has an inference cluster for students and faculty to use open source models.

          • free_bip 3 days ago ago

            Maybe if you're using their site directly, but what about the open models?

            • Yoric 3 days ago ago

              Well, not using the site probably means that you're avoiding the mini-LLMs placed before and after the main LLM to provide filters (including some layers of censorship), as well as the system prompt.

              So I guess it depends on how deep the bias sits. And that is something that may vary with time. Grok has been a good example of this, with the bias initially being introduced via system prompts, then apparently moved into the synthetic data used to train further generations of Grok.
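
              To make that layering concrete, here's a minimal sketch of a hosted pipeline; the filter models, their behavior, and all the names are hypothetical, not xAI's (or anyone's) actual architecture:

                # Hypothetical host-side wrapper around a main model.
                SYSTEM_PROMPT = "You are a helpful assistant."  # injected by the host

                def input_filter(prompt: str) -> bool:
                    # Small classifier run BEFORE the main model (hypothetical).
                    banned = ["example banned topic"]
                    return not any(term in prompt.lower() for term in banned)

                def main_model(prompt: str) -> str:
                    # Stand-in for the large model itself.
                    return f"(model output for: {prompt!r})"

                def output_filter(text: str) -> str:
                    # Second small model run AFTER the main one (hypothetical);
                    # this is where redaction or refusal would happen.
                    return text

                def hosted_chat(user_prompt: str) -> str:
                    if not input_filter(user_prompt):
                        return "Sorry, I can't help with that."
                    return output_filter(main_model(SYSTEM_PROMPT + "\n" + user_prompt))

                print(hosted_chat("hello"))

              Self-hosting the open weights means calling the model directly, skipping both filters and the host's system prompt - but any bias baked into the weights via training data still comes along.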

          • snthpy 3 days ago ago

            Ok, good points. Thanks

        • nurettin 3 days ago ago

          Probably, as much as a US company equals the US government.

        • retinaros 3 days ago ago

          lol, is that really a question? Even for American companies, look who's on the boards of those companies...

          • snthpy 3 days ago ago

            Fair enough. I'm just sick of the reflexive anti-Chinese hysteria. I wouldn't want to live there personally, and I condemn the human rights abuses as much as the next guy. However, in international politics it's clear who the two largest terrorist regimes have been over the last fifty years, and yet they're still somehow held up as the good guys.

      • littlestymaar 3 days ago ago

        True, though the position of the CCP on Falun Gong or the Tiananmen Square protests is much less likely to impact the life of a westerner than Elon's positions are.

      • alpineman 3 days ago ago

        At least the Chinese models are open source, so you don't need to send money to the Chinese government to use them (unlike Grok 4, where you need to send money to Elon Musk)

        • kburman 3 days ago ago

          “Open source” doesn’t mean “independent.” Most of those labs are state-linked and operate under laws that require compliance with party policy.

          The CCP plays a long game: they want dependency, not donations. Once enough people adopt their stack, they'll set the governance norms and compliance rules around it.

          It's not paranoia, it's policy. Go read their New Generation AI Development Plan; they've been explicit about it since 2017.

          • Yoric 3 days ago ago

            I agree with you. However, open weights (please let's not call these "open source", they are essentially binary blobs) are easier to fine-tune.

      • curiousgal 3 days ago ago

        If I had to pick between a Chinese model and an American model based on the country's politics it would be China every time.

        • snthpy 3 days ago ago

          I wouldn't go quite as far, but overall I agree.

        • aydyn 3 days ago ago

          [flagged]

      • NaomiLehman 3 days ago ago

        or American, using that logic

      • XorNot 3 days ago ago

        I mean I don't, so...what's your point?

      • roman_soldier 3 days ago ago

        [flagged]

        • TheOtherHobbes 3 days ago ago

          That's because they don't have links to far-right organisations around the world, they aren't running a social media site that aggressively promotes hate speech, they're not attempting to foment civil war in countries like the UK, and they don't say rambling, grandiose, crazy shit about colonising Mars or having self-driving cars real soon now - claims which are clearly insane, make no sense to anyone rational, and have a long record of being exaggerated and wrong.

          None of this is hyperbole. All of it is historically documented.

      • jasonsb 3 days ago ago

        [flagged]

    • snthpy 3 days ago ago

      Yeah, same for OpenAI because of Sama. I'm proud to say I haven't touched either for over a year. There are enough good alternatives out there.

    • 3 days ago ago
      [deleted]
    • Lapel2742 3 days ago ago

      [flagged]

      • gitaarik 3 days ago ago

        And there are likewise people who would have no sympathy for those who seem to think it's OK to call people Nazis for baseless and childish reasons.

        • FranzFerdiNaN 3 days ago ago

          If it talks like a Nazi and posts like a Nazi and supports Nazi parties in other countries and sieg heils on national TV like a Nazi, it probably is a (neo-)Nazi.

      • farrighttroll 3 days ago ago

        spoken like a true braindead

      • IshKebab 3 days ago ago

        Come on, I would also never use Grok because Elon is an arsehole. He's not a neonazi though. That's ridiculous.

        • jofzar 3 days ago ago

          Does the internet move so fast that people have forgotten his Roman salute at the Trump rally?

          • wqaatwt 3 days ago ago

            That proves he's an edgy, drugged-out jerk. Raising your hand in a certain way is not sufficient to make you a Nazi.

            • enraged_camel 3 days ago ago

              The guy spends most of his time signal-boosting deeply racist, antisemitic, white supremacist stuff on X. He's obsessed with stuff like "replacement theory" and constantly insists white people must make as many babies as possible to maintain their cultural superiority and avoid being outnumbered by other races. You don't have to believe me. Go check for yourself.

              • Geee 3 days ago ago

                It's extremely disrespectful to call people from minority and shrinking cultures racist or white supremacist for protecting their own culture. Birth rate / demographic / cultural shifts are real problems. Elon has never talked about "white people" or their superiority. These issues have nothing to do with skin color. The same issues are faced by many Asian countries as well.

                • jeffhuys 3 days ago ago

                  It’s called suicidal empathy.

                • enraged_camel 2 days ago ago

                  >> Elon has never talked about "white people" or their superiority

                  I would encourage you to try to avoid making such easily falsifiable claims, and put at least some token effort into your arguments. I was able to find the below with less than five minutes of searching.

                  https://www.timesofisrael.com/after-musk-prods-adl-says-kill...

                  Said tweet: https://x.com/elonmusk/status/1686037774510497792

                  He also endorsed an X post claiming "Jewish communities have been pushing [...] hatred against whites," calling it "the actual truth." https://www.cbsnews.com/news/elon-musk-antisemitic-comments-...

                  He has also repeatedly advanced a version of replacement rhetoric (e.g. claiming Democrats import immigrants to change power via the census), which is essentially a repackaging of the Great Replacement idea, i.e. a racist conspiracy centered on replacing white populations with those from other races and ethnicities. You can, for example, read the transcript of his interview with Don Lemon.

                  So yes, Elon does in fact frequently talk about white people. Even when not explicitly mentioning them, he means them. For example when he says people should have more babies, he specifically means white people: https://newrepublic.com/article/181098/elon-musks-weird-obse...

                  >> Same issues are faced by many asian countries as well.

                  I find your comparison of this issue to issues faced by various Asian countries to be pretty odd, as it does not stand up to critical scrutiny. Asian countries' demographic crises are about internal low fertility and rapid aging, not about being "replaced" by outsiders. Indeed, the arithmetic makes the comparison impossible: Japan, China, South Korea all have extremely tiny foreign populations. Therefore, pointing to Japan/Korea/China's low birth rates to sanitize "replacement" talk is a bad-faith pivot.

                  • Geee 2 days ago ago

                    Demographic change can be caused by just a low birth rate, which is more of an economic issue, but it can also be combined with immigration, which may result in changing culture, i.e. "replacement". This issue is currently mostly faced by people who are white Europeans, but these people also represent many different local cultures. Not to mention that Japan and South Korea have also been increasing their immigration, although it has been quite low so far.

                    All kinds of people have an equal right to defend their own culture. It doesn't mean that they're supremacist or racist, even if they think that their culture is better than some other culture. It's only supremacist if it aims to destroy, repress or subjugate other people by advocating discrimination and violence.

                    Thus, "make more white babies" is not supremacist or racist. Nor is calling out violence against white people in South Africa.

              • wqaatwt 3 days ago ago

                I do. You're certainly right. I just don't like reducing the entire political compass to "marxist/smth." and "nazis". That both devalues these words and leaves out any nuance.

                Guys like Trump or Putin are not nazis (yet). They do resemble Mussolini on various and often quite deep levels. So fascist would probably be the more correct term.

                As for Musk I’m not sure. Drugs and whatever mental issues he’s suffering from likely are distorting the real picture (which might also be even darker).

                • enraged_camel 3 days ago ago

                  My philosophy is "if it walks like a duck, and quacks like a duck, it's a duck". Is there a chance that it's not a duck? Sure. Does it matter? Not unless you're a duck scientist. Ultimately I find that there is very little to gain from thinking hard about whether they are just Nazi boosters/sympathizers or actual Neo-nazis, because in practical terms it makes virtually no difference.

                  • wqaatwt 3 days ago ago

                    I do see your point about how avoiding thinking hard leads to seeing virtually no difference between various somewhat nuanced topics.

                    Oversimplifying everything, reducing complexity to simple catchphrases, and extreme cognitive dissonance are what the "other" side is all about. Adopting their overall approach seems somewhat counterproductive long-term…

                    • enraged_camel 3 days ago ago

                      I recognize the nuance, I just don't think there's anything to gain from trying to understand whether Elon is a Neo-nazi or a more generic white supremacist because, practically speaking, doing so adds no value to my personal or professional life. On the other hand, if my job was to write a book about Elon, I would be compelled to dive much deeper into it.

                      • IshKebab 2 days ago ago

                        There is something to gain, because when you call him a neo-Nazi and he obviously isn't, then nobody will engage with your arguments because you're very obviously wrong.

                        If you say he seems to be racist and supports lots of far-right groups that are overtly racist... then people can't just ignore you.

                      • wqaatwt 3 days ago ago

                        There is if you want to understand why 20-40% of the population across a multitude of countries support people like him.

                        You can brand them all Nazis and shut off the entire conversation. That might be the "morally righteous" thing to do (not sarcasm). What's the point of that, though? You still have to live with them in the same country and vote in the same elections.

            • FranzFerdiNaN 3 days ago ago

              It’s amazing how far people are willing to go to ignore the evidence right in front of their eyes.

            • Lapel2742 3 days ago ago

              [flagged]

              • wqaatwt 3 days ago ago

                So let's focus on the specific things he does and says, instead of compressing everything into a single word (which unfortunately has lost much of its meaning and been massively devalued).

                Nazism is a specific, relatively well-defined and extremely dark ideology. If we apply the label to every unhinged off-brand pseudo-fascist, it can really distort the views people might start having about the original ideology.

                • realusername 3 days ago ago

                  I don't see what else you would need: he did the Nazi salute and supported Nazi parties and ideologies, financially and publicly. What else would be needed?

                  Let's be real, he's only getting a pass because it's Musk.

                • Lapel2742 3 days ago ago

                  That's why I wrote "neo-nazi" and not Nazi. I'm German and I listened to all the stories my grandparents told me about the Nazi reign. In my opinion he is a f'ing neo-Nazi, and if he isn't, he behaves like one, which in the end is the same.

                  You know what I learned about the Nazis from my grandparents? Not every German was a Nazi. In fact, just short of 50% voted for them. And the Nazis didn't start with war and Auschwitz. That was peak Nazi. They started with "Awake, Germany!" and river cruises.

                  The Holocaust was also possible because far too many people who were essentially decent turned a blind eye and found excuses for the Nazis until it was too late.

                  • wqaatwt 3 days ago ago

                    The thing is that he isn't behaving like the actual Nazis from the 30s. He and people like him are doing and trying to accomplish something quite different (although also horrible). What's happening today, and its causes, is only superficially similar to the situation in the 20s and 30s. Dismissing, ignoring and oversimplifying it (as appealing as that is to some people) is counterproductive.

                    > And the Nazis didn't start with war and Auschwitz

                    No, but they were reasonably transparent about who they were well before that.

                    The problem is that the majority (or pretty close to it) of these "essentially decent" people never supported democracy or what it stood for. The Weimar Republic never stood a chance in a society where most people supported rabid jingoism and authoritarianism in general.

          • ilikehurdles 3 days ago ago

            There's a candidate with a literal Nazi symbol tattooed on his chest, and not one Democrat condemned him or demanded he drop out. Guess his party affiliation.

            So let’s just be clear that nobody is playing this fake outrage game anymore.

            • FranzFerdiNaN 3 days ago ago

              Multiple people can be nazi shitheads.

            • monero-xmr 3 days ago ago

              NOOOO! Having a Nazi death's head tattooed on your body is fine because he didn't know what it was! How dare you.

            • khimaros 3 days ago ago

              citation needed

          • lm28469 3 days ago ago

            Why is he licking Israel's boots if he's a neo-Nazi?

            It's almost as if being a piece of shit doesn't immediately make you a Nazi. We should move on from WW2-era lingo; new things need new terms. When everyone is a fascist and a Nazi, no one is, and it weakens the original terms to the point of them being meaningless.

        • misnome 3 days ago ago

          Rule of goats

        • delusional 3 days ago ago

          He did heil a couple of times. And created MechaHitler. And lied about "white genocide" in South Africa. And called a white supremacist talking point "the actual truth".

          We should be careful about labeling people Nazis, but Elon does seem to be playing on the wrong side of that fence.

        • herbst 3 days ago ago

          Central European here. If it quacks like a Nazi, that's exactly what we would call a neo-Nazi.

      • roman_soldier 3 days ago ago

        [flagged]

        • Lapel2742 3 days ago ago

          > [...] infected with the woke mind virus [...]

          I won't argue, other than to tell you: that's peak American infighting, and I'm not American. I'll leave a quotation from Arthur Schopenhauer, a famous philosopher, here. Maybe it improves your condition:

          “The cheapest sort of pride is national pride; for if a man is proud of his own nation, it argues that he has no qualities of his own of which he can be proud; otherwise he would not have recourse to those which he shares with so many millions of his fellowmen. The man who is endowed with important personal qualities will be only too ready to see clearly in what respects his own nation falls short, since their failings will be constantly before his eyes. But every miserable fool who has nothing at all of which he can be proud adopts, as a last resource, pride in the nation to which he belongs; he is ready and glad to defend all its faults and follies tooth and nail, thus reimbursing himself for his own inferiority.”

  • jeffhuys 3 days ago ago

    [flagged]

  • solumunus 2 days ago ago

    Grok? Next…

    • bushbaba 2 days ago ago

      I personally find Grok better for certain tasks. It's better than Gemini for images. It's better than the rest at crude jokes, etc.

    • tacker2000 2 days ago ago

      Yea, no desire to ever use this.

  • vaxman 2 days ago ago

    OpenAI will go to zero unless it agrees to be acquired because they're messing with public company stock valuations using funky purchase orders leaving those public companies no choice but to cancel their credit (at least unless they get a "government backstop" that they say they don't want or need). Those who compete with OpenAI will also "take a hit" if/when that happens, so they would be wise to be looking to make a deal to acquire OpenAI. Dude was from Y Combinator and liked to bank on hope, focusing on capturing market share and worrying about profits later, which is fine in software startups playing with Monopoly money, but when it impacts vendors that are publicly traded companies (to the point that one is now valued at $5T), post-1929 rules come into play. Anthropic has a similar issue, but there, the issue is that their C-suite is making outrageous public statements that are suspected of intending to manipulate the stock values of both private and public competitors and of the publicly held vendors to all of these players. I hope they both go away quietly and someone declares victory rather than the stock market crashing!

    As far as xAI, I doubt it will go to zero or run afoul of any of those market-manipulation issues, because it owns Twitter/X and I think it powers the realtime Tesla cloud. But betting on it is fraught with peril because of the high likelihood that it will wind up under the control of some less capable conglomerate (e.g., GM's acquisition of Hughes Aircraft and resale to Raytheon, Boeing and News/DirecTV).

    Google, Meta, a handful of B actors and China are where we have to place our bets, but only if we ourselves need (or want to invest on the theory that others need) trillion parameter models (and want to risk having the valuations lowered if/when adverse actions are taken against the above competitors).

    • vaxman a day ago ago

      *-"eventually" leaving those public companies no choice but to...

      Clarifying, because there's no way a company (public or private) is going to reduce the credit line of a major customer until it's obvious that the orders "aren't real". But if Wall Street realizes it before they do, they can lose control of their business too. This is not quite Enron or WorldCom/MFS, but it's a very similar storm on the horizon. (BTW, ever wonder why Sprint never could remain airborne and eventually was merged with TeenMobile? It's because they overspent on CapEx trying to keep up with the fraud at WorldCom and could never dig out to actually use all that spectrum. Likewise, we are still dealing with the fallout of the Enron collapse on the US domestic energy grid a quarter century later.)

  • johnnyApplePRNG 3 days ago ago

    But for some reason if I load a 400 KB file into it... it can't even read the file?! Pffft, whatever, Elon. Go play with your rockets.