Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
...
Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
Large Context Window: Supports a 256k context window.
A recent local model I tried is Ministral 3 from a month ago. https://mistral.ai/news/mistral-3
My wildly uneducated guess is that they are getting to the point where they need to figure out how to profit off all this investment, and releasing self-hosted open-source models isn’t going to help them do that.
Possibly, but it's not just the release of new models. It seems the community itself has lost interest in self-hosted models.
HN only covers a very small slice of interesting things that happen in tech every day. If it's your only source of tech news and information, you are missing out on a LOT.
There are plenty of self-hosted models being released all the time, they just don't make it to HN. For that, you need to find a community that is passionate about testing and tinkering with self hosted models. A very popular one is "/r/localllama" on Reddit, but there are a few others scattered around.
Could you recommend other sites? I use HN exclusively but would be keen on decent tech news sites without having to sift through the sludge of Google.
The Register, Slashdot, and Hackaday are the ones I know of.
Ollama has changed. Early versions were raw, and then they were optimized (I’m on a laptop with 64GB RAM), and then they fell to shit. Optimized for someone else’s home rig I suppose.
And my old favorite models broke so I have to link different versions. nous-hermes2-mixtral I miss your sage banter.
Now everything runs with excessive lag.
Investors need everyone to avoid self-hosted models and pay premium subscriptions for large centralized models, else they will never earn the profits they want. Self-hosted models spoil their revenue forecasts.
One thing that happened was the providers got better at hosting smaller and cheaper models. So you could self host or just get your work done with GPT 5 nano.
there are tons of models released still. even some non-Qwen ones!
I have no idea
There are a lot of local models being released every week. You really need to log into /r/localllama to stay up to date.
They're still going. I just bought a 5090 for myself this Christmas to do more interesting things.
I mostly use them for game assets.
Trellis2 is very cool. I've managed to put together an SDXL -> Trellis -> UniRig pipeline to generate 3D characters with Mixamo skeletons, and it's working pretty well.
On the LLM front, DeepSeek and Qwen are still cranking away. Qwen3 a22b instruct, IMHO, does a better job than Gemini in some cases with OCR and translation of handwritten documents.
The problem with these frontier open weight models is that running them locally is not exactly tenable. You either have to get a cloud GPU instance, or go through a provider.
- https://github.com/microsoft/TRELLIS.2
- https://github.com/VAST-AI-Research/UniRig
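To put rough numbers on the "not exactly tenable" point about running frontier open-weight models locally, here is a back-of-the-envelope sketch in Python. It assumes the "Qwen3 a22b" mentioned above is the mixture-of-experts release with about 235B total parameters (22B active per token), and it compares weight memory against the 32 GB of VRAM on a 5090. The function name is made up for illustration, and the estimate covers weights only; KV cache and runtime overhead come on top.

```python
def weight_memory_gb(total_params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params (in billions) * bits per weight / 8 bits per byte = gigabytes
    return total_params_billion * bits_per_weight / 8


GPU_VRAM_GB = 32        # RTX 5090
TOTAL_PARAMS_B = 235    # assumption: "a22b" refers to the 235B-total MoE model

for bits in (16, 8, 4):
    need = weight_memory_gb(TOTAL_PARAMS_B, bits)
    fits = need <= GPU_VRAM_GB
    print(f"{bits}-bit weights: ~{need:.1f} GB, fits in {GPU_VRAM_GB} GB VRAM: {fits}")
```

Even though only about 22B parameters are active per token, all experts generally have to be resident somewhere, so under these assumptions a single consumer card falls well short even at 4-bit quantization. That is why people end up on a cloud GPU instance, a provider, or heavy CPU/RAM offloading.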