Odd of them to mention Krutrim, given that it's just open-source models hosted on India-based hardware plus a few of their own.
> Their models, such as Sarvam 2B and Sarvam-M, are fine-tuned for medical reasoning and symptom triage in local languages, without the need for high-end devices or constant internet. These systems can summarize patient notes, offer diagnostic guidance and even prioritize cases, functioning as low-cost, frugal AI assistants for overstretched healthcare workers.
Wow, bad idea. Domain-specific models simply don't work. Ever. You should not be using some shoddy 3B model for medical purposes when you can spend just a few dollars extra and get GPT, which is miles and miles better. The local-language value proposition is also exaggerated.
This article keeps repeating the lie that network coverage is hard to find in India and that local models therefore win. This is on its face ridiculous to anyone who has been to India. Almost everyone has access to a smartphone with a 4G connection. What they don't have is the ability to afford a phone that can run a good model. Why would I, as a poor farmer in India, use an extremely underpowered 3B model on my $100 smartphone when I can use the free version of ChatGPT, which is miles ahead in every dimension?
My $1,000 iPhone can barely run Gemma 4, which is hardly usable for serious questions anyway.
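For a rough sense of what "can run a good model" means in RAM terms, here is a back-of-envelope estimate for a 3B-parameter model at different quantization levels (the parameter count and the overhead factor are assumptions for illustration, not measured figures):

    # Approximate memory to hold a 3B-parameter model's weights at
    # different quantization levels. The 1.3x factor is a rough allowance
    # for KV cache, activations and runtime overhead (an assumption).
    params = 3e9  # ~3 billion parameters, e.g. a "3B" model

    for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
        weights_gb = params * bits / 8 / 1e9
        total_gb = weights_gb * 1.3
        print(f"{name}: ~{weights_gb:.1f} GB weights, ~{total_gb:.1f} GB with overhead")

Even at 4-bit that is roughly 2 GB before the OS and other apps take their share, which is most of the RAM on a typical entry-level handset; a flagship phone has the memory but still pays in heat and battery on sustained generation.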
I do get the need for the Indian ecosystem to build internal competency so that it is prepared when the time comes. But for now, pursuing a distillation-attack strategy like China's looks better. Or have companies that specialise in local integration - something the big model companies don't have expertise in.
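For context, "distillation" here means training a small student model to imitate a large teacher's output distribution rather than training it from scratch. A minimal sketch of the standard objective, with placeholder model names and hyperparameters rather than anyone's actual recipe:

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # Soften both distributions and push the student toward the teacher
        # with KL divergence; the temperature and the t*t scaling follow the
        # common Hinton-style formulation.
        t = temperature
        student_log_probs = F.log_softmax(student_logits / t, dim=-1)
        teacher_probs = F.softmax(teacher_logits / t, dim=-1)
        return F.kl_div(student_log_probs, teacher_probs,
                        reduction="batchmean") * (t * t)

    # Hypothetical training step: a frontier-scale teacher supervises a small
    # local student. In practice distillation is often done from the teacher's
    # generated text alone, since its logits aren't exposed over an API.
    # student_logits = small_model(batch)
    # with torch.no_grad():
    #     teacher_logits = big_model(batch)
    # loss = distillation_loss(student_logits, teacher_logits)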
Remember: Linux always believed that small tools hooked into one another make the most effective systems.
All these capitalist-funded AI models with bloated hardware requirements.
This is incorrect because LLMs don't have the same properties as other tools. LLMs enjoy a compounding effect of intelligence which you can't get by separating them. If what you said were right, we would see more open-weight domain models. We don't, and there's a reason why.
It's correct because I'm using relative sizing and you are using absolute sizing, and you're completely ignoring the "AI Bloat" belief that we just need larger and larger models.
Easily disprovable: Codex is a model made specifically for coding. It still knows about random movies and economics. Try it. Why does it know?
s/Linux/Unix/