Raspberry Pi's New AI Hat Adds 8GB of RAM for Local LLMs

(jeffgeerling.com)

195 points | by ingve 8 hours ago ago

32 comments

  • buran77 7 hours ago ago

    I think Raspberry Pi has lost the magic of the older Pis, that sense of purpose. They basically created a niche with the first Pis; now they're just jumping into segments that others created and that are already filled to the brim with perhaps even more qualified competition.

    Are they seeing a worthwhile niche for the tinkerers (or businesses?) who want to run local LLMs with middling performance but still need a full set of GPIOs in a small package? Maybe. But maybe this is just Raspberry jumping on the bandwagon.

    I don't blame them for looking to expand into new segments; the business needs to survive. But these efforts just look a bit aimless to me. I "blame" them for not having another "Raspberry Pi moment".

    P.S. I can maybe see Frigate and similar solutions driving the adoption for these, like they boosted Coral TPU sales. Not sure if that's enough of a push to make it successful. The hat just doesn't have any of the unique value proposition that kickstarted the Raspberry wave.

  • dwedge 7 hours ago ago

    > In practice, it's not as amazing as it sounds.

    8GB RAM for AI on a Pi sounds underwhelming even from the headline

  • t43562 6 hours ago ago

    In the UK I've never seen the Hailo HATs (which are quite old, BTW) advertised for LLMs. The use case presented has been real-time object detection from lots of video cameras.

    They seem very fast, and I certainly want to use that kind of thing in my house and garden: spotting when foxes and cats arrive and dig up my compost pit, or whether people come over to water the plants while I'm away, etc.

    [edit: I've just seen the updated version on Pimoroni, and it does claim usefulness for LLMs, but also for VLMs, and I suspect that's the best way to use it.]

  • djhworld 2 hours ago ago

    A few years ago this product would have just been called an ML Accelerator and marketed as helping accelerate ML workloads like object detection in images

    Hitching their wagon to the AI train comes with different expectations, leading to a mixed bag of reviews like this.

  • cmpxchg8b 7 hours ago ago

    8GB? What is this, an LLM for ants?

  • Barathkanna 6 hours ago ago

    As an edge computing enthusiast, this feels like a meaningful leap for the Raspberry Pi ecosystem. Having a low-power inference accelerator baked into the platform opens up a lot of practical local AI use cases without dragging in the cloud. It’s still early, but this is the right direction for real edge workloads.

  • agent013 7 hours ago ago

    A good illustration of how “can run LLM” ≠ “makes sense to run LLM”. A prime example of how numbers in specs don’t translate into real UX.

  • syntaxing 2 hours ago ago

    Interesting idea. I think the Jetson Orin Nano is a better purchase for this application. The main downside is that the RAM is shared, so you lose about 1GB to OS overhead.

  • endymion-light 6 hours ago ago

    can't wait to not be able to buy it, and also for it to be more expensive than a mini-computer

    I buy a Raspberry Pi because I need a small workhorse. I understand adding RAM for local LLMs, but it would be like a Raspberry Pi with a GPU: why do I need it when a normal mini machine will have more RAM, more compute capacity, and better specs for cheaper?

  • speedgoose 7 hours ago ago

    Is there any usefulness with the small large language models, outside perhaps embeddings and learning?

    I fail to see the use case on a Pi. For learning you can get access to much better hardware for cheaper. Perhaps you could use it as a slow and expensive embedding machine, but why?

  • yjftsjthsd-h an hour ago ago

    Any chance you can split layers so that some run on the CPU and some run on this board to let you run bigger models and/or get better performance?
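    Splitting by layer is how llama.cpp-style CPU/GPU offload works, but the Hailo 10H runs pre-compiled models, so this would need explicit support from Hailo's toolchain. If it were possible, the interesting question is where to cut. A toy sketch of picking a split point that balances the two stages (all timings here are made-up placeholders, not measured Pi/Hailo numbers):

```python
# Illustrative only: choose a CPU/accelerator layer split that balances
# per-token time across the two pipeline stages. Timings are invented.

def best_split(cpu_ms_per_layer, accel_ms_per_layer, n_layers):
    """Return (k, bottleneck_ms): run layers [0, k) on the accelerator
    and [k, n) on the CPU, minimizing the slower stage's time."""
    best = (0, float("inf"))
    for k in range(n_layers + 1):
        accel_time = k * accel_ms_per_layer
        cpu_time = (n_layers - k) * cpu_ms_per_layer
        bottleneck = max(accel_time, cpu_time)
        if bottleneck < best[1]:
            best = (k, bottleneck)
    return best

# With a CPU 3x slower per layer, the balance point lands at 24/32 layers
# on the accelerator, giving a 96 ms bottleneck per token.
print(best_split(cpu_ms_per_layer=12.0, accel_ms_per_layer=4.0, n_layers=32))
```

    In practice the transfer cost of activations crossing the PCIe boundary each token would eat into any win, which is part of why nobody ships this by default.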

  • joelthelion 5 hours ago ago

    8GB is really low.

    That said, perhaps there is a niche for slow LLM inference for non-interactive use.

    For example, if you use LLMs to triage your emails in the background, you don't care about latency. You just need the throughput to be high enough to handle the load.
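    As a back-of-the-envelope check of that idea (the token count per email and the tokens/sec figure are assumptions, not measured Hailo 10H numbers):

```python
# Rough capacity check for background (non-interactive) LLM triage.
# Assumes ~300 tokens processed per email at ~5 tokens/sec; both are
# placeholder figures for "slow local hardware".

def emails_per_day(tokens_per_email, tokens_per_sec):
    seconds_per_email = tokens_per_email / tokens_per_sec
    return int(86_400 // seconds_per_email)  # seconds in a day / time per email

print(emails_per_day(300, 5))  # 60 s per email -> 1440 emails/day
```

    Even at painfully slow interactive speeds, that throughput covers most people's inboxes with room to spare.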

  • JustFinishedBSG 3 hours ago ago

    It's useless for LLMs and it's actually slower than the Hailo 8H for standard vision tasks, so why?

  • phito 7 hours ago ago

    Sounds like some PM just wanted to shove AI marketing where it doesn't make sense.

  • myrmidon 3 hours ago ago

    Are there significant usecases for the really small LLMs right now (<10b distills and such)?

    My impression so far was that the resulting models are unusably stupid, but maybe there are some specific tasks where they still perform acceptably?

  • nottorp 4 hours ago ago

    Hmm. Can this "AI" hardware - or any other "AI" hardware that isn't a GPU - be used for anything other than LLMs?

    YOLO for example.

  • Lio 5 hours ago ago

    I've seen the AI-8850 LLM Acceleration M.2 Module advertised as an alternative RPi accelerator (you need an M.2 HAT for it).

    That's also limited to 8GB RAM, so again you might be better off with a larger 16GB Pi and using the CPU, but at least the space is heating up.

    With a lot of this stuff it seems to come down to how good the software support is. Raspberry Pis generally beat everything else for that.

  • giantg2 2 hours ago ago

    Why not use a USB Coral TPU? It seems to do mostly the same stuff and is half the price.

  • xp84 an hour ago ago

    Gigabytes?? In THIS economy?

  • yawniek 2 hours ago ago

    I wonder how the Hailo 10H compares to the Axera AX8850. The add-on boards seem to be cheaper, and it's a full SoC that can also draw much more power.

  • wyldfire 5 hours ago ago

    I wonder -- how does this thing compare against the Rubik Pi [1]?

    [1] https://rubikpi.ai/

  • incomingpain 2 hours ago ago

    I wonder if this is the magic hardware for LiquidAI/LFM2.5-Audio-1.5B

    Don't need more than 8GB. It'll be enough power. It can do audio to audio.

  • renewiltord 7 hours ago ago

    What’s the current state of the art in low power wake word and speech to text? Has anyone written a blog post on this?

    I was able to run speech-to-text on my old Pixel 4, but it's a bit flaky (the background process loses the audio device occasionally). I just want to detect a wake word, then send everything to a remote LLM and get back text that I do TTS on.
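    The loop described above is roughly: local wake word, local STT, remote LLM, local TTS. A skeleton with every component stubbed out (real implementations might plug in openWakeWord, whisper.cpp, and Piper here, respectively; none of the stub bodies reflect real APIs):

```python
# Skeleton of a wake-word -> STT -> remote LLM -> TTS loop.
# All four components are trivial stand-ins for real libraries.

def wake_word_detected(audio_chunk):
    return b"hey" in audio_chunk          # stand-in for a real detector

def transcribe(audio_chunk):
    return audio_chunk.decode()           # stand-in for local STT

def ask_remote_llm(text):
    return f"echo: {text}"                # stand-in for a remote API call

def speak(text):
    print(text)                           # stand-in for local TTS

def handle(audio_chunk):
    """Process one audio chunk; returns the spoken reply, or None if idle."""
    if not wake_word_detected(audio_chunk):
        return None
    reply = ask_remote_llm(transcribe(audio_chunk))
    speak(reply)
    return reply

handle(b"hey what time is it")
```

    The fragile part in practice is exactly what the comment describes: keeping the audio capture process alive and re-acquiring the device when it drops.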

  • huntercaron 7 hours ago ago

    Glad Jeff was critical here; it seems they need a bit of a wake-up call.

  • esskay 6 hours ago ago

    What a pointless product to waste time making.

  • rballpug 6 hours ago ago

    Catalog TG 211, 1000 Hz.

  • teekert 6 hours ago ago

    At this moment my two questions for these things are:

    1. Can I run a local LLM that allows me to control Home Assistant with natural language? Some basic stuff like timers, to-do/shopping lists, etc. would be nice.

    2. Can I run object/person detection on local video streams?

    I want some AI stuff, but I want it local.

    Looks like the answer for this one is: Meh. It can do point 2, but it's not the best option.

  • imtringued 5 hours ago ago

    This looks pretty nice for what it is. However, the RAM is a bit oversized for the vast majority of applications that will run on this, which is giving a misleading impression of what it is useful for.

    I once tried to run a segmentation model based on a vision transformer on a PC; that model used somewhere around 1 GB for the parameters and several gigabytes for the KV cache, and it was almost entirely compute-bound. You couldn't run that type of model on previous AI accelerators, because they only supported model sizes in the megabytes range.
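    As a rough illustration of how the KV cache alone outgrows megabyte-scale accelerator memory (the model shape below is a generic mid-size transformer, not any specific model):

```python
# KV cache size grows linearly with both depth and context length,
# independent of the parameter count. Shapes below are illustrative.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # 2x for the separate key and value tensors; fp16 (2 bytes) by default
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# e.g. 24 layers, 16 KV heads of dim 64, 8k context, fp16:
gib = kv_cache_bytes(24, 16, 64, 8192) / 2**30
print(f"{gib:.2f} GiB")  # 0.75 GiB for the cache alone
```

    So a model with only ~1 GB of weights can still need several more gigabytes once you give it a long context, which is where gigabyte-scale accelerator RAM starts to matter.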

  • moffkalast 7 hours ago ago

    > The Pi's built-in CPU trounces the Hailo 10H.

    Case closed. And that's extremely slow to begin with; the Pi 5 only gets what, a 32-bit bus? Laughable performance for a purpose-built ASIC that costs more than the Pi itself.

    > In my testing, Hailo's hailo-rpi5-examples were not yet updated for this new HAT, and even if I specified the Hailo 10H manually, model files would not load

    Laughable levels of support too.

    As another data point, I've recently managed to get the 8L working natively on Ubuntu 24 with ROS, but only after significant shenanigans involving recompiling the kernel module and building their library for Python 3.12, which Hailo for some reason does not provide for anything besides 3.11. They only support Pi OS (like anyone would use that in prod), and even that is very spotty. Like, why would you not target the most popular robotics distro for an AI accelerator? Who else is gonna buy these things, exactly?

  • kotaKat 6 hours ago ago

    "For example, the Hailo 10H is advertised as being used for a Fujitsu demo of automatic shrink detection for a self-checkout."

    ... why though? CV in software is good enough for this application and we've already been doing it forever (see also: Everseen). Now we're just wasting silicon.

  • Havoc 4 hours ago ago

    That seems completely and utterly pointless.

    An NPU that adds to the price but underperforms the Pi's CPU?

    You can get SBCs with 32GB of RAM…

    Never mind the whole mini-PC ecosystem, which will crush this.

  • vander_elst 4 hours ago ago

    I had a couple of Pis that I wanted to use as a media center, and I always had some small issues that made for a suboptimal experience. I went for a regular second-hand amd64 box with a small form factor and never looked back: much better userspace support and, for my use case, a much smoother experience. No lag, no memory swapping, and if needed I can just buy a different memory bank or a different component. I have no plans to use a Raspberry Pi any time soon. I'm not sure these days whether they really still have a niche to fill and, if so, how large that niche is.