I built this on my own from Mexico. I taught myself, have no degree, and I’ve been creating things on the internet for about 10 years now.
The issue is that if you want to run ML models on a phone today, all you get are inference runtimes: TFLite, Core ML, and ONNX Runtime. Nobody handles the rest, such as downloading models, verifying their checksums, caching sessions, and releasing memory when the device comes under memory pressure. Developers end up rebuilding all of this from scratch. I kept hitting this wall, so I built the missing layer.
DUST includes five packages:
- dust-core: contracts and a service registry (ModelServer, ModelSession, VectorStore, EmbeddingService)
- dust-serve: downloads with SHA-256 verification, session caching with reference counting, LRU eviction, and accelerator probing
- dust-llm: llama.cpp on mobile, Metal on iOS, JNI on Android, streaming, and chat templates
- dust-onnx: ONNX Runtime on mobile
- dust-embeddings: on-device vector generation
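To make the dust-serve bullet concrete, here is a minimal sketch of session caching with reference counting plus LRU eviction. The class and method names are hypothetical illustrations of the idea, not DUST's actual API: a session stays cached while anything holds a reference, and only zero-reference sessions get evicted in least-recently-used order when the cache is over capacity.

```typescript
// Illustrative sketch (not DUST's real API): a reference-counted session
// cache with LRU eviction. A JS Map preserves insertion order, so
// re-inserting an entry on access keeps the most recently used entry last.
class SessionCache<T> {
  private entries = new Map<string, { session: T; refs: number }>();
  constructor(private maxEntries: number) {}

  // Acquire a session, loading it on a cache miss.
  acquire(key: string, load: () => T): T {
    let entry = this.entries.get(key);
    if (!entry) {
      entry = { session: load(), refs: 0 };
      this.entries.set(key, entry);
      this.evictIfNeeded();
    } else {
      // Refresh LRU order: delete and re-insert so this key moves to the end.
      this.entries.delete(key);
      this.entries.set(key, entry);
    }
    entry.refs += 1;
    return entry.session;
  }

  // Drop one reference; sessions with live references are never evicted.
  release(key: string): void {
    const entry = this.entries.get(key);
    if (entry && entry.refs > 0) entry.refs -= 1;
  }

  // Evict least-recently-used, zero-reference entries while over capacity.
  private evictIfNeeded(): void {
    for (const [key, entry] of this.entries) {
      if (this.entries.size <= this.maxEntries) break;
      if (entry.refs === 0) this.entries.delete(key);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}

// Usage: with capacity 2, loading a third model evicts the released one.
const cache = new SessionCache<string>(2);
cache.acquire("model-a", () => "session-a");
cache.release("model-a");              // refs drop to 0, eligible for eviction
cache.acquire("model-b", () => "session-b"); // held, refs = 1
cache.acquire("model-c", () => "session-c"); // over capacity: "model-a" evicted
```

The point of the reference count is that eviction never pulls a session out from under an in-flight inference call, while LRU order decides which idle session goes first.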
All packages are built with native Swift and Kotlin. There are also Capacitor wrappers for web-based mobile apps. You can use whichever layer you need.
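The download-verification step mentioned above amounts to hashing the fetched artifact and comparing it to a known digest before loading. A minimal sketch, using Node's built-in crypto module; the function names are mine, not DUST's:

```typescript
// Sketch of download integrity checking (illustrative, not DUST's real API):
// compute the SHA-256 digest of a downloaded model blob and reject it if the
// digest does not match the value published in the model manifest.
import { createHash } from "node:crypto";

function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

function verifyModel(data: Uint8Array, expectedHex: string): void {
  const actual = sha256Hex(data);
  if (actual !== expectedHex) {
    // A mismatch means a corrupted or tampered download; never load it.
    throw new Error(`checksum mismatch: expected ${expectedHex}, got ${actual}`);
  }
}
```

On-device, the same idea applies with CryptoKit on iOS and java.security.MessageDigest on Android; the contract is simply "hash before load."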
I’ve been building an on-device AI agent ecosystem separately (capacitor-mobile-claw), which reached 7,200 npm installs in two weeks without any promotion. Right now it relies on cloud APIs. Once on-device models improve enough, DUST will make the switch possible.
All commits are co-authored by Claude—check the git history. The architecture and design choices come from about 10 years of experience in backend development, Kubernetes, and embedded systems.
I’m happy to answer any questions about the architecture.