Author here. Got annoyed by Python startup times, frameworks, orchestration layers, and the general state of AI tooling, so we write everything in C/C++. Toast overhead is ~20ms per invocation, which is what makes the loop practical; toastd does HTTPS connection pooling. With Cerebras it can run at ~2000 tok/s. Local toastd gets ~100 tok/s with a 0.6s time-to-first-token. Happy to answer questions.