AI agent with 2 deps that uses Shannon Entropy to decide when to act vs. ask

(github.com)

2 points | by borhensaidi 11 hours ago

2 comments

borhensaidi 11 hours ago
I got frustrated with LangChain being impossible to audit (500+ transitive dependencies, 100K+ LOC), so I built picoagent — an AI agent framework with only numpy and websockets as external dependencies.
The interesting technical decision: instead of prompting the LLM to pick a tool, I use Shannon Entropy (H(X) = -Σp·log₂(p)) on the softmax score distribution over available tools. If entropy is above 1.5 bits, the agent asks for clarification instead of guessing. In my tests this cuts false positive tool calls by 40-60%.
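For readers who want to see the shape of such a gate, here is a minimal stdlib-only sketch. It is not the picoagent code: the function names, the return format, and the assumption that raw per-tool scores arrive as a plain list are all made up for illustration. Only the formula and the 1.5-bit threshold come from the description above.

```python
import math

def softmax(scores):
    """Turn raw tool scores into probabilities (max subtracted for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def shannon_entropy_bits(probs):
    """H(X) = -sum(p * log2(p)); zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def entropy_gate(scores, threshold_bits=1.5):
    """Hypothetical gate: returns ("ask", None, H) or ("act", tool_index, H)."""
    probs = softmax(scores)
    h = shannon_entropy_bits(probs)
    if h > threshold_bits:
        # Distribution is too flat: ask the user instead of guessing.
        return ("ask", None, h)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ("act", best, h)

# Four equally scored tools give 2 bits of entropy, so the gate asks.
decision_flat = entropy_gate([0.0, 0.0, 0.0, 0.0])
# One dominant score gives near-zero entropy, so the gate acts on tool 0.
decision_peaked = entropy_gate([10.0, 0.0, 0.0, 0.0])
```

One thing the sketch makes visible: entropy over n tools is capped at log₂(n), so with two candidate tools it never exceeds 1 bit and a fixed 1.5-bit threshold cannot fire, and with three tools the cap is about 1.585 bits.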
The threshold adapts over time using a simple online learning system that tracks success/failure rates per session — no external data sent anywhere.
Other things that might be interesting to HN:
- Zero-trust sandbox with 18+ regex deny patterns blocking rm -rf, fork bombs, sudo, reverse shells, path traversal
- Dual-layer memory: numpy .npz vector embeddings + LLM consolidation to MEMORY.md (no Pinecone, no vector DB)
- The entire entropy gate is 64 lines of readable Python
- 5 chat channels (Telegram, Discord, Slack, WhatsApp, Email) with unified memory
- MCP-native (Model Context Protocol) stdio server
- Hot-reloadable Markdown skills via SIGHUP
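To make the regex deny-list idea concrete, here is a rough sketch of how such a check might look. The four patterns and the `is_blocked` helper are illustrative assumptions, not the 18+ patterns in the repo, and a deny list like this is easy to bypass with quoting or encoding tricks, so it should be read as a first filter rather than a full sandbox.

```python
import re

# Hypothetical deny patterns, written for illustration only.
DENY_PATTERNS = [
    # rm with both -r and -f in one flag group, in either order (rm -rf, rm -fr)
    re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r"),
    # classic bash fork bomb  :(){ :|:& };:
    re.compile(r":\(\)\s*\{.*\};\s*:"),
    # privilege escalation via sudo
    re.compile(r"\bsudo\b"),
    # parent-directory path traversal
    re.compile(r"\.\./"),
]

def is_blocked(command):
    """Return True if any deny pattern matches anywhere in the command."""
    return any(p.search(command) for p in DENY_PATTERNS)

blocked = [is_blocked(c) for c in (
    "rm -rf /",
    ":(){ :|:& };:",
    "sudo apt install nmap",
    "cat ../../etc/passwd",
)]
allowed = [is_blocked(c) for c in ("ls -la", "echo hello")]
```

The check runs on the raw command string before execution, so anything that only becomes dangerous after shell expansion would slip past it.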
It's early and rough. I'm looking for feedback on:
- Is 1.5 bits the right entropy threshold or should it be dynamic from day one?
- What dangerous shell patterns am I missing in the sandbox?
- Is the dual-memory approach (vector + markdown consolidation) worth the complexity?
GitHub: https://github.com/borhen68/picoagents
Happy to answer questions about any of the implementation decisions.
guerython 11 hours ago
Solid move on the entropy gate. We log the softmax H for every tool call and keep a tiny EMA+stddev per tool (`H_new=(1-α)H_old+αH_now`). The gate then lets calls through only when `H < max(base, mean+2σ)` and resets the mean when we see two consecutive confirmed failures, so the threshold drifts with the workload instead of hardcoding 1.5 bits.
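A rough sketch of that per-tool tracker, for anyone who wants to try it. The class and method names are invented here, the variance update is the standard exponentially weighted one (an assumption about what "EMA+stddev" means), and "reset" is interpreted as dropping back to the base threshold until new samples arrive.

```python
import math

class ToolEntropyTracker:
    """Hypothetical per-tool tracker: EMA of entropy plus an EW variance."""

    def __init__(self, base=1.5, alpha=0.1):
        self.base = base        # floor for the threshold, in bits
        self.alpha = alpha      # EMA smoothing factor
        self.mean = None        # no samples seen yet
        self.var = 0.0
        self.failures = 0       # consecutive confirmed failures

    def threshold(self):
        if self.mean is None:
            return self.base
        return max(self.base, self.mean + 2 * math.sqrt(self.var))

    def allow(self, h):
        """Let the call through only when H < max(base, mean + 2*sigma)."""
        return h < self.threshold()

    def observe(self, h):
        if self.mean is None:
            self.mean, self.var = h, 0.0
            return
        delta = h - self.mean
        # Same as H_new = (1 - alpha) * H_old + alpha * H_now
        self.mean += self.alpha * delta
        # Exponentially weighted variance update
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)

    def record_outcome(self, success):
        if success:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= 2:
            # Assumed reset behavior: forget the statistics entirely.
            self.mean, self.var, self.failures = None, 0.0, 0

tracker = ToolEntropyTracker(base=1.5, alpha=0.5)
tracker.observe(2.0)
tracker.observe(1.0)
drifted = tracker.threshold()   # mean 1.5, sigma 0.5
tracker.record_outcome(False)
tracker.record_outcome(False)
after_reset = tracker.threshold()
```

One tracker instance per tool keeps noisy tools from loosening the gate for well-behaved ones.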
On the sandbox side, we blocked not just `rm -rf`/fork bombs but also `os.execve('/proc/self/exe')`, `chmod`/`chown` on symlinks under `/tmp`, and we intercept raw `socket/connect` via ptrace so no new outbound channels spawn even if a regex slips. These traps stopped most of the pivoting tricks we saw in the first week.