Some context on why we built this: you might have seen the post earlier this week about someone building a file recovery tool after Claude Code rm -rf'd their Obsidian vault through a symlink. We had similar near-misses running our own agent swarm: agents curling cloud metadata endpoints, attempting path traversal, and executing destructive commands during "cleanup" steps. We kept adding one-off guards and eventually realized this should be a proper library.
The main design choice was making it deterministic rather than using an LLM to review tool calls. An LLM guarding another LLM felt like asking the fox to guard the henhouse. Pattern matching is boring, but it's fast, predictable, and works offline.
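To illustrate what "deterministic" means here, a guard in this style can be as simple as a list of compiled regexes run over each tool call's command string before execution. This is a minimal sketch with made-up rule names, not the library's actual rule set or API:

```python
import re

# Hypothetical rules for illustration: each is a name plus a compiled
# regex matched against the raw command string of a proposed tool call.
RULES = [
    ("destructive-delete", re.compile(r"\brm\s+-[a-z]*r[a-z]*f")),
    ("cloud-metadata", re.compile(r"169\.254\.169\.254|metadata\.google\.internal")),
    ("path-traversal", re.compile(r"\.\./")),
]

def check(command: str) -> list[str]:
    """Return the names of every rule the command trips.

    Same input always yields the same output: no model call, no network,
    no nondeterminism, so it runs offline in microseconds.
    """
    return [name for name, pattern in RULES if pattern.search(command)]

print(check("rm -rf ~/vault"))                              # trips destructive-delete
print(check("curl http://169.254.169.254/latest/meta-data/"))  # trips cloud-metadata
print(check("ls docs"))                                     # trips nothing
```

The trade-off is exactly the one named above: regexes can't reason about intent, so the rule set needs ongoing calibration, but every decision is reproducible and auditable.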
Happy to hear about false positives, missing threat categories, or use cases where the rule set is too aggressive. That's the main thing we want to calibrate for v0.2.
Author here, happy to answer any questions.