2 comments

  • yubainu 2 hours ago

    I’ve always been skeptical of the current mainstream approach to hallucination detection—using a larger, more expensive LLM to "fact-check" a smaller one after the fact. To me, this felt like an inefficient recursive loop that doesn't solve the root cause.

    When a human lies, the truth often reveals itself not in their words, but in their "tells"—a subtle change in facial expression or a shift in tone. I theorized that LLMs might exhibit similar "neural tells." When a model starts to hallucinate, there should be a detectable anomaly in the Hidden State Dynamics before the token is even sampled.

    This led me to develop the Sibainu Engine.

    My goal was to build a pre-emptive auditing layer that runs on consumer-grade hardware (RTX 3050 4GB). By monitoring the geometric stability (which I call "Layer Dissonance") between transformer layers in real-time, the engine identifies the "collapse of latent trajectory" with a core latency of less than 1ms.
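    To make the idea concrete, here's a simplified sketch of the kind of check the engine performs (the function names and the z-score threshold are illustrative, not the production code): compute a per-transition "dissonance" between consecutive layers' hidden states for the current token, then flag the token when any transition deviates sharply from the rest.

```python
import numpy as np

def layer_dissonance(hidden_states):
    """Cosine distance between consecutive layers' hidden states
    for one token. hidden_states: array of shape (num_layers, hidden_dim)."""
    h = np.asarray(hidden_states, dtype=np.float64)
    a, b = h[:-1], h[1:]
    cos = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-12
    )
    return 1.0 - cos  # shape (num_layers - 1,): one value per layer transition

def is_anomalous(hidden_states, z_thresh=3.0):
    """Flag the token if any layer transition is a strong outlier
    relative to the other transitions (threshold is illustrative)."""
    d = layer_dissonance(hidden_states)
    z = (d - d.mean()) / (d.std() + 1e-12)
    return bool(np.any(np.abs(z) > z_thresh))
```

    Everything here is a single vectorized pass over a (num_layers, hidden_dim) array, which is why the core can stay well under a millisecond even on a small GPU or CPU.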

    Key Technical Highlights:

    Accuracy: ROC-AUC > 0.90 across Gemma, Llama-3.2, and Mistral, with no additional training or fine-tuning.

    Low Overhead: While the Python API adds some serialization delay, the vectorized NumPy core is fast enough to be integrated directly into any inference pipeline without bottlenecking generation.

    Autonomous Recovery: I've included a demo where the engine aborts a "corrupted" session and triggers a deterministic re-generation the moment an anomaly is detected in the hidden states.
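    Conceptually, the recovery loop is simple. Here's a stripped-down sketch (generate_token and monitor are placeholder hooks standing in for the real engine interfaces, not its actual API):

```python
def generate_with_recovery(generate_token, monitor, prompt, max_retries=2):
    """Abort and deterministically re-generate when the monitor flags
    an anomaly. generate_token(prompt, seed) yields (token, hidden_states)
    pairs; monitor(hidden_states) returns True on an anomaly. Both are
    placeholders for the real engine hooks."""
    tokens = []
    for attempt in range(max_retries + 1):
        tokens = []
        corrupted = False
        # A fixed seed per attempt keeps each re-generation deterministic.
        for token, hidden_states in generate_token(prompt, seed=attempt):
            if monitor(hidden_states):  # anomaly detected mid-generation
                corrupted = True
                break                   # abort the corrupted session early
            tokens.append(token)
        if not corrupted:
            return tokens
    return tokens  # best effort after exhausting retries
```

    Because the monitor runs before the token is committed, the abort happens at the first anomalous step rather than after a full bad completion has been emitted.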

    I believe that for LLM safety to be truly scalable, it needs to be lightweight and deterministic. I’m curious to hear your thoughts on this geometric approach and its potential generalizability to larger architectures.

  • yubainu an hour ago

    [dead]