I don't even understand what discipline we're talking about here. Can someone provide some background please?
The step that lets an LLM select the next token is probabilistic. This proposes a deterministic procedure instead.
Problem is, we sometimes want LLMs to be probabilistic. We want to be able to try again if the first answer was deemed unsuccessful.
Ah, LLMs. I should have guessed.
> Quenching is higher-frequency pressure application that amplifies contradictions and internal inconsistencies.
> At each step, stress increments are computed from measurable terms such as alignment and proximity to a verified substrate.
Well, obviously it's ... uh, ...
It may not be, but the whole description reads as category error satire to me.
Not satire, though I get why the terminology looks odd. The language comes from materials science because the math is the same: deterministic state updates with hard thresholds. In most AI systems, exclusion relies on probabilistic sampling (temperature, top-k, nucleus), which means you can’t replay decisions exactly. This explores whether exclusion can be implemented as a deterministic state machine instead—same input, same output, verifiable by hash.
“Mechanical” is literal here: like a beam fracturing when stress exceeds a yield point (σ > σᵧ), candidates fracture when accumulated constraint pressure crosses a threshold. No randomness, no ranking. If that framing is wrong, the easiest way to test it is to run the code or the HF Space and see whether identical parameters actually do produce identical hashes.
What do you mean by "exclusion"?
Here “exclusion” just means a deterministic reject / abstain decision applied after a model has already produced candidates. Nothing is generated, ranked, or sampled here. Given a fixed set of candidate outputs and a fixed set of verified constraints, the mechanism decides which candidates are admissible and which are not, in a way that is replayable and binary. A candidate is either allowed to pass through unchanged, or it is excluded from consideration because it violates constraints beyond a fixed tolerance.
In practical terms: think of it as a circuit breaker, not a judge. The model speaks freely upstream; downstream, this mechanism checks whether each output remains within a bounded distance of verified facts under a fixed rule. If it crosses the threshold, it’s excluded. If none survive, the system abstains instead of guessing. The point isn’t semantic authority or “truth,” it’s that the decision process itself is deterministic, inspectable, and identical every time you run it with the same inputs.
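A minimal sketch of that circuit-breaker behavior, assuming each candidate already carries a numeric violation score produced by some fixed rule. The names `ABSTAIN` and `gate` and the `tolerance` value are hypothetical and not taken from the project's code.

```python
from __future__ import annotations

from typing import Sequence

ABSTAIN = "ABSTAIN"  # hypothetical sentinel returned when nothing survives


def gate(scored: Sequence[tuple[str, float]], tolerance: float) -> list[str] | str:
    """Pass candidates through unchanged, in input order, or abstain.

    `scored` pairs each candidate with a precomputed violation score
    (an assumption for this sketch). Nothing is ranked or sampled.
    """
    survivors = [text for text, violation in scored if violation <= tolerance]
    return survivors if survivors else ABSTAIN


print(gate([("The sky is blue", 0.0), ("The sky is green", 1.4)], tolerance=1.0))
print(gate([("The sky is green", 1.4)], tolerance=1.0))
```

The first call lets one candidate through untouched; the second has no survivors, so it returns the abstain sentinel rather than picking the least-bad option.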
You are going so deep with abstract terms that your text becomes a special shorthand you think is clear but is anything but clear.
Stop talking about “exclusion” and “pressure,” etc., and use direct words about what is happening in the model.
Otherwise, even your attempts at explaining what you have said need more explanation.
And as the sibling comment points out, start by stating what you are actually doing in concrete terms, not in “the math is the same, so I assume you can guess how it applies if you happen to know the same math and the same models” terms. That asks nearly everyone else to read your mind, not your text.
There is a tremendous difference between connections you see that help you understand and assuming others can somehow infer connections and knowledge they don’t already have. It is the difference between an explanation and incoherence.
You really, really need to be upfront in the first paragraph of your docs that you are talking about the inner workings of LLMs and other machine learning stuff.
Failing that, at least mention it here.
OK, this is AI slop ("fracture" alone gives it away). But maybe there's still something of value here? Can you explain it in actual human terms, give a real example, and explain what you did to test this and why I shouldn't flag this like I did https://news.ycombinator.com/item?id=46701114 ?
Verified facts:
“The sky is blue”
“Water is wet”
Candidate outputs:
“The sky is blue”
“The sky is green”
Each sentence is embedded deterministically (in the demo, via a hash-based mock embedder so results are reproducible). For each candidate, I compute:
similarity to the closest verified fact
distance from that fact
a penalty function based on those values
Penalty accumulates over a fixed number of steps. If it exceeds a fixed threshold, the candidate is rejected. In this example, “The sky is blue” stays below the threshold; “The sky is green” crosses it and is excluded.
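The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the word-level SHA-256 embedder, the penalty formula `distance * (1 - similarity)`, and the values `STEPS = 8` and `THRESHOLD = 1.0` are all made up for the example.

```python
import hashlib
import math
from collections import Counter

FACTS = ["The sky is blue", "Water is wet"]
STEPS = 8        # hypothetical number of accumulation steps
THRESHOLD = 1.0  # hypothetical rejection threshold


def embed(text: str) -> Counter:
    """Mock embedder: a sparse bag of words keyed by each word's SHA-256 digest."""
    words = text.lower().split()
    return Counter(hashlib.sha256(w.encode("utf-8")).hexdigest() for w in words)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def accumulated_penalty(candidate: str) -> float:
    vec = embed(candidate)
    # Similarity to the closest verified fact.
    similarity = max(cosine(vec, embed(fact)) for fact in FACTS)
    # Euclidean distance between unit vectors with that cosine similarity.
    distance = math.sqrt(max(0.0, 2.0 - 2.0 * similarity))
    step_penalty = distance * (1.0 - similarity)  # assumed penalty function
    total = 0.0
    for _ in range(STEPS):  # fixed number of steps, no randomness
        total += step_penalty
    return total


def is_rejected(candidate: str) -> bool:
    return accumulated_penalty(candidate) > THRESHOLD


for text in ["The sky is blue", "The sky is green"]:
    print(text, "->", "rejected" if is_rejected(text) else "accepted")
```

With this mock embedder, “The sky is green” shares three of its four words with the closest fact, so its similarity is 0.75 and its penalty grows past the threshold over eight steps, while the exact match accumulates no penalty at all. Note that the similarity here is pure word overlap, so the sketch says nothing about meaning.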
What I tested:
Identical inputs + identical config always produce identical outputs (verified by hashing a canonical JSON of inputs + outputs).
Re-running the same scenario repeatedly produces the same decision and the same hash.
Changing a single parameter (distance, threshold, steps) predictably changes the outcome.
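The hash check can be sketched like this, assuming “canonical JSON” means sorted keys and fixed separators. The function name `replay_hash` and the record layout are hypothetical; the real project may canonicalize differently.

```python
import hashlib
import json


def replay_hash(inputs: dict, config: dict, outputs: dict) -> str:
    """SHA-256 over a canonical JSON encoding of one run."""
    record = {"inputs": inputs, "config": config, "outputs": outputs}
    # Sorted keys and fixed separators make the byte string independent
    # of dict insertion order and whitespace.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


inputs = {
    "facts": ["The sky is blue", "Water is wet"],
    "candidates": ["The sky is blue", "The sky is green"],
}
outputs = {"accepted": ["The sky is blue"], "rejected": ["The sky is green"]}

first = replay_hash(inputs, {"steps": 8, "threshold": 1.0}, outputs)
second = replay_hash(inputs, {"threshold": 1.0, "steps": 8}, outputs)  # same config, different key order
changed = replay_hash(inputs, {"steps": 8, "threshold": 2.0}, outputs)

print(first == second)   # True
print(first == changed)  # False
```

Equal hashes only show that two runs recorded the same inputs, config, and outputs; they say nothing about whether the decision was a good one.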
Why this isn’t “AI slop”:
There’s no generative model here at all.
The terminology is unfortunate but the code is explicit arithmetic.
The entire point is removing non-determinism, not adding hand-wavy intelligence.
If you think the framing obscures that rather than clarifies it, that’s useful feedback—I’m actively dialing the language back. But the underlying claim is narrow: you can build governance filters that are deterministic, replayable, and auditable, which most current AI pipelines are not.
If that’s still uninteresting, fair enough—but it’s not trying to be mystical or persuasive, just mechanically verifiable.
You can test it here if you like: https://huggingface.co/spaces/RumleyRum/Deterministic-Govern...
I don't get it. Embeddings don't prioritize, or even necessarily encode, truth value as a dimension. And even if they did, if you simply accept based on some hyperparameter of distance, it sounds like this procedure just leaves you vulnerable to problems like salami-slicing where you reach 'the sky is green' (which after all, it is sometimes) by multiple steps just below the tolerance.
That’s a fair critique, but it slightly misidentifies what’s being claimed. The system does not assume embeddings encode truth, nor does it attempt to extract truth from latent space. It measures proximity to a substrate that has already been declared authoritative. In that sense it’s a conditional gate, not a semantic oracle. If the substrate is wrong, incomplete, or absurd, the mechanism will enforce that wrongness consistently. That is not a failure mode; it is the boundary of responsibility. The engine is not discovering truth, it is enforcing consistency relative to an explicit reference set.
On salami-slicing toward a contradiction: that concern applies to memoryless, single-pass filters. This mechanism is explicitly stateful. Deviations accumulate stress over time and do not reset, so a sequence of “almost acceptable” steps still fractures under sustained pressure. You cannot asymptotically walk toward a contradiction unless the configuration allows it, in which case that permissiveness is deliberate and inspectable. The trade being made here is not correctness for convenience, but opacity for causality. Instead of stochastic acceptance that can’t be replayed or audited, you get a deterministic enforcement layer whose failure modes live upstream in substrate and configuration choices, where they can be examined rather than guessed at.
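To make the stateful claim concrete, here is a hedged sketch contrasting a memoryless per-step check with an accumulator that never resets. The class `StatefulGate`, the helper `memoryless_ok`, and all numbers are hypothetical; the thread does not show how the real implementation tracks state across a sequence.

```python
from dataclasses import dataclass


@dataclass
class StatefulGate:
    """Accumulates deviation across steps and never resets (hypothetical sketch)."""

    threshold: float
    stress: float = 0.0
    fractured: bool = False

    def observe(self, deviation: float) -> bool:
        """Record one step; return True while the sequence is still admissible."""
        self.stress += max(0.0, deviation)
        if self.stress > self.threshold:
            self.fractured = True  # latched: later small steps cannot undo it
        return not self.fractured


def memoryless_ok(deviation: float, per_step_tolerance: float) -> bool:
    """Single-pass check that looks at one step in isolation."""
    return deviation <= per_step_tolerance


steps = [0.25] * 6  # each step is individually "almost acceptable"
gate = StatefulGate(threshold=1.0)

print([memoryless_ok(d, 0.3) for d in steps])  # [True, True, True, True, True, True]
print([gate.observe(d) for d in steps])        # [True, True, True, True, False, False]
```

The memoryless check passes every step, while the accumulator trips on the fifth one and stays tripped. The sketch assumes each deviation is measured against the fixed reference set rather than against the previous step; if it were measured step to step, the accumulated total would track path length rather than drift from the reference.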