Calculating Percentage-Based Confidence from Similarities of Embedding Models

(sefiks.com)

1 point | by serengil 2 days ago

1 comment

serengil 2 days ago
We usually make hard classifications with embedding models by thresholding cosine or Euclidean distances. But this has a big limitation: a threshold gives no measure of confidence and no interpretability.
Instead, fitting a logistic regression model on the distances turns them into percentage-based confidence scores. The fitted model also captures how a small decrease in distance changes the confidence score, similar to how a derivative measures sensitivity.
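
To make the idea concrete, here is a minimal sketch in plain Python. It is not the code from the linked post: the distances and labels are made-up toy values, and the helper names (`fit_logistic`, `confidence`, `sensitivity`) are hypothetical. It fits a one-feature logistic regression with gradient descent, maps a distance to a percentage, and uses the derivative of the sigmoid to show how much the score moves per unit of distance.

```python
import math

# Toy data (assumed for illustration): distances between embedding pairs,
# labeled 1 for "same identity" and 0 for "different identity".
DISTANCES = [0.10, 0.15, 0.22, 0.30, 0.35, 0.45, 0.50, 0.58, 0.66, 0.75]
LABELS = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(distances, labels, lr=0.5, epochs=5000):
    """Fit p(same | d) = sigmoid(w * d + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(distances)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for d, y in zip(distances, labels):
            err = sigmoid(w * d + b) - y  # gradient of log loss w.r.t. the logit
            grad_w += err * d
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def confidence(distance, w, b):
    """Percentage confidence that the pair is a match."""
    return 100.0 * sigmoid(w * distance + b)


def sensitivity(distance, w, b):
    """Derivative of the confidence with respect to distance."""
    p = sigmoid(w * distance + b)
    return 100.0 * w * p * (1.0 - p)


w, b = fit_logistic(DISTANCES, LABELS)

for d in (0.10, 0.40, 0.75):
    print(f"distance={d:.2f}  confidence={confidence(d, w, b):.1f}%  "
          f"d(confidence)/d(distance)={sensitivity(d, w, b):.1f}")
```

On this toy data the fitted weight `w` comes out negative, so confidence rises as distance shrinks, and the sensitivity is largest in magnitude near the decision boundary, where a small change in distance matters most. In practice you would fit on real labeled pairs and could use a library implementation of logistic regression instead of the hand-rolled loop.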