DPO fine-tuning outperforms SFT

(openpipe.ai)

1 point | by kcorbitt 9 hours ago

No comments yet.