Zork-bench: An LLM reasoning eval based on text adventure games

(lowimpactfruit.com)

2 points | by nicholasjbs 2 hours ago

No comments yet.