Why SWE-bench Verified no longer measures frontier coding capabilities
(openai.com)
2 points | by
gmays
9 hours ago
1 comment
agentica_ai
9 hours ago
[dead]