3 comments

  • spgorbatiuk 5 hours ago

    Not sure if I got the question right, but there are benchmarks like SWE pro. Whether you can trust them, and whether the labs are training on those benchmarks, is a whole other debate, but that's one way to measure it.

    Other than benchmarks, I'd say it's your own test suite.

    • sama004 33 minutes ago

      I would never trust benchmarks, tbh. Most of the new model releases do benchmaxxing.

  • verdverm 9 hours ago

    Why would a metric for code quality be different depending on how the code got into a file? In other words, if there were a good measure, would it not already exist for us? How do we measure the quality of our own code?