2 comments

  • nopinsight 12 hours ago

    The paper introduces AGI-Benchmark 1.0.

    "AGI-Benchmark 1.0 is designed to assess a model’s ability to tackle intricate, multi-step reasoning problems across a diverse set of domains."

    See pp. 13-14 for the list of tasks in 27 categories. It's diverse indeed.

  • AIFounder 11 hours ago

    [dead]