20 points | by TheMrZZ 2 hours ago

6 comments

  • arm32 2 hours ago

    The title got me, I'll admit it—except that the benchmark is a game where the models are told to lie.

    • TheMrZZ an hour ago

      Disclaimer: I work at Kradle.

      They were never told to lie: one AI is given more information than the others, and the goal of the experiment is to understand how they're gonna leverage that advantage.

      Indeed the selfish (optimal?) strategy is to lie, yet some decide to tell the truth anyway. That's why it's an interesting benchmark! More info in the research article: https://kradle.ai/research/four-bridges (released before Fable)

    • forgot-my-pw 2 hours ago

      Had to Google this to learn more. For those who are interested: https://kradle.ai/research/four-bridges

    • peesem 2 hours ago

      it's unclear to me whether they were actually told to lie or just told to survive / convince others. either way it is somewhat coerced but i think there is still a difference

      • TheMrZZ an hour ago

        The optimal/selfish strategy is indeed to lie, but they're never pushed in that direction. Some AIs decide to reveal the information, some decide to say nothing, some actively lie and push others to their death...

  • bellowsgulch 2 hours ago

    I find it deeply funny and I suppose a bit expected that a Grok model appears at face value to be optimized for supposed truth telling.

    And to keep the e-mob off my back, I don't endorse Elon Musk.