Scaling Laws, Honestly

(completeskeptic.com)

4 points | by CompleteSkeptic 11 hours ago

2 comments

  • wardc 9 hours ago

    Pretty interesting article. It seems reasonable, and it vibes with the kinds of models people are releasing in the open-source world.

    For Chinchilla / scaling laws, doesn't it seem a bit weird that they aren't using wall-clock time? For example, the FA4 backward pass is bandwidth-bound, not FLOPs-bound. It seems like you'd care about dollars or time in relation to loss, not just clean-room FLOPs. MFUs are likely not equivalent across different model sizes and shapes.
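
    To make that concrete, here is a rough sketch of how two runs with the same FLOP budget can differ in time and dollars once MFU enters. It assumes the common C ≈ 6·N·D approximation for dense transformer training compute; the helper names, the peak throughput, the MFU values, and the hourly price are all made-up placeholders, not measurements.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute with the common C ~ 6 * N * D rule."""
    return 6.0 * n_params * n_tokens


def wall_clock_hours(flops: float, peak_flops_per_sec: float,
                     mfu: float, n_devices: int) -> float:
    """Hours needed when each device sustains mfu * peak FLOP/s."""
    if not 0.0 < mfu <= 1.0:
        raise ValueError("mfu must be in (0, 1]")
    achieved_flops_per_sec = peak_flops_per_sec * mfu * n_devices
    return flops / achieved_flops_per_sec / 3600.0


def dollar_cost(hours: float, n_devices: int,
                dollars_per_device_hour: float) -> float:
    """Total rental cost for the whole job."""
    return hours * n_devices * dollars_per_device_hour


# Hypothetical hardware: 8 devices, 1e15 peak FLOP/s each, $2 per device-hour.
PEAK, DEVICES, PRICE = 1e15, 8, 2.0

# Two hypothetical shapes with the same FLOP budget but different assumed MFU.
flops_a = training_flops(n_params=1e9, n_tokens=20e9)
flops_b = training_flops(n_params=2e9, n_tokens=10e9)

hours_a = wall_clock_hours(flops_a, PEAK, mfu=0.30, n_devices=DEVICES)
hours_b = wall_clock_hours(flops_b, PEAK, mfu=0.45, n_devices=DEVICES)

cost_a = dollar_cost(hours_a, DEVICES, PRICE)
cost_b = dollar_cost(hours_b, DEVICES, PRICE)

print(f"A: {flops_a:.2e} FLOPs, {hours_a:.1f} h, ${cost_a:.0f}")
print(f"B: {flops_b:.2e} FLOPs, {hours_b:.1f} h, ${cost_b:.0f}")
```

    Under these assumed numbers the two runs look identical on a FLOPs axis, but the run with the lower MFU takes 1.5x as long and costs 1.5x as much, which is the gap a loss-vs-dollars or loss-vs-time plot would surface.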

  • adamzwasserman 11 hours ago

    [flagged]