SolidStart - Hacker News

But they do explain the improvement of AI driving 2017-2021 vs 2022-2026.

gm678 11 minutes ago

I don't know what the Y-axis is supposed to be on that Wharton AI capabilities graph, but I am not really convinced that Opus 4.6 has more than double the intelligence/capability/whatever of GPT 5.1 Max.

NitpickLawyer 8 minutes ago

IIRC that graph tracks capabilities as time_to_solve a task for humans (i.e. the model can now handle tasks that usually take a human ~8h). Which, depending on what tasks you look at, could be a reasonable finding. I could see Opus 4.6 handling tasks that take ~8h for humans, and that 5.1 couldn't previously handle (with 5.1 being "limited" at 4h tasks let's say). It is a bit arbitrary, but I think this is what they're tracking.

BoredPositron 9 minutes ago

https://metr.org/time-horizons/ on a linear scale. Clickbait garbage article, like most of his in the last year.

afthonos 3 minutes ago

…yeah, that’s where you see the exponential?

andai 11 minutes ago

Well, curve shape aside, the high watermark might be lower than where it tapers off.

https://news.ycombinator.com/item?id=46199723

devmor 11 minutes ago

"Exponentials all tend to become sigmoids but you can't predict exactly when" is a true statement, but I'm not sure it needed an article.

This doesn't say much, and the author fights their own points a couple times, suggesting that they maybe didn't think through what they wanted to write until they were in the middle of writing it and started realizing their assumptions didn't match what they expected the data to say.

I really don't get the point of what I just read.

inglor_cz 8 minutes ago

Hmmm, this is quite an interesting take by Scott.

Lindy's Law is not actually a law and many exact minds will be provoked by the very name; it also fails spectacularly in certain contexts (e.g. lifetime of a single organism, though not necessarily existence of entire species).

But at the same time, I am willing to take its invocation in the context of AI somewhat seriously. There is an international arms race with China, which has less compute, but more engineers and scientists. This sort of intellectual arms race does not exhaust itself easily.

A similar space race in the 1950s and 1960s progressed from the first unmanned spaceflight to a moonwalk in a mere 12 years, which is probably less than what it takes to approve a bicycle lane in Chicago now.

addaon 9 minutes ago

https://xkcd.com/605/

BoredPositron 11 minutes ago

If you use the log scale you'll see that the time horizon of Opus 4.6 was as expected...

afthonos a minute ago

As expected by the exponential. The Wharton study was predicting when the exponential would turn into a sigmoid.

nathan_compton 12 minutes ago

A lot of words to say "The initial part of a sigmoidal curve is not very informative about the parameters of the sigmoid function in question."
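
For anyone who wants to see that numerically, here is a rough sketch, not anything taken from the article or the Wharton study. It assumes a plain logistic curve and made-up parameters (`a`, `k`, and the two ceilings are all hypothetical). Both curves are built to share the same early exponential, so any gap between them comes only from the ceiling.

```python
import math

def logistic(t, ceiling, a=1.0, k=1.0):
    """Logistic curve that looks like a * exp(k * t) while far below `ceiling`.

    All parameters are illustrative assumptions, not fitted to any real data.
    """
    growth = a * math.exp(k * t)
    return growth / (1.0 + growth / ceiling)

def max_relative_gap(ceiling_a, ceiling_b, times):
    """Largest relative difference between the two curves over `times`."""
    return max(
        abs(logistic(t, ceiling_a) - logistic(t, ceiling_b)) / logistic(t, ceiling_b)
        for t in times
    )

# Early window: both curves are still far below either ceiling.
early = [i * 0.1 for i in range(31)]  # t = 0.0 .. 3.0
early_gap = max_relative_gap(1e3, 1e6, early)

# Much later, the curves have separated toward their own ceilings.
late_ratio = logistic(20.0, 1e6) / logistic(20.0, 1e3)

print(f"max early gap between curves: {early_gap:.3%}")
print(f"ratio of curves at t=20: {late_ratio:.0f}x")
```

With these assumed numbers the two curves stay within a few percent of each other over the early window even though their ceilings differ by a factor of 1000, so modest noise in the early data would make them hard to tell apart.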

inglor_cz 4 minutes ago

That is true, but I generally enjoy reading a lot of words from Scott, who has a talent for writing.

The entire plot of the Lord of the Rings could probably be compressed into less than 10 kB of text too.

The sigmoids won't save you