> Given consistent trends of exponential performance improvements over many years and across many industries, it would be extremely surprising if these improvements suddenly stopped.
I'm sure people were saying that about commercial airline speeds in the 1970's too.
But a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.
With LLMs at the moment, the limiting factors might turn out to be training data, cost, or inherent limits of the transformer approach and the fact that LLMs fundamentally cannot learn outside of their context window. Or a combination of all of these.
The tricky thing about S curves is, you never know where you are on them until the slowdown actually happens. Are we still only in the beginning of the growth part? Or the middle where improvement is linear rather than exponential? And then the growth starts slowing...
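A toy comparison makes the point (all numbers here are invented, purely to illustrate the shape): a logistic curve and a pure exponential with the same early growth rate are nearly indistinguishable until you're already well up the S.

    import math

    # Toy illustration, all numbers invented: a logistic curve with carrying
    # capacity K tracks a pure exponential almost exactly while x is still
    # far below K, so early data can't tell you where you are on the S.
    r, K, x0 = 1.0, 1000.0, 1.0

    def exponential(t):
        return x0 * math.exp(r * t)

    def logistic(t):
        return K / (1 + (K / x0 - 1) * math.exp(-r * t))

    for t in range(8):
        e, s = exponential(t), logistic(t)
        print(f"t={t}  exp={e:8.1f}  logistic={s:8.1f}  ratio={s / e:.2f}")
    # Through t=3 the two are nearly identical (ratio ~0.98); the bend only
    # becomes visible once x is already a meaningful fraction of K.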
> But a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.
I'd argue all of them. Any true exponential eventually gets to a point where no computer can even store its numerical value. It's a physically absurd curve.
The narrative quietly assumes that this exponential curve can in fact continue since it will be the harbinger of the technological singularity. Seems more than a bit eschatological, but who knows.
If we suppose this tech rapture does happen, all bets are off; in that sense it's probably better to assume the curve is sigmoidal, since the alternative is literally beyond human comprehension.
Barring fully reversible processes as the basis for technology, you still quickly run into energy and cooling constraints. Even with that, you'd have time or energy density constraints. Unlimited exponentials are clearly unphysical.
Yes, this is an accurate description, and also completely irrelevant to the issue at hand.
At the stage of development we are at today, no one cares how long it takes for the exponent to go from eating our galaxy to eating the whole universe, or whether it'll break some energy density constraint before that and leave a gaping zero-point energy hole where our local cluster used to be.
It'll stop eventually. What we care about is whether it stops before it breaks everything for us, here on Earth. And that's not at all a given. Fundamental limits are irrelevant to us - it's like worrying that putting too many socks in a drawer will eventually make them collapse into a black hole. The limits that are relevant to us are much lower, set by technological, social and economic factors. It's much harder to say where those limits lie.
Sure, but it reminds us that we are dealing with an S-curve, so we need to ask where the inflection point is. i.e. what are the relevant constraints, and can they reasonably sustain exponential growth for a while still? At least as an outsider, it's not obvious to me whether we won't e.g. run into bandwidth or efficiency constraints that make scaling to larger models infeasible without reimagining the sorts of processors we're using. Perhaps we'll need to shift to analog computers or something to break through cooling problems, and if the machine cannot find the designs for the new paradigm it needs, it can't make those exponential self-improvements (until it matches its current performance within the new paradigm, it gets no benefit from design improvements it makes).
My experience is that "AI can write programs" is only true for the smallest tasks, and anything slightly nontrivial will leave it incapable of even getting started. It isn't that it "often makes mistakes or goes in a wrong direction": I've never seen it go anywhere near the right direction for a nontrivial task.
That doesn't mean it won't have a large impact; as an autocomplete these things can be quite useful today. But when we have a more honest look at what it can do now, it's less obvious that we'll hit some kind of singularity before hitting a constraint.
Indeed you can't be sure. But on the other hand a bunch of the commentariat has been claiming (with no evidence) that we're at the midpoint of the sigmoid for the last three years. They were wrong. And then you had the AI frontier lab insiders who predicted an accelerating pace of progress for the last three years. They were right. Now, the frontier labs rarely (never?) provide evidence either, but they do have about a year of visibility into the pipeline, unlike anyone outside.
So at least my heuristic is to wait until a frontier lab starts warning about diminishing returns and slowdowns before calling the midpoint or multiple labs start winding down capex. The first component might have misaligned incentives, but if we're in a realistic danger of hitting a wall in the next year, the capex spending would not be accelerating the way it is.
Capex requirements might be on a different curve than model improvements.
E.g. you might need to accelerate spending to get sub-linear growth in model output.
If valuations depend on hitting the curves described in the article, you might see accelerating capex at precisely the time improvements are dropping off.
I don’t think frontier labs are going to be a trustworthy canary. If Anthropic says they’re reaching the limit and OpenAI holds the line that AGI is imminent, talent and funding will flee Anthropic for OpenAI. There’s a strong incentive to keep your mouth shut if things aren’t going well.
I think you nailed it. The capex is desperation in the hopes of maintaining the curve. I have heard actual AI researchers say progress is slowing, just not from the big companies directly.
There are a few other limitations, in particular how much energy, hardware and funding we (as a society) can afford to throw at the problem, as well as the societal impact.
AI development is currently given a free pass on these points, but it's very unclear how long that will last. Regardless of scientific and technological potential, I believe that we'll hit some form of limit soon.
There's a Mulla Nasrudin joke that's sort of relevant here:
Nasrudin is on a flight, when suddenly the pilot comes on the intercom, saying, "Passengers, we apologize, but we have experienced an engine burn-out. The plane can still fly on the remaining three engines, but we'll be delayed in our arrival by two hours."
Nasrudin speaks up: "Let's not worry, what's 2 hours really?"
A few minutes later, the airplane shakes, and passengers see smoke coming out of another engine. Again, the intercom crackles to life.
"This is your captain speaking. Apologies, but due to a second engine burn-out, we'll be delayed by another two hours."
The passengers are agitated, but the Mulla once again tries to remain calm.
Suddenly, the third engine catches fire. Again, the pilot comes on the intercom and says, "I know you're all scared, but this is a very advanced aircraft, and it can safely fly on only a single engine. But we will be delayed by yet another two hours."
At this, Nasrudin shouts, "This is ridiculous! If one more engine goes, we'll be stuck up here all day"
The absurdity is obvious, but the real problem is that everyone would die in the process, which is what people should be worried about. Similarly, if current AI is able to replace 99% of devs in 5-10 years (or even worse, most white collar jobs) and then flattens out without becoming a godlike AGI, it will still have enormous implications for the economy.
Infectious diseases rarely see actual exponential growth for logistical reasons. It's a pretty unrealistic model that ignores that the disease actually needs to find additional hosts to spread, the local availability of which starts to go down from the first victim.
Yes, the model that the S curve comes out of is extremely simplified. Looking at Covid curves we could just as well have said the growth was parabolic, but that's much less worrisome.
If you assume the availability of hosts is local to the perimeter of the infected region, then the relative growth is limited to 2/R, where R is the distance from patient 0 in 2 dimensions. That's because the area of the circle determines how many hosts are already ill, but new infections can only happen along the perimeter of the circle.
The disease is obviously also limited by the total amount of hosts, but I assume there's also the "bottom" limit - i.e. the resource consumption of already-infected hosts.
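A quick back-of-the-envelope version of that argument, assuming uniform host density and an infection front that advances one unit of distance per time step:

    import math

    # Infected count ~ area of a disc, newly reachable hosts ~ its perimeter.
    for R in (1, 2, 5, 10, 50, 100):
        infected = math.pi * R ** 2      # hosts already infected
        frontier = 2 * math.pi * R       # hosts reachable this step
        print(f"R={R:3d}  relative growth = {frontier / infected:.3f}  (2/R = {2 / R:.3f})")
    # Relative growth decays like 2/R, so cumulative infections grow
    # roughly quadratically in time rather than exponentially.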
It also depends on how panicked people are. Covid was never going to spread like ebola, for instance: it was worse. Bad enough to harm and kill people, but not bad enough to scare them into self-enforced isolation and voluntary compliance with public health measures.
Back on the subject of AI, I think the flat part of the curve has always been in sight. Transformers can achieve human performance in some, even many respects, but they're like children who have to spend a million years in grade school to learn their multiplication tables. We will have to figure out why that is the case and how to improve upon it drastically before this stuff really starts to pay off. I'm sure we will but we'll be on a completely different S-shaped curve at that point.
> a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.
Yes of course it’s not going to increase exponentially forever.
The point is, why predict that the growth rate is going to slow exactly now? What evidence are you going to look at?
It’s possible to make informed predictions (eg “Moore’s law can’t get you further than 1nm with silicon due to fundamental physical limits”). But most commenters aren’t basing their predictions in anything as rigorous as that.
And note, there are good reasons to predict a speedup, too; as models get more intelligent, they will be able to accelerate the R&D process. So quality per-researcher is now proportional to the exponential intelligence curve, AND quantity of researchers scales with number of GPUs (rather than population growth which is much slower).
NOTE IN ADVANCE: I'm generalizing, naturally, because talking about specifics would require an essay and I'm trying to write a comment.
Why predict that the growth rate is going to slow now? Simple. Because current models have already been trained on pretty much the entire meaningful part of the Internet. Where are they going to get more data?
The exponential growth part of the curve was largely based on being able to fit more and more training data into the models. Now that all the meaningful training data has been fed in, further growth will come from one of two things: generating training data from one LLM to feed into another one (dangerous, highly likely to lead to "down the rabbit hole forever" hallucinations, and weeding those out is a LOT of work and will therefore contribute to slower growth), or else finding better ways to tweak the models to make better use of the available training data (which will produce growth, but much slower than what "Hey, we can slurp up the entire Internet now!" was producing in terms of rate of growth).
And yes, there is more training data available because the Internet is not static: the Internet of 2025 has more meaningful, human-generated content than the Internet of 2024. But it also has a lot more AI-generated content, which will lead into the rabbit-hole problem where one AI's hallucinations get baked into the next one's training, so the extra data that can be harvested from the 2025 Internet is almost certainly going to produce slower growth in meaningful results (as opposed to hallucinated results).
Curiously, humans don't seem to require reading the entire internet in order to perform at human level on a wide variety of tasks... Nature suggests that there's a lot of headroom in algorithms for learning on existing sources. Indeed, we had models trained on the whole internet a couple years ago, now, yet model quality has continued to improve.
Meanwhile, on the hardware side, transistor counts in GPUs are in the tens of billions and still increasing steadily.
This is a great question, but note that folks were freaking out about this a year or so ago and we seem to be doing fine.
We seem to be making progress with some combination of synthetic training datasets on coding/math tasks, textbooks authored by paid experts, and new tokens (plus preference signals) generated by users of the LLM systems.
It wouldn’t surprise me if coding/math turned out to have a dense-enough loss-landscape to produce enough synthetic data to get to AGI - though I wouldn’t bet on this as a highly likely outcome.
I have been wanting to read/do some more rigorous analysis here though.
This sort of analysis would count as the kind of rigorous prediction that I’m asking for above.
I am extremely confident that AGI, if it is achievable at all (which is a different argument and one I'm not getting into right now), requires a world model / fact model / whatever terminology you prefer, and is therefore not achievable by models that simply chain words together without having any kind of understanding baked into the model. In other words, LLMs cannot lead to AGI.
I disagree that generic LLMs plus CoT/reasoning/tool calling (ie the current stack) cannot in principle implement a world model.
I believe LLMs are doing some sort of world modeling and likely are mostly lacking a medium-/long-term memory system in which to store it.
(I wouldn’t be surprised if one or two more architectural overhauls end up occurring before AGI, I also wouldn’t be surprised if these occurred seamlessly with our current trajectory of progress)
Alternative argument: there is no need for more training data, just better algorithms. Throwing more tokens at the problem doesn't solve the fact that training LLMs using supervised learning is a poor way to integrate knowledge. We have however seen promising results coming out of reinforcement learning and self play. Which means that Anthropic and OpenAI's bet on scale is likely a dead end, but we may yet see capability improvements coming from other labs, without the need for greater data collection.
Better algorithms is one of the things I meant by "better ways to tweak the models to make better use of the available training data". But that produces slower growth than the jaw-droppingly rapid growth you can get by slurping pretty much the whole Internet. That produced the sharp part of the S curve, but that part is behind us now, which is why I assert we're approaching the slower-growth part at the top of the curve.
> The point is, why predict that the growth rate is going to slow exactly now? What evidence are you going to look at?
Why predict that the (absolute) growth rate is going to keep accelerating past exactly now?
Exponential growth always assumes a constant relative growth rate, which works in the fiction of economics, but is otherwise far from an inevitability. People like to point to Moore's law ad nauseam, but other things like "the human population" or "single-core performance" keep accelerating until they start cooling off.
> And note, there are good reasons to predict a speedup, too; as models get more intelligent, they will be able to accelerate the R&D process.
And if heaven forbid, R&D ever turns out to start taking more work for the same marginal returns on "ability to accelerate the process", then you no longer have an exponential curve. Or for that matter, even if some parts can be accelerated to an amazing extent, other parts may get strung up on Amdahl's law.
It's fine to predict continued growth, and it's even fine to predict that a true inflection point won't come any time soon, but exponential growth is something else entirely.
> Why predict that the (absolute) growth rate is going to keep accelerating past exactly now?
By following this logic you should have predicted Moore’s law would halt every year for the last five decades. I hope you see why this is a flawed argument. You prove too much.
But I will answer your “why”: plenty of exponential curves exist in reality, and empirically, they can last for a long time. This is just how technology works; some exponential process kicks off, then eventually is rate-limited, then if we are lucky another S-curve stacks on top of it, and the process repeats for a while.
Reality has inertia. My hunch is you should apply some heuristic like “the longer a curve has existed, the longer you should bet it will persist”. So I wouldn’t bet on exponential growth in AI capabilities for the next 10 years, but I would consider it very foolish to use pure induction to bet on growth stopping within 1 year.
And to be clear, I think these heuristics are weak and should be trumped by actual physical models of rate-limiters where available.
> By following this logic you should have predicted Moore’s law would halt every year for the last five decades. I hope you see why this is a flawed argument. You prove too much.
I do think it's continually amazing that Moore's law has continued in some capacity for decades. But before trumpeting the age of exponential growth, I'd love to see plenty of examples that aren't named "Moore's law": as it stands, one easy hypothesis is that "ability to cram transistors into mass-produced boards" lends itself particularly well to newly-discovered strategies.
> So I wouldn’t bet on exponential growth in AI capabilities for the next 10 years, but I would consider it very foolish to use pure induction to bet on growth stopping within 1 year.
Great, we both agree that it's foolish to bet on growth stopping within 1 year. What I'm saying is that "growth doesn't stop" ≠ "growth is exponential".
A theory of "inertia" could just as well support linear growth: it's only because we stare at relative growth rates that we treat exponential growth as a "constant" that will continue in the absence of explicit barriers.
This is where I’d really like to be able to point to our respective Manifold predictions on the subject; we could circle back in a year’s time and review who was in fact correct. I wager internet points it will be me :)
Solar panel cost per watt has been dropping exponentially for decades as well...
Partly these are matters of economies of scale - reduction in production costs at scale - and partly it's a matter of increasing human attention leading to steady improvements as the technology itself becomes more ubiquitous.
> why predict that the growth rate is going to slow exactly now?
why predict that it will continue? Nobody ever actually makes an argument that growth is likely to continue, they just extrapolate from existing trends and make a guess, with no consideration of the underlying mechanics.
Oh, go on then, I'll give a reason: this bubble is inflated primarily by venture capital, and is not profitable. The venture capital is starting to run out, and there is no convincing evidence that the businesses will become profitable.
Progress in information systems cannot be compared to progress in physical systems.
For starters, physical systems compete for limited resources and labor.
For another, progress in software vastly reduces the cost of improved designs. Whereas progress in physical systems can enable but still increase the cost of improved designs.
Finally, the underlying substrate of software is digital hardware, which has been improving in both capabilities and economics exponentially for almost 100 years.
Looking at information systems as far back as the first coordination of differentiating cells to human civilization is one of exponential improvement. Very slow, slow, fast, very fast. (Can even take this further, to first metabolic cycles, cells, multi-purpose genes, modular development genes, etc. Life is the reproduction of physical systems via information systems.)
Same with human technological information systems, from cave painting, writing, printing, telegraph, phone, internet, etc.
It would be VERY surprising if AI somehow managed to fall off the exponential information system growth path. Not industry level surprising, but "everything we know about how useful information compounds" level surprising.
> Looking at information systems as far back as the first coordination of differentiating cells to human civilization is one of exponential improvement.
Under what metric? Most of the things you mention don't have numerical values to plot on a curve. It's a vibe exponential, at best.
Life and humans have become better and better at extracting available resources and energy, but there's a clear limit to that (100%) and the distribution of these things in the universe is a given, not something we control. You don't run information systems off empty space.
Life has been on Earth about 3.5-3.8 billion years.
Break that into 0.5-0.8, 1 billion, 1 billion, 1 billion "quarters", and you will find exponential increases in evolution's rate of change and production of diversity across them by many many objective measures.
Now break up the last 1 billion into 100 million year segments. Again exponential.
Then break up the last 100 million into segments. Again.
Then the last 10 million years into segments, and watch humans progress.
The last million, in 100k year segments, watch modern humans appear.
the last 10k years into segments, watch agriculture, civilizations, technology, writing ...
The last 1000 years, incredible aggregation of technology, math, and the appearance of formal science
last 100 years, gets crazy. Information systems appear in labs, then become ubiquitous.
last 10 years, major changes, AI starts having mainstream impact
last 1 year - even the basic improvements to AI models in the last 12 months are an unprecedented level of change, per time, looking back.
I am not sure how any of this could appear "vibe", given any historical and situational awareness.
This progression is universally recognized. Aside from creationists and similar contingents.
I am curious when you think we will run out of atoms to make information systems.
How many billions of years do you think that might take?
Of all the things to be limited by, that doesn't seem like a near term issue. Just an asteroid or two alone will provide resources beyond our dreams. And space travel is improving at a very rapid rate.
In the meantime, in terms of efficiency of using Earth atoms for information processing, there is still a lot of space at the "bottom", as Feynman said. Our crude systems are limited today by their power waste. Small energy-efficient systems, and more efficient heat shedding, will enable full 3D chips ("cubes"?) and vastly higher density of packing those.
The known limit on information processing for physical systems, per gram, is astronomical:
• Bremermann's limit: ~10^47 operations per second, per gram.
Other interesting limits:
• Margolus–Levitin bound - on quantum state evolution
• Landauer’s principle - Thermodynamic cost of erasing (overwriting) one bit.
• Bekenstein bound: Maximum storage by volume.
Life will go through many many singularities before we get anywhere near hard limits.
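To put the first of those numbers in perspective, a rough comparison against current hardware (the accelerator figures below are ballpark assumptions, not measurements):

    import math

    # Rough headroom estimate using the Bremermann figure quoted above.
    bremermann_ops_per_s_per_gram = 1e47
    accelerator_ops_per_s = 1e15        # assumed order of magnitude for a current accelerator
    accelerator_mass_grams = 1e3        # assumed ~1 kg of silicon and packaging

    limit_at_same_mass = bremermann_ops_per_s_per_gram * accelerator_mass_grams
    headroom = limit_at_same_mass / accelerator_ops_per_s
    print(f"headroom vs. the quoted limit: ~10^{math.log10(headroom):.0f}")   # ~10^35

Roughly 35 orders of magnitude of headroom, under those assumptions.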
>[..] to first metabolic cycles, cells, multi-purpose genes, modular development genes, etc.
One example is when cells discovered energy production using mitochondria. Mitochondria add new capabilities to the cell, with (almost) no downsides in weight, temperature sensitivity, or pressure sensitivity. It's almost 100% upside.
If someone had tried to predict the future number of mitochondria-enabled cells from the first one, they could have undershot by 10^20 cells.
I have been writing a story for the last 20 days with that exact plot; I have to get my act together and finish it.
By physical systems, I meant systems whose purpose is to do physical work. Mechanical things. Gears. Struts.
Computer hardware is an information system. You are correct that it is has a physical component. But its power comes from its organization (information) not its mass, weight, etc.
Transistors get more powerful, not less, when made from less matter.
Information systems move from substrate to more efficient substrate. They are not their substrate.
They still depend on physical resources and labor. They're made by people and machines. There have never been more resources going into information systems than right now, and AI has accelerated that greatly. Think of all the server farms being built next to power plants.
That's fallacious reasoning; you are extrapolating from survivorship bias. A lot of technologies, genes, or species have failed along the way. You are also subjectively characterizing progression as improvement, which is problematic as well if you are speaking about general trends. Evolution selects for adaptation, not innovation. We use the theory of evolution to explain the emergence of complexity, but that's not the sole direction, and there are many examples where species evolved towards simplicity (again).
Resource expense alone could be the end of AI. You may look up historic island populations, where technological demands (e.g. timber) usually led to extinction by resource exhaustion and consequent ecosystem collapse (e.g. deforestation leading to soil erosion).
Doesn't answer the core fallacy. Historical "technological progress" can't be used as argument for any particular technology. Right now, if we are talking about AI, we're talking about specific technologies, which may just as well fail and remain inconsequential in the grand scheme of things, like most technologies, most things really, did in the past. Even more so since we don't understand much anything in either human or artificial cognition. Again and again, we've been wrong about predicting the limits and challenges in computation.
You see, your argument is just bad. You are merely guessing like everyone else.
Information technology does not operate by the rules of any other technology. It is a technology of math and organization, not particular materials.
The unique value of information technology is that it compounds the value of other information and technology, including its own, and lowers the bar for its own further progress.
And we know with absolute certainty we have barely scratched the computing capacity of matter. Bremermann’s limit : 10^47 operations per second, per gram. See my other comment for other relevant limits.
Do you also expect a wall in mathematics?
And yes, an unbroken historical record of 4.5 billions years of information systems becoming more sophisticated with an exponential speed increase over time, is in fact a very strong argument. Changes that took a billion years initially, now happen in very short times in today's evolution, and essentially instantly in technological time. The path is long, with significant acceleration milestones at whatever scale of time you want to look at.
Your argument, on the other hand, is indistinguishable from cynical AI opinions going back decades. It could be made any time. Zero new insight. Zero predictive capacity.
Substantive negative arguments about AI progress have been made. See "Perceptrons" by Marvin Minsky and Seymour Papert for an example of what a solid negative argument looks like. It delivered insights. It made some sense at the time.
> Your argument, on the other hand, is indistinguishable from cynical AI opinions going back decades. It could be made any time. Zero new insight. Zero predictive capacity.
> Historical "technological progress" can't be used as argument for any particular technology.
Historical for billions of years of natural information system evolution. Metabolic, RNA, DNA, protein networks, epigenetic, intracellular, intercellular, active membrane, nerve precursors, peptides, hormonal, neural, ganglion, nerve nets, brains.
Thousands of years of human information systems. Hundreds of years of technological information systems. Decades of digital information systems. Now in in just the last few years, progress year to year is unlike any seen before.
Significant innovations being reported virtually every day.
Yes, track records carry weight. Especially with no good reason to expect a break, and every tangible reason to believe nothing is slowing down, right up to today.
"Past is not a predictor of future behavior" is about asset gains relative to asset prices in markets where predictable gains have had their profitability removed by the predictive pricing of others. A highly specific feedback situation making predicting asset gains less predictable even when companies do maintain strong predictable trends in fundamentals.
It is a narrow specific second order effect.
It is the worst possible argument for anything outside of those special conditions.
Every single thing you have ever learned was predicated on the past having strong predictive qualities.
You should understand what an argument means, before throwing it into contexts where its preconditions don't exist.
> Right now, if we are talking about AI, we're talking about specific technologies, which may just as well fail and remain inconsequential in the grand scheme of things, like most technologies, most things really, did in the past. Even more so since we don't understand much anything in either human or artificial cognition. Again and again, we've been wrong about predicting the limits and challenges in computation.
> Your argument [...] is indistinguishable from cynical AI opinions going back decades. It could be made any time. Zero new insight. Zero predictive capacity.
If I need to be clearer: nobody could know when you wrote that by reading it. It isn't an argument, it's a free-floating opinion. And you have not made it more relevant today than it would have been in all the decades up till now, through all the technological transitions up until now. Your opinion was equally "applicable", and no less wrong.
This is what "Zero new insight. Zero predictive capacity" refers to.
> Substantive negative arguments about AI progress have been made. See "Perceptrons" by Marvin Minsky and Seymour Papert for an example of what a solid negative argument looks like. It delivered insights. It made some sense at the time.
The cost of the next number in a GPT (3 > 4 > 5) seems to come in 2 forms:
1) $$$
2) data
The second (data) also isn't cheap. It seems we've already gotten through all the 'cheap' data out there. So much so that synthetic data (fart huffing) is a big thing now. People say it's real and useful and passes the glenn-horf theore... blah blah blah.
So it really more so comes down to just:
1) $$$^2 (but really pick any exponent)
In that, I'm not sure this thing is a true sigmoid curve (see: biology all the time). I think it's more of a logarithmic curve here. In that, it never really goes away, but it gets really expensive to carry out for large N.
[To be clear, lots of great shit happens out there in large N. An AI god still may lurk in the long slow slope of $N, the cure for boredom too, or knowing why we yawn, etc.]
I am getting the sense that the 2nd derivative of the curve is already hitting negative territory. Models get updated, and I don't feel I'm getting better answers from the LLMs.
On the application front though, it feels that the advancements from a couple of years ago are just beginning to trickle down to product space. I used to do some video editing as a hobby. Recently I picked it up again, and was blown away by how much AI has chipped away the repetitive stuff, and even made attempts at the more creative aspects of production, with mixed but promising results.
One example is auto-generating subtitles -- elements of this task, e.g. speech-to-text with time codes, have been around for a while (OpenAI Whisper and others), but they have only recently been integrated into video editors and become easy to use for non-coders.
other examples: depth map (estimating object distance from the camera; this is useful when you want to blur the background), auto-generating masks with object tracking.
Yes. It's true that we don't know, with any certainty, (1) whether we are hitting limits to growth intrinsic to current hardware and software, (2) whether we will need new hardware or software breakthroughs to continue improving models, and (3) what the timing of any necessary breakthroughs would be, because innovation doesn't happen on a predictable schedule. There are unknown unknowns.[a]
However, there's no doubt that at a global scale, we're sure trying to maintain current rates of improvement in AI. I mean, the scale and breadth of global investment dedicated to improving AI, presently, is truly unprecedented. Whether all this investment is driven by FOMO or by foresight, is irrelevant. The underlying assumption in all cases is the same: We will figure out, somehow, how to overcome all known and unknown challenges along the way. I have no idea what the odds of success may be, but they're not zero. We sure live in interesting times!
Each specific technology can be S-shaped, but advancements in achieving goals can still maintain an exponential curve. e.g. Moore's law is dead with the end of Dennard scaling, but computation improvements still happen with parallelism.
Meta's Behemoth shows that scaling the number of parameters has diminishing returns, but we still have many different ways to continue advancing. Those who point at one thing and say "see" aren't really seeing. Of course there are limits, like energy, but with nuclear energy or photon-based computing we're nowhere near the limits.
And, maybe I'm missing something, but to me it seems obvious that the flat top part of the S curve is going to be somewhere below human ability... because, as you say, of the training data. How on earth could we train an LLM to be smarter than us, when 100% of the material we use to teach it how to think is human-style thinking?
Maybe if we do a good job, only a little bit below human ability -- and what an accomplishment that would still be!
But still -- that's a far cry from the ideas espoused in articles like this, where AI is just one or two years away from overtaking us.
The standard way to do this is Reinforcement Learning: we do not teach the model how to do the task, we let it discover the _how_ for itself and only grade it based on how well it did, then reinforce the attempts where it did well. This way the model can learn wildly superhuman performance, e.g. it's what we used to train AlphaGo and AlphaZero.
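A toy sketch of that loop, nothing like a real training pipeline, just the control flow: sample attempts, grade the outcomes, reinforce whatever scored well.

    import random

    # The "model" is a set of preference weights, the grader only scores
    # outcomes, and attempts that score well get reinforced.
    hidden_reward = {"a": 0.1, "b": 0.9, "c": 0.3}          # grader; unknown to the "model"
    weights = {action: 1.0 for action in hidden_reward}     # the "model"

    def sample(ws):
        r = random.uniform(0, sum(ws.values()))
        for action, w in ws.items():
            r -= w
            if r <= 0:
                return action
        return action

    for _ in range(5000):
        attempt = sample(weights)            # the model produces an attempt
        score = hidden_reward[attempt]       # we only grade the result
        weights[attempt] += 0.1 * score      # reinforce in proportion to the grade

    print(weights)   # "b" should end up dominating: discovered, not taught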
> I'm sure people were saying that about commercial airline speeds in the 1970's too.
Or CPU frequencies in the 1990's. Also we spent quite a few decades at the end of the 19th century thinking that physics was finished.
I'm not sure that explaining it as an "S curve" is really the right metaphor either, though.
You get the "exponential" growth effect when there's a specific technology invented that "just needs to be applied", and the application tricks tend to fall out quickly. For sure generative AI is on that curve right now, with everyone big enough to afford a datacenter training models like there's no tomorrow and feeding a community of a million startups trying to deploy those models.
But nothing about this is modeled correctly as an "exponential", except in the somewhat trivial sense of "the community of innovators grows like a disease as everyone hops on board". Sure, the petri dish ends up saturated pretty quickly and growth levels off, but that's not really saying much about the problem.
> I'm sure people were saying that about commercial airline speeds in the 1970's too.
They'd be wrong, of course - for not realizing demand is a limiting factor here. Airline speeds plateaued not because we couldn't make planes go faster anymore, but because no one wanted them to go faster.
This is partly an economic and partly a social factor - transit times are bucketed by what they enable people to do. It makes little difference if going from London to New York takes 8 hours instead of 12 - it's still in the "multi-day business trip" bucket (even 6 hours goes into that bucket, once you add airport overhead). Now, if you could drop that to 3 hours, like Concorde did[0], that finally moves it into the "hop over for a meet, fly back the same day" bucket, and then business customers start paying attention[1].
For various technical, legal and social reasons, we didn't manage to cross that chasm before money for R&D dried out. Still, the trend continued anyway - in military aviation and, later, in supersonic missiles.
With AI, the demand is extreme and only growing, and it shows no sign of being structured into classes with large thresholds between them - in fact, models are improving faster than we're able to put them to any use; even if we suddenly hit a limit now and couldn't train even better models anymore, we have decades of improvements to extract just from learning how to properly apply the models we have. But there's no sign we're about to hit a wall with training any time soon.
Airline speeds are inherently a bad example for the argument you're making, but in general, I don't think pointing out S-curves is all that useful. As you correctly observe:
> But a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.
But, what happens when one technology - or rather, one metric of that technology - stops improving? Something else starts - another metric of that technology, or something built on top of it, or something that was enabled by it. The exponent is S-curves on top of S-curves, all the way down, but how long that exponent is depends on what you consider in scope. So, a matter of accounting. So yeah, AI progress can flatten tomorrow or continue exponentially for the next couple years - depending on how narrowly you define "AI progress".
[1] - This is why Elon Musk wasn't immediately laughed out of the room after proposing using Starship for moving people and cargo across the Earth, back in 2017. Hopping between cities on an ICBM sounds borderline absurd for many reasons, but it also promised cutting flight time to less than one hour between any two points on Earth, which put it a completely new bucket, even more interesting for businesses.
"I'm sure people were saying that about commercial airline speeds in the 1970's too."
But there are others that keep going also. Moore's law is still going (mostly, slowing), and made it past a few pinch points where people thought it was the end.
The point is that, over the decades, many people said Moore's law was at an end, and then it wasn't; there was some breakthrough that kept it going. Maybe a new one will happen.
The thing with AI is, maybe the S curve flattens out , after all the jobs are gone.
Everyone is hoping the S curve flattens out somewhere just below human level, but what if it flattens out just beyond human level? We're still screwed.
There’s a key way to think about a process that looks exponential and might or might not flatten out into an S curve: reasoning about fundamental limits. For COVID it would obviously flatten out because there are finite humans, and it did when the disease had in fact infected most humans on the planet. For commercial airlines you could reason about the speed of sound or escape velocity and see there is again a natural upper limit- although which of those two would dominate would have very different real world implications.
For computational intelligence, we have one clear example of an upper limit in the biological human brain. It only consumes about 25W and has much more intelligence than today's LLMs in important ways. Maybe that's the wrong limit? But Moore's law has been holding for a very long time. And smart physicists have been arguing, going back to Feynman's seminal 1959 lecture predicting nanotechnology, "There's Plenty of Room at the Bottom", that we are extremely far from running into any fundamental physical limits on the complexity of manufactured objects. The ability to manufacture them, we presume, is limited by ingenuity, which, jokes aside, shows no signs of running out.
Training data is a fine argument to consider. Especially since they are training on "the whole internet", sorta. The key breakthrough of transformers wasn't in fact autoregressive token processing or attention or anything like that. It was that they can learn from (memorize / interpolate between / generalize over) arbitrary quantities of training data. Before that, every kind of ML model hit scaling limits pretty fast. ResNets got CNNs to millions of parameters but they still became quite difficult to train. Transformers train reliably on every size of data set we have ever tried, with no end in sight. The attention mechanism shortens the gradient path for extremely large numbers of parameters, completely changing the rules of what's possible with large networks. But what about the data to feed them?
There are two possible counter arguments there. One is that humans don’t need exabytes of examples to learn the world. You might reasonably conclude from this that NNs have some fundamental difference vs people and that some hard barrier of ML science innovation lies in the way. Smart scientists like Yann LeCun would agree with you there. I can see the other side of that argument too - that once a system is capable of reasoning and learning it doesn’t need exhaustive examples to learn to generalize. I would argue that RL reasoning systems like GRPO or GSPO do exactly this - they let the system try lots of ways to approach a difficult problem until they figure out something that works. And then they cleverly find a gradient towards whatever technique had relative advantage. They don’t need infinite examples of the right answer. They just need a well chosen curriculum of difficult problems to think about for a long time. (Sounds a lot like school.) Sometimes it takes a very long time. But if you can set it up correctly it’s fairly automatic and isn’t limited by training data.
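A minimal sketch of the group-relative scoring idea behind GRPO-style methods, with made-up rewards (think fraction of unit tests passed); real systems use these advantages to weight a policy-gradient update on each attempt's tokens.

    from statistics import mean, pstdev

    # Sample a group of attempts at the same problem, grade them, and score
    # each attempt by how far it sits above or below the group average.
    group_rewards = [0.0, 0.0, 1.0, 0.2, 0.0, 0.9, 0.0, 0.1]

    mu, sigma = mean(group_rewards), pstdev(group_rewards)
    advantages = [(r - mu) / (sigma + 1e-8) for r in group_rewards]

    for r, a in zip(group_rewards, advantages):
        print(f"reward={r:.1f}  advantage={a:+.2f}")
    # Attempts above the group mean get pushed up, those below get pushed
    # down; no reference answers are needed, only a grader.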
The other argument is what the Silicon Valley types call "self play" - the goal of having an LLM learn from itself or its peers through repeated games or thought experiments. This is how AlphaGo was trained, and big tech has been aggressively pursuing analogs for LLMs. This has not been a runaway success yet. But in the area of coding agents, arguably where AI is having the biggest economic impact right now, self-play techniques are an important part of building both the training and evaluation sets. Important public benchmarks here start from human-curated examples and algorithmically enhance them to much larger sizes and levels of complexity. I think I might have read about similar tricks in math problems but I'm not sure. Regardless, it seems very likely that this offers a way to overcome any fundamental limit on availability of training data as well, based on human ingenuity instead.
Also, if the top of the S curve is high enough, it doesn’t matter that it’s not truly exponential. The interesting stuff will happen before it flattens out. E.g. COVID. Consider the y axis “human jobs replaced by AI” instead of “smartness” and yes it’s obviously an S curve.
> For computational intelligence, we have one clear example of an upper limit in a biological human brain. It only consumes about 25W and has much more intelligence than today’s LLMs in important ways. Maybe that’s the wrong limit?
It's a good reference point, but I see no reason for it to be an upper limit - by the very nature of how biological evolution works, human brains are close to the worst possible brains advanced enough to start a technological revolution. We're the first brain on Earth that crossed that threshold, and in evolutionary timescales, all that followed - all human history - happened in an instant. Evolution didn't have time yet to iterate on our brain design.
> But a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.
S curves are exponential before they start tapering off though. It's hard to predict how long that could continue, so there's an argument to be made that we should remain optimistic and milk that while we can lest pessimism cut off investment too early.
Just because something exhibits an exponential growth at one point in time, that doesn’t mean that a particular subject is capable of sustaining exponential growth.
Their Covid example is a great counterargument to their own point, in that Covid isn't still growing exponentially.
Where the AI skeptics (or even just pragmatists, like myself) chime in is saying “yeah AI will improve. But LLMs are a limited technology that cannot fully bridge the gap between what they’re producing now, and what the “hypists” claim they’ll be able to do in the future.”
People like Sam Altman know ChatGPT is a million miles away from AGI. But their primary goal is to make money. So they have to convince VCs that their technology has a longer period of exponential growth than what it actually will have.
The argument is not that it will keep growing exponentially forever (obviously that is physically impossible), rather that:
- given a sustained history of growth along a very predictable trajectory, the highest likelihood short term scenario is continued growth along the same trajectory. Sample a random point on an s-curve and look slightly to the right, what’s the most common direction the curve continues?
- exponential progress is very hard to visualize and see, it may appear to hardly make any progress while far away from human capabilities, then move from just below to far above human very quickly
My point is that the limits of LLMs will be hit long before they start to take on human capabilities.
The problem isn't that exponential growth is hard to visualise. The problem is that LLMs, as advanced and useful a technique as they are, aren't suited for AGI and thus will never get us even remotely to the stage of AGI.
The human like capabilities are really just smoke and mirrors.
It's like when people anthropomorphise their car: "she's being temperamental today". Except we know the car is not intelligent and it's just a mechanical problem. Whereas it's in the AI tech firms' best interest to upsell the human-like characteristics of LLMs, because that's how they get VC money. And as we know, building and running models isn't cheap.
There is no particular reason why AI has to stick to language models though. Indeed if you want human like thinking you pretty much have to go beyond language as we do other stuff too if you see what I mean. A recent example: "Google DeepMind unveils its first “thinking” robotics AI" https://arstechnica.com/google/2025/09/google-deepmind-unvei...
> There is no particular reason why AI has to stick to language models though.
There’s no reason at all. But that’s not the technology that’s in the consumer space, growing exponentially, gaining all the current hype.
So at this point in time, it’s just a theoretical future that will happen inevitably but we don’t know when. It could be next year. It could be 10 years. It could be 100 years or more.
My prediction is that current AI tech plateaus long before any AGI-capable technology emerges.
That feels like you're moving the goal posts a bit.
Exponential growth over the short term is very uninteresting. Exponential growth is exciting when it can compound.
E.g. if I offered you an investment opportunity at 500% per year compounded daily, that's amazing. If the fine print is that that rate will only last for the very near term (say a week), then it would be worse than a savings account.
Well, growth has been on this exponential already for 5+ years (for the METR eval), and we are at the point where models are very close to matching human expert capabilities in many domains - only one or two more years of growth would put us well beyond that point.
Personally I think we'll see way more growth than that, but to see profound impacts on our economy you only need to believe the much more conservative assumption of a little extra growth along the same trend.
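As a back-of-the-envelope illustration, taking the ~7-month doubling time quoted elsewhere in this thread and assuming (not verifying) a current task horizon of roughly 2 hours:

    # Extrapolation of the METR-style trend under the assumptions above.
    doubling_months = 7
    current_horizon_hours = 2.0

    for years in (1, 2, 3):
        doublings = 12 * years / doubling_months
        horizon = current_horizon_hours * 2 ** doublings
        print(f"after {years} year(s): ~{horizon:.0f} hour task horizon")
    # ~7 h after one year, ~22 h after two, ~71 h after three; but only
    # if the doubling trend actually holds that long.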
> we are at the point where models are very close to matching human expert capabilities in many domains
That's a bold claim. I don't think it matches most people's experiences.
If that was really true people wouldn't be talking about exponential growth. You don't need exponential growth if you are already almost at your destination.
What I’ve seen is that LLMs are very good at simulating an extremely well read junior.
Models know all the tricks but not when to use them.
And because of that, you continually have to hand-hold them.
Working with an LLM is really closer to pair programming than it is handing a piece of work to an expert.
The stuff I’ve seen in computer vision is far more impressive in terms of putting people out of a job. But even there, it’s still highly specific models left to churn away at tasks that are ostensibly just long and laborious tasks. Which so much of VFX is.
> we are at the point where models are very close to matching human expert capabilities in many domains
This is not true because experts in these domains don't make the same routine errors LLMs do. You may point to broad benchmarks to prove your point, but actual experts in the benchmarked fields can point to numerous examples of purportedly "expert" LLMs making things up in a way no expert would ever.
Expertise is supposed to mean something -- it's supposed to describe both a level of competency and trustworthiness. Until they can be trusted, calling LLMs experts in anything degrades the meaning of expertise.
The most common part of the S-curve by far is the flat bit before and the flat bit after. We just don't graph it because it's boring. Besides which there is no reason at all to assume that this process will follow that shape. Seems like guesswork backed up by hand waving.
> Just because something exhibits an exponential growth at one point in time, that doesn’t mean that a particular subject is capable of sustaining exponential growth.
Which is pretty ironic given the title of the post
I am constantly astonished that articles like this even pass the smell test. It is not rational to predict exponential growth just because you've seen exponential growth before! Incidentally, that is not what people did during COVID, they predicted exponential growth for reasons. Specific, articulable reasons, that consisted of more than just "look, like go up. line go up more?".
Incidentally, the benchmarks quoted are extremely dubious. They do not even really make sense. "The length of tasks AI can do is doubling every 7 months". Seriously, what does that mean? If the AI suddenly took double the time to answer the same question, that would not be progress. Indeed, that isn't what they did, they just... picked some times at random? You might counter that these are actually human completion times, but then why are we comparing such distinct and unrelated tasks as "count words in a passage" (trivial, any child can do) and "train adversarially robust image model" (expert-level task, could take anywhere between an hour and never-complete).
Honestly, the most hilarious line in the article is probably this one:
> You might object that this plot looks like it might be levelling off, but this is probably mostly an artefact of GPT-5 being very consumer-focused.
This is a plot with three points in it! You might as well be looking at tea leaves!
> but then why are we comparing such distinct and unrelated tasks as ...
Because a few years ago the LLMs could only do trivial tasks that a child could do, and now they're able to do complex research and software development tasks.
If you just have the trivial tasks, the benchmark is saturated within a year. If you just have the very complex tasks, the benchmark has no sensitivity at all for years (just everything scoring a 0) and then abruptly becomes useful for a brief moment.
This seems pretty obvious, and I can't figure out what your actual concern is. You're just implying it is a flawed design without pointing out anything concrete.
The key word is "unrelated"! Being able to count the number of words in a paragraph and being able to train an image classifier are so different as to be unrelated for all practical purposes. The assumption underlying this kind of a "benchmark" is that all tasks have a certain attribute called complexity which is a numerical value we can use to discriminate tasks, presumably so that if you can complete tasks up to a certain "complexity" then you can complete all other tasks of lower complexity. No such attribute exists! I am sure there are "4 hour" tasks an LLM can do and "5 second" tasks that no LLM can do.
The underlying frustration here is that there is so much latitude possible in choosing which tasks to test, which ones to present, and how to quantify "success" that the metrics given are completely meaningless, and do not help anyone to make a prediction. I would bet my entire life savings that by the time the hype bubble bursts, we will still have 10 brainless articles per day coming out saying AGI is round the corner.
"It’s Difficult to Make Predictions, Especially About the Future" - Yogi Berra. It's funny because it's true.
So if you want to try to do this difficult task, because say there's billions of dollars and millions of people's livelihoods on the line, how do you do it? Gather a bunch of data, and see if there's some trend? Then maybe it makes sense to extrapolate. Seems pretty reasonable to me. Definitely passes the sniff test. Not sure why you think "line go up more" is such a stupid concept.
As they say, every exponential is a sigmoid in disguise. I think the exponential phase of growth for LLM architectures is drawing to a close, and fundamentally new architectures will be necessary for meaningful advances.
I'm also not convinced by the graphs in this article. OpenAI is notoriously deceptive with their graphs, and as Gary Marcus has already noted, that METR study comes with a lot of caveats: [https://garymarcus.substack.com/p/the-latest-ai-scaling-grap...]
>People notice that while AI can now write programs, design websites, etc, it still often makes mistakes or goes in a wrong direction, and then they somehow jump to the conclusion that AI will never be able to do these tasks at human levels, or will only have a minor impact. When just a few years ago, having AI do these things was complete science fiction!
Both things can be true, since they're orthogonal.
Having AI do these things was complete fiction 10 years ago. And after 5 years of LLM AI, people do start to see serious limits and stunted growth with the current LLM approaches, while also seeing that nobody has proposed another serious contender to that approach.
Similarly, going to the moon was science fiction 100 years ago. And yet, we're now not only not on Mars, but 50+ years without a new manned moon landing. Same for airplanes. Science fiction in 1900. Mostly stale innovation-wise for the last 30 years.
A lot of curves can fit an exponential line plot, without the progress going forward being exponential.
We would have 1-trillion-transistor CPUs by now if Moore's "exponential curve" had held.
I agree with all your points, just wanted to say that transistor count is probably a counter-example. We have been keeping up with Moore's Law more or less[1], and the M3 Max, a 2023 consumer-grade CPU, has ~100B transistors, "just" one order of magnitude away from your 1T. I think that shows we haven't stagnated much in transistor density and the progress is just staggering!
That one order of magnitude is about 7 years behind Moore's Law. We're still progressing, but it's slower, more expensive and we hit way more walls than before.
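The arithmetic behind "about 7 years", assuming the classic pace of one doubling every ~2 years:

    import math

    # Closing one order of magnitude (100B -> 1T transistors) at one
    # doubling every ~2 years.
    doublings_needed = math.log2(1e12 / 1e11)     # ~3.32
    years_behind = doublings_needed * 2           # ~6.6
    print(f"{doublings_needed:.2f} doublings, ~{years_behind:.1f} years at one doubling per 2 years")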
Except it’s not been five years, it’s been at most three, since approximately no one was using LLMs prior to ChatGPT’s release, which was just under three years ago. We did have Copilot a year before that, but it was quite rudimentary.
And really, we’ve had even less than that. The first large scale reasoning model was o1, which was released 12 months ago. More useful coding agents are even newer than that. This narrative that we’ve been using these tools for many years and are now hitting a wall doesn’t match my experience at all. AI-assisted coding is way better than it was a year ago, let alone five.
>Except it’s not been five years, it’s been at most three,
Why would it be "at most" 3? We had Chat GPT commercially available as private beta API on 2020. It's only the mass public that got 3.5 3 years ago.
But those who'd do the noticing as per my argument is not just Joe Public (which could be oblivious), but people already starting in 2020, and includes people working in the space, who worked with LLM and LLM-like architectures 2-3 years before 2020.
It should be noted that the article author is an AI researcher at Anthropic and therefore benefits financially from the bubble: https://www.julian.ac/about/
> The current discourse around AI progress and a supposed “bubble” reminds me a lot of the early weeks of the Covid-19 pandemic. Long after the timing and scale of the coming global pandemic was obvious from extrapolating the exponential trends, politicians, journalists and most public commentators kept treating it as a remote possibility or a localized phenomenon.
That's not what I remember. On the contrary, I remember widespread panic. (For some reason, people thought the world was going to run out of toilet paper, which became a self-fulfilling prophecy.) Of course some people were in denial, especially some politicians, though that had everything to do with politics and nothing to do with math and science.
In any case, the public spread of infectious diseases is a relatively well understood phenomenon. I don't see the analogy with some new tech, although the public spread of hype is also a relatively well understood phenomenon.
Exponential curves don't last for long fortunately, or the universe would have turned into a quark soup. The example of COVID is especially ironic, considering it stopped being a real concern within 3 years of its advent despite the exponential growth in the early years.
Those who understand exponentials should also try to understand stock and flow.
Reminds me a bit of the "ultraviolet catastrophe".
> The ultraviolet catastrophe, also called the Rayleigh–Jeans catastrophe, was the prediction of late 19th century and early 20th century classical physics that an ideal black body at thermal equilibrium would emit an unbounded quantity of energy as wavelength decreased into the ultraviolet range.
[...]
> The phrase refers to the fact that the empirically derived Rayleigh–Jeans law, which accurately predicted experimental results at large wavelengths, failed to do so for short wavelengths.
Right. Nobody believed that the intensity would go to infinity. What they believed was that the theory was incomplete, but they didn't know how or why. And the solution required inventing a completely new theory.
Exponentials exist in their environment. Didn't Covid stop because we ran out of people to infect? Of course it can't keep going exponentially, because there isn't an exponential supply of people to infect.
What is the limit on AI? Technology, energy, something else? All these things can be overcome to keep the exponential going.
And of course, systems also break under exponential growth. Maybe AI is stopped by the world economy collapsing. AI advancement would halt, but that is cold comfort to the humans.
Data. Think of our LLMs like bacteria in a Petri dish. When first introduced, they achieve exponential growth by rapidly consuming the dish's growth medium. Once the medium is consumed, growth slows and then stops.
The corpus of information on the Internet, produced over several decades, is the LLM's growth medium. And we're not producing new growth medium at an exponential rate.
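A minimal logistic-growth sketch of that Petri-dish picture (K, r and the starting population below are arbitrary illustrative values, not estimates of anything real):

    # Logistic growth: exponential at first, flattening as the "growth
    # medium" (carrying capacity K) is consumed.
    K = 1_000_000     # carrying capacity: total growth medium available
    r = 0.5           # per-step growth rate
    n = 10.0          # initial population

    for step in range(40):
        n += r * n * (1 - n / K)     # early on: ~exponential; later: saturates near K
        if step % 5 == 0:
            print(f"step {step:2d}: population ~{n:,.0f}")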
> What is this limit on AI? It is technology, energy, something. All these things can be over-come, to keep the exponential going.
That's kind of begging the question. Obviously if all the limitations on AI can be overcome growth would be exponential. Even the biggest ai skeptic would agree. The question is, will it?
It's possible to understand both exponential and limiting behavior at the same time. I work in an office full of scientists. Our team scrammed the workplace on March 10, 2020.
To the scientists, it was intuitively obvious that the curve could not surpass 100% of the population. An exponential curve with no turning point is almost always seen as a sure sign that something is wrong with your model. But we didn't have a clue as to the actual limit, and any putative limit below 100% would need a justification, which we didn't have, or some dramatic change to the fundamental conditions, which we couldn't guess.
The typical practice is to watch the curve for any sign of a departure from exponential behavior, and then say: "I told you so." ;-)
The first change may have been social isolation. In fact that was pretty much the only arrow in our quivers. The second change was the vaccine, which changed both the infection rate and the mortality rate, dramatically.
I'm curious as to whether the consensus is that the observed behaviour of COVID waves was ever fully and satisfactorily explained - they tended to grow exponentially but then seemingly saturated at a much lower point than a naïve look at the curve might suggest.
It would probably be hard to do. The really huge factor may be easier to study, since we know where and when every vaccine dose was administered. The behavioral factors are likely to be harder to measure, and would have been masked by the larger effect of vaccination. We don't really know the extent of social isolation over geography, demographics, time, etc..
There are human behavioural factors, yes, but I was kinda wondering about the virus itself. The R number seemed to fluctuate quite a bit, with waves peaking fast and early and then receding equally quickly. I know there were some ideas around asymptomatic spread and superspreaders (both people with highly connected social graphs, and people shedding far more active virus than the median); I just wondered whether anyone had built a model that was considered to accurately reproduce the observed behaviour of positive tests and symptomatic cases, and the way waves would seemingly saturate after infecting a few % of the population.
Long COVID is still a thing, the nAbs immunity is pretty paltry because the virus keeps changing its immunity profile so much. T-cells help but also damage the host because of how COVID overstimulates them. A big reason people aren't dying like they used to is because of the government's strategy of constant infection which boosts immunity regularly* while damaging people each time, that plus how Omicron changed SARS-CoV-2's cell entry mechanism to avoid cell-cell fusion (syncytia) that caused huge over-reaction in lung tissue.
> By the end of 2027, models will frequently outperform experts on many tasks.
In passing the quiz-es
> Models will be able to autonomously work for full days (8 working hours) by mid-2026.
Who will carry responsibility for the consequences of these models' errors? What tools will be available to that responsible _person_?
--
Techno optimists will be optimistic. Techno pessimists will be pessimistic.
The processes we're discussing have their own limiting factors, which no one mentions. Why mention what exactly makes the graph go up, or what holds it back from going exponential? Why mention or discuss the inherent limitations of the LLM architecture? Or the legal perspective on AI agency?
Thus we're discussing the results of AI models passing tests, and people's perceptions of other people's opinions.
You don't actually need to have a "responsible person"; you can just have an AI do stuff. It might make a mistake; the only difference between that and an employee is that you can't punish an AI. If you're any good at management and not a psychopath, the ability to have someone to punish for mistakes isn't actually important
The importance of having a human be responsible is about alignment. We have a fundamental belief that human beings are comprehensible and have goals that are not completely opaque. That is not true of any piece of software. In the case of deterministic software, you can’t argue with a bug. It doesn’t matter how many times you tell it that no, that’s not what either the company or the user intended, the result will be the same.
With an AI, the problem is more subtle. The AI may absolutely be able to understand what you’re saying, and may not care at all, because its goals are not your goals, and you can’t tell what its goals are. Having a human be responsible bypasses that. The point is not to punish the AI, the point is to have a hope to stop it from doing things that are harmful.
I will worry when I see Startups competing on products with companies 10x, 100x, or 1000x times their size. Like a small team producing a Photoshop replacement. So far I haven't seen anything like that. Big companies don't seem to be launching new products faster either, or fixing some of their products that have been broken for a long time (MS teams...)
AI obviously makes some easy things much faster, maybe helps with boilerplate, we still have to see this translate into real productivity.
I think the real turning point is when there isn’t the need for something like photoshop. Creatives that I speak to yearn for the day when they can stop paying the adobe tax.
It's interesting that he brings up the example of "exponential" growth in the case of COVID infections even though it was actually logistic growth[1] that saturates once resources get exhausted. What makes AI different?
You'd think that boosters for a technology whose very foundations rely on the sigmoid and tanh functions used as neuron activation functions would intuitively get this...
When people want a smooth function so they can do calculus they often use something like gelu or the swish function rather than relu. And the swish function involves a sigmoid. https://en.wikipedia.org/wiki/Swish_function
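For reference, a quick side-by-side sketch of those activations (standard definitions; the GELU here is the common tanh approximation):

    import math

    def sigmoid(x: float) -> float:
        return 1.0 / (1.0 + math.exp(-x))

    def relu(x: float) -> float:
        return max(0.0, x)

    def swish(x: float, beta: float = 1.0) -> float:
        # swish(x) = x * sigmoid(beta * x): smooth everywhere, unlike relu
        return x * sigmoid(beta * x)

    def gelu(x: float) -> float:
        # common tanh approximation of GELU
        return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  relu={relu(x):+.3f}  swish={swish(x):+.3f}  gelu={gelu(x):+.3f}")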
This reminds me -- very tenuously -- of how the shorthand for very good performance in the Python community is "like C". In the C community, we know that programs have different performance depending on algorithms chosen..
> In the C community, we know that programs have different performance depending on algorithms chosen..
Yes. Only the C community knows this. What a silly remark.
Regarding the "Python community" remark, benchmarks against C and Fortran go back decades now. It's not just a Python thing. C people push it a lot, too.
Nah, that part is OK. The human level, wherever you set it, barely moves - human competence takes decades to really change, while these systems show visible changes every year or so.
The problem with all of the article's metrics is that they are all absolute bullshit. It just throws in claims like "AI can write full programs by itself 50% of the time" and moves on, as if that had any resemblance to what happens in the real world.
"Models will be able to autonomously work for full days (8 working hours)" does not make them equivalent to a human employee. My employees go home and come back retaining context from the previous day; they get smarter every month. With Claude Code I have to reset the context between bite-sized tasks.
To replace humans in my workplace, LLMs need some equivalent of neuroplasticity. Maybe it's possible, but it would require some sort of shift in the approach that may or may not be coming.
Maybe when we get updating models. Right now, they are trained, and released, and we are using that static model with a context window. At some point when we have enough processing to have models that are always updating, then that would be plastic. I'm supposing.
> When just a few years ago, having AI do these things was complete science fiction!
This is only because these projects only became consumer facing fairly recently. There was a lot of incremental progress in the academic language model space leading up to this. It wasn't as sudden as this makes it sound.
The deeper issue is that this future-looking analysis goes no deeper than drawing a line connecting a few points. COVID is a really interesting comparison, because in epidemiology the exponential model comes from us understanding disease transmission. It is also not actually exponential, as the population becomes saturated the transmission rate slows (it is worth noting that unbounded exponential growth doesn't really seem to exist in nature). Drawing an exponential line like this doesn't really add anything interesting. When you do a regression you need to pick the model that best represents your system.
This is made even worse because this uses benchmarks and coming up with good benchmarks is actually an important part of the AI problem. AI is really good at improving things we can measure so it makes total sense that it will crush any benchmark we throw at it eventually, but there will always be some difference between benchmarks and reality. I would argue that as you are trying to benchmark more subtle things it becomes much harder to make a benchmark. This is just a conjecture on my end but if something like this is possible it means you need to rule it out when modeling AI progress.
There are also economic incentives to always declare percent increases in progress at a regular schedule.
Will AI ever get this advanced? Maybe, maybe even as fast as the author says, but this just isn't a compelling case for it.
Julian Schrittwieser (author of this post) has been in AI for a long time, he was in the core team who worked on AlphaGo, AlphaZero and MuZero at DeepMind, you can see him in the AlphaGo movie. While it doesn't make his opinion automatically true, I think it makes it worth considering, especially since he's a technical person, not a CEO trying to raise money
"extrapolating an exponential" seems dubious, but I think the point is more that there is no clear sign of slowing down in models capabilities from the benchmarks, so we can still expect improvements
Benchmarks are notoriously easy to fake. Also he doesn’t need to be a CEO trying to raise money in order to have an incentive here to push this agenda / narrative. He has a huge stock grant from Anthropic that will go to $0 when the bubble pops
I think the author of this blog is not a heavy user of AI in real life. If you are, you know there are things AI is very good at, and things AI is bad at. AI may see exponential improvements in some aspects, but not in others. In the end, those "laggard" aspects of AI will put a ceiling on its real-world performance.
I use AI in my coding for many hours each day. AI is great. But AI will not replace me in 2026 or in 2027. I have to admit I can't make projections many years in the future, because the pace of progress in AI is indeed breathtaking. But, while I am really bullish on AI, I am skeptical of claims that AI will be able to fully replace a human any time soon.
I am an amateur programmer and tried to port a python 2.7 library to python 3 with GPT5 a few weeks ago.
After a few tries, I realized both myself and the model missed that a large part of the library is based on another library that was never ported to 3 either.
That doesn't stop GPT5 from trying to write the code as best it can with a library that doesn't exist for python 3.
That is the part we have made absolutely no progress on.
Of course, it can do a much better react crud app than in Sept 2023.
In one sense, LLMs are so amazing and impressive and quite fugazi in another sense.
The sentiment of the comments here seems rather pessimistic. A perspective that balances both sides might be that the rate of mass adoption of a technology often lags behind the frontier capabilities, so I wouldn't expect AI to take over a majority of the jobs in GDPval within a couple of years, but it'll probably happen eventually.
There are still fundamental limitations in both the model and products using the model that restrict what AI is capable of, so it’s simultaneously true that AI can do cutting edge work in certain domains for hours while vastly underperforming in other domains for very small tasks. The trajectory of improvement of AI capabilities is also an unknown, where it’s easy to overestimate exponential trends due to unexpected issues arising but also easy to underestimate future innovations.
I don’t see the trajectory slowing down just yet with more compute and larger models being used, and I can imagine AI agents will increasingly give their data to further improve larger models.
Aside from the S-versus-exp issue, this area is one of these things where there's a kind of disconnect between my personal professional experience with LLMs and the criteria measures he's talking about. LLMs to me have this kind of superficially impressive feel where it seems impressive in its capabilities, but where, when it fails, it fails dramatically, in a way humans never would, and it never gets anywhere near what's necessary to actually be helpful on finishing tasks, beyond being some kind of gestalt template or prototype.
I feel as if there needs to be a lot more scrutiny on the types of evaluation tasks being provided — whether they are actually representative of real-world demands, or if they are making them easy to look good, and also more focus on the types of failures. Looking through some of the evaluation tasks he links to I'm more familiar with, they seem kind of basic? So not achieving parity with human performance is more significant than it seems. I also wonder, in some kind of maxmin sense, whether we need to start focusing more on worst-case failure performance rather than best-case goal performance.
LLMs are really amazing in some sense, and maybe this essay makes some points that are important to keep in mind as possibilities, but my general impression after reading it is it's kind of missing the core substance of AI bubble claims at the moment.
A lot of this post relies on the recent OpenAI result they call GDPval (link below). They note some limitations (lack of iteration in the tasks, among others) which are key complaints and possibly fundamental limitations of current models.
But more interesting is the 50% win rate stat that represents expert human performance in the paper.
That seems absurdly low; most employees don't have a 50% success rate on self-contained tasks that take ~1 day of work. That means at least one of a few things could be true:
1. The tasks aren’t defined in a way that makes real world sense
2. The tasks require iteration, which wasn’t tested, for real world success (as many tasks do)
I think while interesting and a very worthy research avenue, this paper is only the first in a still early area of understanding how AI will affect with the real world, and it’s hard to project well from this one paper.
That's not 50% success rate at completing the task, that's the win rate of a head-to-head comparison of an algorithm and an expert. 50% means the expert and the algorithm each "win" half the time.
For the METR rating (first half of the article), it is indeed 50% success rate at completing the task. The win rate only applies to the GDPval rating (second half of the article).
To the people who claim that we’re running out of data, I would just say: the world is largely undigitized. The Internet digitized a bunch of words but not even a tiny fraction of all that humans express every day. Same goes for sound in general. CCTV captures a lot of images, far more than social media, but it is poorly processed and also just a fraction of the photons bouncing off objects on earth. The data part of this equation has room to grow.
The 50% success rate is the problem. It means you can’t reliably automate tasks unattended. That seems to be where it becomes non-exponential. It’s like having cars that go twice as far as the last year but will only get you to your destination 50% of the time.
> It’s like having cars that go twice as far as the last year but will only get you to your destination 50% of the time
Nice analogy. All human progress is based on tight abstractions describing a well-defined machine model. Leaky abstractions over an undefined machine are useful too, but only as recommendations or for communication; it is harder to build on top of them. Precisely why programming in English is a non-starter - or why math/science uses formalism instead of just English.
> Given consistent trends of exponential performance improvements over many years and across many industries, it would be extremely surprising if these improvements suddenly stopped.
The difference between exponential and sigmoid is often a surprise to the believers, indeed.
>they somehow jump to the conclusion that AI will never be able to do these tasks at human level
I don’t see that, I mostly see AI criticism that it’s not up to the hype, today. I think most people know it will approach human ability, we just don’t believe the hype that it will be here tomorrow.
I’ve lived through enough AI winter in the past to know that the problem is hard, progress is real and steady, but we could see a big contraction in AI spending in a few years if the bets don’t pay off well in the near term.
The money going into AI right now is huge, but it carries real risks because people want returns on that investment soon, not down the road eventually.
> Instead, even a relatively conservative extrapolation of these trends suggests that 2026 will be a pivotal year for the widespread integration of AI into the economy
Integration into the economy takes time and investment. Unfortunately, AI applications don't have an easy adoption curve - except for the chatbot. Every other use case requires an expensive and risky integration into an existing workflow.
> By the end of 2027, models will frequently outperform experts on many tasks
fixed tasks like tests - maybe. But, the real world is not a fixed model. It requires constant learning through feedback.
Somewhat missed by the many comments proclaiming that it's sigmoidal is that sigmoid curves exhibit significant growth after they stop looking exponential. Unless you think things have already hit a dramatic wall, you should probably assume further growth.
We should probably expect compute to get cheaper at the same time, so that's performance increasing while costs fall. Even after performance flatlines, you would expect inference costs to keep dropping.
Without specific evidence it’s also unlikely you randomly pick the point on a sigmoid where things change.
There’s no exponential improvement in go or chess agents, or car driving agents. Even tiny mouse racing.
If there is, it would be such nice low hanging fruit.
Maybe all of that happens all at once.
I’d just be honest and say most of it is completely fuzzy tinkering disguised as intellectual activity (yes, some of it is actual intellectual activity and yes we should continue tinkering)
There are rare individuals that spent decades building up good intuition and even that does not help much.
> Instead, even a relatively conservative extrapolation of these trends suggests that 2026 will be a pivotal year for the widespread integration of AI into the economy:
> Models will be able to autonomously work for full days (8 working hours) by mid-2026.
> At least one model will match the performance of human experts across many industries before the end of 2026.
> By the end of 2027, models will frequently outperform experts on many tasks.
First commandment of tech hype: the pivotal, groundbreaking singularity is always just 1-2 years away.
I mean seriously, why is that? Even when people like OP try to be principled and use seemingly objective evaluation data, they find that the BIG big thing is 1-2 years away.
Self driving cars? 1-2 years away.
AR glasses replacing phones? 1-2 years away.
All of us living our life in the metaverse? 1-2 years away.
Again, I have to commend OP on putting in the work with the serious graphs, but there’s something more at play here.
Is it purely a matter of data cherry-picking? Is it unknown unknowns leaving the data-driven approaches completely blind to their medium/long-term limitations?
Many people seem to assert that "constant relative growth in capabilities/sales/whatever" is a totally reasonable (or even obvious or inevitable) prior assumption, and then point to "OMG relative growth produces an exponential curve!" as the rest of their argument. And at least the AI 2027 people tried to one-up that by asserting an increasing relative growth rate to produce a superexponential curve.
I'd be a fool to say that we'll ever hit a hard plateau in AI capabilities, but I'll have a hard time believing any projected exponential-growth-to-infinity until I see it with my own eyes.
Self-driving cars have existed for at least a year now. It only took a decade of "1 year away", but it exists now, and will likely require another decade of scaling up the hardware.
I think AGI is going to follow a similar trend: a decade of being "1 year away". Meanwhile, unlike self-driving, the industry is preemptively solving the scaling-up of hardware concurrently.
Because I need to specify an amount of time short enough that big investors will hand over a lot of money, long enough that I can extract a big chunk of it for myself before it all comes crashing down.
A couple of years is probably a bit tight, really, but I'm competing for that cash with other people so the timeframe we make up is going to about the lowest we think we can get away with.
I feel like there should be some take away from the fact that we have to come up with new and interesting metrics like “Length of a Task That Can Be Automated” in order to point out that exponential growth is still happening. Fwiw, it does seem like a good metric, but it also feels like you can often find some metric that’s improving exponentially even when the base function is leveling out.
It's the only benchmark I know of with a well-behaved scale. Benchmarks with for example a score from 0-100% get saturated quite quickly, and further improvements on the metric are literally impossible. And even excluding saturation, they just behave very oddly at the extremes. To use them to show long term exponential growth you need to chain together benchmarks, which is hard to make look credible.
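A sketch of why an unbounded scale is convenient: you can estimate a doubling time directly with an ordinary least-squares fit on log2 of the task horizon. The data points below are made-up placeholders, not METR's actual numbers:

    import math

    def doubling_time_years(points):
        """Least-squares fit of log2(horizon) against time; returns the
        implied doubling time in years. points: (year, horizon_minutes)."""
        xs = [t for t, _ in points]
        ys = [math.log2(h) for _, h in points]
        n = len(points)
        mx, my = sum(xs) / n, sum(ys) / n
        slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
                 / sum((x - mx) ** 2 for x in xs))
        return 1.0 / slope

    # Made-up illustrative points, NOT real benchmark data:
    fake = [(2020.0, 1.0), (2021.0, 3.0), (2022.0, 8.0), (2023.0, 25.0), (2024.0, 70.0)]
    print(f"~{doubling_time_years(fake):.2f} years per doubling")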
This doesn't feel at all credible because we're already well into the sigmoid part of the curve. I thought the gpt5 thing made it pretty obvious to everyone.
I'm bullish on AI, I don't think we've even begun to understand the product implications, but the "large language models are in context learners" phase has for now basically played out.
This extrapolates based on a good set of data points to predict when AI will reach significant milestones like being able to “work on tasks for a full 8 hours” (estimates by 2026). Which is ok - but it bears keeping https://xkcd.com/605/ in mind when doing extrapolation.
117 comments so far, and the word economics does not appear.
Any technology which produces more results for more inputs but does not get more efficient at larger scale runs into a money problem if it does not get hit by a physics problem first.
It is quite possible that we have already hit the money problem.
So the author is in a clear conflict of interest with the contents of the blog, because he's an employee of Anthropic. But regarding this "blog": showing the graph where OpenAI compares "frontier" models and pits gpt-4o against o3-high is just disingenuous; o1 vs o3 would have been a closer fight between "frontier" models. Also, today I learned that there are people paid to benchmark AI models in terms of how close they are to "human" level, apparently even "expert" level, whatever that means. I'm not an LLM hater by any means, but I can confidently say that they aren't experts in any field.
Even if the computational power evolve exponentially, we need to evaluate the utility of additional computations. And if the utility happens to increase logarithmically with computation spend, it's possible that in the end, we will observe just a linear increase in utility.
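Spelled out as a one-line sketch (k and r are arbitrary constants):

    U(t) = k \log C(t), \qquad C(t) = C_0 e^{rt}
    \quad\Longrightarrow\quad U(t) = k \log C_0 + k r t

i.e. exponentially growing compute combined with logarithmic utility gives utility that grows only linearly in time.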
AI company employee whose livelihood depends on people continuing to pump money into AI writes a blog post trying to convince people to keep pumping more money into AI. Seems solid.
The "exponential" metric/study they include is pretty atrocious. Measuring AI capability by how long humans would take to do the task. By that definition existing computers are already super AGI - how long would it take humans to sort a list of a million numbers? Computers can do it in a fraction of a second. I guess that proves they're already AGI, right? You could probably fit an exponential curve to that as well, before LLMs even existed.
I don't think I have ever seen a page on HN where so many people missed the main point.
The phenomenon of people having trouble understanding the implications of exponential progress is really well known. Well known, I think, by many people here.
And yet an alarming number of comments here interpret small pauses as serious trend breakers. False assumptions that we are anywhere near the limits of computing power relative to fundamental physics limits. Etc.
Recent progress, which is unprecedented in speed looking backward, is dismissed because people have acclimatized to change so quickly.
The title of the article "Failing to Understand the Exponential, Again" is far more apt than I could have imagined, on HN.
See my other comments here for specific arguments. See lots of comments here for examples of those who are skeptical of a strong inevitability here.
The "information revolution" started the first time design information was separated from the thing it could construct. I.e. the first DNA or perhaps RNA life. And it has unrelentingly accelerated from there for over 4.5 billion years.
The known physics limits of computation per gram are astronomical. We are nowhere near any hard limit. And that is before any speculation of what could be done with the components of spacetime fragments we don't understand yet. Or physics beyond that.
The information revolution has hardly begun.
With all humor, this was the last place I expected people to not understand how different information technology progresses vs. any other kind. Or to revert to linear based arguments, in an exponentially relevant situation.
If there is any S-curve for information technology in general, it won't be apparent until long after humans are a distant memory.
I'm a little surprised too. A lot of the arguments are along the lines of but LLMs aren't very good. But really LLMs are a brief phase in the information revolution you mention that will be superseded.
To me saying we won't get AGI because LLMs aren't suitable is like saying we were not going to get powered flight because steam engines weren't suitable. Fair enough they weren't but they got modified into internal combustion engines and then were. Something like that will happen.
I think the first comment on the article put it best: With COVID, researchers could be certain that exponential growth was taking place because they knew the underlying mechanisms of the growth. The virus was self-replicating, so the more people were already infected, the faster would new infections happen.
(Even this dynamic would only go on for a certain time and eventual slow down, forming an S-curve, when the virus could not find any more vulnerable persons to continue the rate of spread. The critical question was of course if this would happen because everyone was vaccinated or isolated enough to prevent infection - or because everyone was already infected or dead)
With AI, there is no such underlying mechanism. There is the dream of the "self-improving AI" where either humans can make use of the current-generation AI to develop the next-generation AI in a fraction of the time - or where the AI simply creates the next generation on its own.
If this dream were reality, it could be genuine exponential growth, but from all I know, it isn't. Coding agents speed up a number of bespoke programming tasks, but they do not exponentially speed up development of new AI models. Yes, we can now quickly generate large corpora of synthetic training data and use them for distillation. We couldn't do that before - but a large part of the training data discussion is about the observation that synthetic data can not replace real data, so data collection remains a bottleneck.
There is one point where a feedback loop does happen, and that is with the hype curve: initial models produced extremely impressive results compared to everything we had before - this caused enormous hype and unlocked investments that allowed more resources for the development of the next model - which then delivered even better results. But it's obvious that this kind of feedback loop will eventually end when no more additional capital is available and diminishing returns set in.
Then we will once again be in the upper part of the S-curve.
Another 'number go up' analyst. Yes, models are objectively better at tasks. Please include the fact that hundreds of billions of dollars are being poured into making them better. You could even call it a technology race. Once the money avalanche runs its course, I and many others expect 'the exponential' to be followed by an implosion or correction in growth. Data and training is not what LLMs crave. Piles of cash is what LLMs crave.
IMO this approach ultimately asks the wrong question. Every exponential trend in history has eventually flattened out. Every. single. one. Two rabbits would create a population with a mass greater than the Earth in a couple of years if that trend continues indefinitely. The left hand side of a sigmoid curve looks exactly like exponential growth to the naked eye... until it nears the inflection point at t=0. The two curves can't be distinguished when you only have noisy data from t<0.
A better question is, "When will the curve flatten out?" and that can only be addressed by looking outside the dataset for which constraints will eventually make growth impossible. For example, for Moore's law, we could examine as the quantum limits on how small a single transistor can be. You have to analyze the context, not just do the line fitting exercise.
The only really interesting question in the long term is if it will level off at a level near, below, or above human intelligence. It doesn't matter much if that takes five years or fifty. Simply looking at lines that are currently going up and extending them off the right side of the page doesn't really get us any closer to answering that. We have to look at the fundamental constraints of our understanding and algorithms, independent of hardware. For example, hallucinations may be unsolvable with the current approach and require a genuine paradigm shift to solve, and paradigm shifts don't show up on trend lines, more or less by definition.
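To illustrate the point about noisy data from t<0: fit a pure exponential to synthetic samples from the early part of a logistic curve (all the parameters below are arbitrary) and the misfit vanishes into the noise:

    import math, random
    random.seed(0)

    # Purely synthetic data: the early part of a logistic curve (well before
    # its inflection at t = 0), with +/-5% multiplicative noise.
    # K and r are arbitrary illustrative constants.
    K, r = 1000.0, 0.8
    ts = [t / 2 for t in range(-30, -9)]     # t = -15.0 ... -5.0
    ys = [K / (1 + math.exp(-r * t)) * random.uniform(0.95, 1.05) for t in ts]

    # Fit a pure exponential y = a * exp(b*t) by least squares on log(y).
    logs = [math.log(y) for y in ys]
    n = len(ts)
    mt, ml = sum(ts) / n, sum(logs) / n
    b = (sum((t - mt) * (l - ml) for t, l in zip(ts, logs))
         / sum((t - mt) ** 2 for t in ts))
    a = math.exp(ml - b * mt)

    worst = max(abs(a * math.exp(b * t) - y) / y for t, y in zip(ts, ys))
    print(f"fitted exponential rate b = {b:.3f} (true logistic r = {r})")
    print(f"worst relative misfit: {worst:.1%}")
    # The residuals are dominated by the injected noise: from pre-inflection
    # data alone, the exponential and the logistic are effectively
    # indistinguishable.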
> - Models will be able to autonomously work for full days (8 working hours) by mid-2026.
> - At least one model will match the performance of human experts across many industries before the end of 2026.
> - By the end of 2027, models will frequently outperform experts on many tasks.
I’ve seen a lot of people make predictions like this and it will be interesting to see how this turns out. But my question is, what should happen to a person’s credibility if their prediction turns out to be wrong? Should the person lose credibility for future predictions and we no longer take them seriously? Or is that way too harsh? Should there be reputational consequences for making bad predictions? I guess this more of a general question, not strictly AI-related.
> Should the person lose credibility for future predictions and we no longer take them seriously
If this were the case, almost every sell-side analyst would have been blacklisted by now. It's more about entertainment than facts - sort of like astrology.
Seems like the right place to ask, with ML enthusiasts gathered in one place discussing curves and the things that bend them: what's the thing with the potential to obsolete transformers and diffusion models? Is it something old that people noticed once LLMs blew up? Something new? Something in between?
> The evaluation tasks are sourced from experienced industry professionals (avg. 14 years' experience), 30 tasks per occupation for a total of 1320 tasks. Grading is performed by blinded comparison of human and model-generated solutions, allowing for both clear preferences and ties.
It's important to carefully scrutinize the tasks to understand whether they actually reflect work that is unique to industry professionals. I just looked quickly at the nursing ones (my wife is a nurse), and half of them were creating presentations, drafting reports, and the like, which is the primary strength of LLMs but a very small portion of nursing duties.
The computer programming tests are more straightforward. I'd take the other ones with a grain of salt for now.
Many of the "people don't understand exponential functions" posts are ultimately about people not understanding logistic functions. Most things in reality that seemingly grow exponentially will eventually, inevitably taper off, because at some point the cost of continued growth gets so high that accelerated growth can't be supported anymore.
Viruses can only infect so many people, for example. For the growth to be truly exponential you would need an infinite supply of people to infect.
> Again we can observe a similar trend, with the latest GPT-5 already astonishingly close to human performance:
Yes but only if you measure "performance" as "better than the other option more than 50% of the time" which is a terrible way to measure performance, especially for bullshitting AI.
Imagine comparing chocolate brands. One is tastier than the other one 60% of the time. Clear winner right? Yeah except it's also deadly poisonous 5% of the time. Still tastier on average though!
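A toy expected-value version of that chocolate example (all the probabilities and utilities are made-up numbers, just to show how a pairwise win rate can hide tail risk):

    # Invented numbers: brand A is tastier 60% of the time, but is
    # catastrophically bad 5% of the time; brand B is never catastrophic.
    p_win, u_win, u_lose = 0.60, 1.0, 0.0
    p_poison, u_poison = 0.05, -1000.0

    ev_a = (1 - p_poison) * (p_win * u_win + (1 - p_win) * u_lose) + p_poison * u_poison
    ev_b = (1 - p_win) * u_win + p_win * u_lose   # B wins the remaining 40% of tastings

    print(f"pairwise win rate favours A, yet EV(A) = {ev_a:.2f} vs EV(B) = {ev_b:.2f}")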
This guy isn't even wrong. Sure, these models are getting faster, but they are barely getting better at actual reasoning, if at all. Who cares if a model can give me a bullshit answer in five minutes instead of ten? It's still bullshit.
Ah, employee of an AI company is telling us the technology he's working on and is directly financially interested in hyping will... grow forever and be amazing and exponential and take over the world. And everyone who doesn't believe this employee of AI company hyping AI is WRONG about basics of math.
I absolutely would NOT ever expect such a blog post.
That is a complete strawman - you made up forever growth and then argued against it. The OP is saying the in the short term, it makes more sense to assume exponential growth continues instead of thinking it will flatten out any moment now.
That doesn't mean it won't have a large impact; as an autocomplete these things can be quite useful today. But when we have a more honest look at what it can do now, it's less obvious that we'll hit some kind of singularity before hitting a constraint.
I think the technological singularity has generally been a bit of a metaphor rather than a mathematical singularity.
You clearly haven’t played my idle game.
Indeed you can't be sure. But on the other hand a bunch of the commentariat has been claiming (with no evidence) that we're at the midpoint of the sigmoid for the last three years. They were wrong. And then you had the AI frontier lab insiders who predicted an accelerating pace of progress for the last three years. They were right. Now, the frontier labs rarely (never?) provide evidence either, but they do have about a year of visibility into the pipeline, unlike anyone outside.
So at least my heuristic is to wait until a frontier lab starts warning about diminishing returns and slowdowns before calling the midpoint or multiple labs start winding down capex. The first component might have misaligned incentives, but if we're in a realistic danger of hitting a wall in the next year, the capex spending would not be accelerating the way it is.
Capex requirements might be on a different curve than model improvements.
E.g. you might need to accelerate spending to get sub-linear growth in model output.
If valuations depend on hitting the curves described in the article, you might see accelerating capex at precisely the time improvements are dropping off.
I don’t think frontier labs are going to be a trustworthy canary. If Anthropic says they’re reaching the limit and OpenAI holds the line that AGI is imminent, talent and funding will flee Anthropic for OpenAI. There’s a strong incentive to keep your mouth shut if things aren’t going well.
I think you nailed it. The capex is desperation in the hopes of maintaining the curve. I have heard actual AI researchers say progress is slowing, just not from the big companies directly.
> And then you had the AI frontier lab insiders who predicted an accelerating pace of progress for the last three years.
Progress has most definitely not been happening at an _accelerating_ pace.
There are a few other limitations, in particular how much energy, hardware and funding we (as a society) can afford to throw at the problem, as well as the societal impact.
AI development is currently given a free pass on these points, but it's very unclear how long that will last. Regardless of scientific and technological potential, I believe that we'll hit some form of limit soon.
There's a Mulla Nasrudin joke that's sort of relevant here:
Nasrudin is on a flight, when suddenly the pilot comes on the intercom, saying, "Passengers, we apologize, but we have experienced an engine burn-out. The plane can still fly on the remaining three engines, but we'll be delayed in our arrival by two hours."
Nasrudin speaks up "let's not worry, what's 2 hours really"
A few minutes later, the airplane shakes, and passengers see smoke coming out of another engine. Again, the intercom crackles to life.
"This is your captain speaking. Apologies, but due to a second engine burn-out, we'll be delayed by another two hours."
The passengers are agitated, but the Mulla once again tries to remain calm.
Suddenly, the third engine catches fire. Again, the pilot comes on the intercom and says, "I know you're all scared, but this is a very advanced aircraft, and it can safely fly on only a single engine. But we will be delayed by yet another two hours."
At this, Nasrudin shouts, "This is ridiculous! If one more engine goes, we'll be stuck up here all day"
Yes, the exponential is only an approximation of the first part of an S curve. And this author claims that he understands the exponential better than others…
The author is an Anthropic employee.
If the money dries up because investors lose faith in the exponential continuing, then his future looks much dimmer.
That is even true for Covid, for obvious reasons: it runs out of people it can infect at some point.
It's obvious, but the problem was that enough people would die in the process for people to be worried. Similarly, if the current AI will be able to replace 99% of devs in 5-10 years (or even worse, most white collar jobs) and flatten out there without becoming a godlike AGI, it will still have enormous implications for the economy.
Infectious diseases rarely see actual exponential growth for logistical reasons. It's a pretty unrealistic model that ignores that the disease actually needs to find additional hosts to spread, the local availability of which starts to go down from the first victim.
Yes, the model the S curve comes out of is extremely simplified. Looking at Covid curves we could well have said growth was parabolic, but that's much less worrisome.
If you assume the availability of hosts is local to the perimeter of the infected region, then the relative growth is limited to 2/R, where R is the distance from patient 0 in 2 dimensions. That's because the area of the circle determines how many hosts are already ill, but new infections can only happen on the perimeter of the circle.
The disease is obviously also limited by the total amount of hosts, but I assume there's also the "bottom" limit - i.e. the resource consumption of already-infected hosts.
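Spelled out under the same idealized geometry (with v the speed at which the infection front advances):

    I(t) \propto \pi R(t)^2, \qquad \frac{dI}{dt} \propto 2 \pi R(t)\, v
    \quad\Longrightarrow\quad \frac{1}{I} \frac{dI}{dt} \propto \frac{2v}{R(t)}

i.e. the relative growth rate decays like 1/R rather than staying constant, so case counts grow roughly quadratically in time rather than exponentially.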
It also depends on how panicked people are. Covid was never going to spread like ebola, for instance: it was worse. Bad enough to harm and kill people, but not bad enough to scare them into self-enforced isolation and voluntary compliance with public health measures.
Back on the subject of AI, I think the flat part of the curve has always been in sight. Transformers can achieve human performance in some, even many respects, but they're like children who have to spend a million years in grade school to learn their multiplication tables. We will have to figure out why that is the case and how to improve upon it drastically before this stuff really starts to pay off. I'm sure we will but we'll be on a completely different S-shaped curve at that point.
> a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.
Yes of course it’s not going to increase exponentially forever.
The point is, why predict that the growth rate is going to slow exactly now? What evidence are you going to look at?
It’s possible to make informed predictions (eg “Moore’s law can’t get you further than 1nm with silicon due to fundamental physical limits”). But most commenters aren’t basing their predictions in anything as rigorous as that.
And note, there are good reasons to predict a speedup, too; as models get more intelligent, they will be able to accelerate the R&D process. So quality per-researcher is now proportional to the exponential intelligence curve, AND quantity of researchers scales with number of GPUs (rather than population growth which is much slower).
NOTE IN ADVANCE: I'm generalizing, naturally, because talking about specifics would require an essay and I'm trying to write a comment.
Why predict that the growth rate is going to slow now? Simple. Because current models have already been trained on pretty much the entire meaningful part of the Internet. Where are they going to get more data?
The exponential growth part of the curve was largely based on being able to fit more and more training data into the models. Now that all the meaningful training data has been fed in, further growth will come from one of two things: generating training data from one LLM to feed into another one (dangerous, highly likely to lead to "down the rabbit hole forever" hallucinations, and weeding those out is a LOT of work and will therefore contribute to slower growth), or else finding better ways to tweak the models to make better use of the available training data (which will produce growth, but much slower than what "Hey, we can slurp up the entire Internet now!" was producing in terms of rate of growth).
And yes, there is more training data available because the Internet is not static: the Internet of 2025 has more meaningful, human-generated content than the Internet of 2024. But it also has a lot more AI-generated content, which will lead into the rabbit-hole problem where one AI's hallucinations get baked into the next one's training, so the extra data that can be harvested from the 2025 Internet is almost certainly going to produce slower growth in meaningful results (as opposed to hallucinated results).
Curiously, humans don't seem to require reading the entire internet in order to perform at human level on a wide variety of tasks... Nature suggests that there's a lot of headroom in algorithms for learning on existing sources. Indeed, we had models trained on the whole internet a couple years ago, now, yet model quality has continued to improve.
Meanwhile, on the hardware side, transistor counts in GPUs are in the tens of billions and still increasing steadily.
> Where are they going to get more data?
This is a great question, but note that folks were freaking out about this a year or so ago and we seem to be doing fine.
We seem to be making progress with some combination of synthetic training datasets on coding/math tasks, textbooks authored by paid experts, and new tokens (plus preference signals) generated by users of the LLM systems.
It wouldn’t surprise me if coding/math turned out to have a dense-enough loss-landscape to produce enough synthetic data to get to AGI - though I wouldn’t bet on this as a highly likely outcome.
I have been wanting to read/do some more rigorous analysis here though.
This sort of analysis would count as the kind of rigorous prediction that I’m asking for above.
E2A: initial exploration on this: https://chatgpt.com/share/68d96124-a6f4-8006-8a87-bfa7ee4ea3...
Gives some relevant papers such as
https://arxiv.org/html/2211.04325v2#:~:text=3.1%20AI
I am extremely confident that AGI, if it is achievable at all (which is a different argument and one I'm not getting into right now), requires a world model / fact model / whatever terminology you prefer, and is therefore not achievable by models that simply chain words together without having any kind of understanding baked into the model. In other words, LLMs cannot lead to AGI.
Agreed, it surely does require a world-model.
I disagree that generic LLMs plus CoT/reasoning/tool calling (ie the current stack) cannot in principle implement a world model.
I believe LLMs are doing some sort of world modeling and likely are mostly lacking a medium-/long-term memory system in which to store it.
(I wouldn’t be surprised if one or two more architectural overhauls end up occurring before AGI, I also wouldn’t be surprised if these occurred seamlessly with our current trajectory of progress)
Alternative argument: there is no need for more training data, just better algorithms. Throwing more tokens at the problem doesn't fix the fact that training LLMs with supervised learning is a poor way to integrate knowledge. We have, however, seen promising results coming out of reinforcement learning and self-play. Which means that Anthropic's and OpenAI's bet on scale is likely a dead end, but we may yet see capability improvements coming from other labs, without the need for greater data collection.
Better algorithms is one of the things I meant by "better ways to tweak the models to make better use of the available training data". But that produces slower growth than the jaw-droppingly rapid growth you can get by slurping pretty much the whole Internet. That produced the sharp part of the S curve, but that part is behind us now, which is why I assert we're approaching the slower-growth part at the top of the curve.
> The point is, why predict that the growth rate is going to slow exactly now? What evidence are you going to look at?
Why predict that the (absolute) growth rate is going to keep accelerating past exactly now?
Exponential growth always assumes a constant relative growth rate, which works in the fiction of economics, but is otherwise far from an inevitability. People like to point to Moore's law ad nauseam, but other things like "the human population" or "single-core performance" keep accelerating until they start cooling off.
> And note, there are good reasons to predict a speedup, too; as models get more intelligent, they will be able to accelerate the R&D process.
And if heaven forbid, R&D ever turns out to start taking more work for the same marginal returns on "ability to accelerate the process", then you no longer have an exponential curve. Or for that matter, even if some parts can be accelerated to an amazing extent, other parts may get strung up on Amdahl's law.
It's fine to predict continued growth, and it's even fine to predict that a true inflection point won't come any time soon, but exponential growth is something else entirely.
> Why predict that the (absolute) growth rate is going to keep accelerating past exactly now?
By following this logic you should have predicted Moore’s law would halt every year for the last five decades. I hope you see why this is a flawed argument. You prove too much.
But I will answer your “why”: plenty of exponential curves exist in reality, and empirically, they can last for a long time. This is just how technology works; some exponential process kicks off, then eventually is rate-limited, then if we are lucky another S-curve stacks on top of it, and the process repeats for a while.
Reality has inertia. My hunch is you should apply some heuristic like “the longer a curve has existed, the longer you should bet it will persist”. So I wouldn’t bet on exponential growth in AI capabilities for the next 10 years, but I would consider it very foolish to use pure induction to bet on growth stopping within 1 year.
And to be clear, I think these heuristics are weak and should be trumped by actual physical models of rate-limiters where available.
> By following this logic you should have predicted Moore’s law would halt every year for the last five decades. I hope you see why this is a flawed argument. You prove too much.
I do think it's continually amazing that Moore's law has continued in some capacity for decades. But before trumpeting the age of exponential growth, I'd love to see plenty of examples that aren't named "Moore's law": as it stands, one easy hypothesis is that "ability to cram transistors into mass-produced boards" lends itself particularly well to newly-discovered strategies.
> So I wouldn’t bet on exponential growth in AI capabilities for the next 10 years, but I would consider it very foolish to use pure induction to bet on growth stopping within 1 year.
Great, we both agree that it's foolish to bet on growth stopping within 1 year. What I'm saying is that "growth doesn't stop" ≠ "growth is exponential".
A theory of "inertia" could just as well support linear growth: it's only because we stare at relative growth rates that we treat exponential growth as a "constant" that will continue in the absence of explicit barriers.
Sorry, to be clear I was making the stronger claim:
I would consider it very foolish to use pure induction to bet on _exponential_ growth stopping within 1 year.
I think you can easily find plenty of other long-lasting exponential curves. A good starting point would be:
https://en.m.wikipedia.org/wiki/Progress_studies
With perhaps the optimistic case as
https://en.m.wikipedia.org/wiki/Accelerating_change
This is where I’d really like to be able to point to our respective Manifold predictions on the subject; we could circle back in a year’s time and review who was in fact correct. I wager internet points it will be me :)
Concretely, https://manifold.markets/JoshYou/best-ai-time-horizon-by-aug...
Solar panel cost per watt has been dropping exponentially for decades as well...
Partly these are matters of economies of scale - reduction in production costs at scale - and partly it's a matter of increasing human attention leading to steady improvements as the technology itself becomes more ubiquitous.
It has already been trained on all the data. The other obvious next step is to increase context window, but that's apparently very hard/costly.
Yeah exactly!
It’s likely that it will slow down at some point, but the highest likelihood scenario for the near future is that scaling will continue.
> why predict that the growth rate is going to slow exactly now?
why predict that it will continue? Nobody ever actually makes an argument that growth is likely to continue, they just extrapolate from existing trends and make a guess, with no consideration of the underlying mechanics.
Oh, go on then, I'll give a reason: this bubble is inflated primarily by venture capital, and is not profitable. The venture capital is starting to run out, and there is no convincing evidence that the businesses will become profitable.
Progress in information systems cannot be compared to progress in physical systems.
For starters, physical systems compete for limited resources and labor.
For another, progress in software vastly reduces the cost of improved designs, whereas progress in physical systems can enable improved designs while still increasing their cost.
Finally, the underlying substrate of software is digital hardware, which has been improving in both capabilities and economics exponentially for almost 100 years.
Looking at information systems as far back as the first coordination of differentiating cells to human civilization is one of exponential improvement. Very slow, slow, fast, very fast. (Can even take this further, to first metabolic cycles, cells, multi-purpose genes, modular development genes, etc. Life is the reproduction of physical systems via information systems.)
Same with human technological information systems, from cave painting, writing, printing, telegraph, phone, internet, etc.
It would be VERY surprising if AI somehow managed to fall off the exponential information system growth path. Not industry level surprising, but "everything we know about how useful information compounds" level surprising.
> Looking at information systems as far back as the first coordination of differentiating cells to human civilization is one of exponential improvement.
Under what metric? Most of the things you mention don't have numerical values to plot on a curve. It's a vibe exponential, at best.
Life and humans have become better and better at extracting available resources and energy, but there's a clear limit to that (100%) and the distribution of these things in the universe is a given, not something we control. You don't run information systems off empty space.
> It's a vibe exponential, at best.
I am a little stunned you think so.
Life has been on Earth about 3.5-3.8 billion years.
Break that into 0.5-0.8, 1 billion, 1 billion, 1 billion "quarters", and you will find exponential increases in evolution's rate of change and production of diversity across them, by many many objective measures.
Now break up the last 1 billion into 100 million year segments. Again exponential.
Then break up the last 100 million into segments. Again.
Then the last 10 million years into segments, and watch humans progress.
The last million, in 100k year segments, watch modern humans appear.
the last 10k years into segments, watch agriculture, civilizations, technology, writing ...
The last 1000 years, incredible aggregation of technology, math, and the appearance of formal science
last 100 years, gets crazy. Information systems appear in labs, then become ubiquitous.
last 10 years, major changes, AI starts having mainstream impact
last 1 year - even the basic improvements to AI models in the last 12 months are an unprecedented level of change, per time, looking back.
I am not sure how any of this could appear "vibe", given any historical and situational awareness.
This progression is universally recognized. Aside from creationists and similar contingents.
I am curious when you think we will run out of atoms to make information systems.
How many billions of years you think that might take.
Of all the things to be limited by, that doesn't seem like a near term issue. Just an asteroid or two alone will provide resources beyond our dreams. And space travel is improving at a very rapid rate.
In the meantime, in terms of efficiency of using Earth atoms for information processing, there is still a lot of space at the "bottom", as Feynman said. Our crude systems are limited today by their power waste. Small energy-efficient systems, and more efficient heat shedding, will enable full 3D chips ("cubes"?) and vastly higher density of packing those.
The known limit for information for physical systems per gram, is astronomical:
• Bremermann’s limit: 10^47 operations per second, per gram.
Other interesting limits:
• Margolus–Levitin bound - on quantum state evolution
• Landauer’s principle - Thermodynamic cost of erasing (overwriting) one bit.
• Bekenstein bound: Maximum storage by volume.
Life will go through many many singularities before we get anywhere near hard limits.
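For a rough sense of how far away that limit is, here's a back-of-the-envelope sketch (the accelerator figures below are ballpark assumptions for illustration, not exact specs):

    import math

    # Order-of-magnitude comparison of a current accelerator against
    # Bremermann's limit. The GPU numbers are rough assumptions, not specs.
    bremermann_ops_per_s_per_gram = 1e47

    gpu_ops_per_s = 1e15        # ~1 PFLOP/s-class accelerator (assumed)
    gpu_mass_grams = 3000.0     # a few kilograms with packaging (assumed)

    gpu_ops_per_s_per_gram = gpu_ops_per_s / gpu_mass_grams
    headroom = bremermann_ops_per_s_per_gram / gpu_ops_per_s_per_gram
    print(f"~10^{math.log10(headroom):.0f}x theoretical headroom per gram")

By this crude measure there are dozens of orders of magnitude between today's hardware and the physical ceiling, which is the point being made above.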
>[..] to first metabolic cycles, cells, multi-purpose genes, modular development genes, etc.
One example is when cells discovered energy production using mitochondria. Mitochondria add new capabilities to the cell, with (almost) no downside like: weight, temperature-sensitivity, pressure-sensitivity. It's almost 100% upside.
If someone had tried to predict the future number of mitochondria-enabled cells from the first one, they could have been off by a factor of 10^20.
I've been writing a story for the last 20 days with that exact plot; I have to get my act together and finish it.
> Progress in information systems cannot be compared to progress in physical systems.
> For starters, physical systems compete for limited resources and labor.
> Finally, the underlying substrate of software is digital hardware…
See how these are related?
By physical systems, I meant systems whose purpose is to do physical work. Mechanical things. Gears. Struts.
Computer hardware is an information system. You are correct that it has a physical component. But its power comes from its organization (information), not its mass, weight, etc.
Transistors get more powerful, not less, when made from less matter.
Information systems move from substrate to more efficient substrate. They are not their substrate.
They still depend on physical resources and labor. They’re made by people and machines. There’s never been more resources going into information systems than right now, and AI accelerated that greatly. Think of all the server farms being built next to power plants.
Yes. Of course.
All information has a substrate at any given time.
But the amount of resources needed per unit of computation keeps dropping, because computation is not tied to any particular unit of matter, nor any particular substrate.
It is not the same as a steam engine, which can only be made so efficient.
The amount of both matter and labor per quantity of computing power is dropping exponentially. Right?
See a sibling reply on the physical limits of computation. We are several singularities away from any hard limit.
Evidence: History of industrialization vs. history of computing. Fundamental physics.
That's fallacious reasoning, you are extrapolating from survivorship bias. A lot of technologies, genes, or species have failed along the way. You are also subjectively attributing progression as improvements, which is problematic as well, if you speak about general trends. Evolution selects for adaptation not innovation. We use the theory of evolution to explain the emergence of complexity, but that's not the sole direction and there are many examples where species evolved towards simplicity (again).
Resource expense alone could be the end of AI. You may look up historic island populations, where technological demands (e.g. timber) usually led to extinction by resource exhaustion and consequent ecosystem collapse (e.g. deforestation leading to soil erosion).
See replies to sibling comments.
Doesn't answer the core fallacy. Historical "technological progress" can't be used as an argument for any particular technology. Right now, if we are talking about AI, we're talking about specific technologies, which may just as well fail and remain inconsequential in the grand scheme of things, like most technologies, most things really, did in the past. Even more so since we don't understand much of anything in either human or artificial cognition. Again and again, we've been wrong about predicting the limits and challenges in computation.
You see, your argument is just bad. You are merely guessing like everyone else.
My arguments are very strong.
Information technology does not operate by the rules of any other technology. It is a technology of math and organization, not particular materials.
The unique value of information technology is that it compounds the value of other information and technology, including its own, and lowers the bar for its own further progress.
And we know with absolute certainty we have barely scratched the computing capacity of matter. Bremermann’s limit : 10^47 operations per second, per gram. See my other comment for other relevant limits.
Do you also expect a wall in mathematics?
And yes, an unbroken historical record of 4.5 billion years of information systems becoming more sophisticated, with an exponential speed increase over time, is in fact a very strong argument. Changes that took a billion years initially now happen in very short times in today's evolution, and essentially instantly in technological time. The path is long, with significant acceleration milestones at whatever scale of time you want to look at.
Your argument, on the other hand, is indistinguishable from cynical AI opinions going back decades. It could be made any time. Zero new insight. Zero predictive capacity.
Substantive negative arguments about AI progress have been made. See "Perceptrons" by Marvin Minsky and Seymour Papert, for an example of what a solid negative argument looks like. It delivered insights. It made some sense at the time.
> Your argument, on the other hand, is indistinguishable from cynical AI opinions going back decades. It could be made any time. Zero new insight. Zero predictive capacity.
Pointing out logical fallacies?
Lol.
> Historical "technological progress" can't be used as argument for any particular technology.
Historical for billions of years of natural information system evolution. Metabolic, RNA, DNA, protein networks, epigenetic, intracellular, intercellular, active membrane, nerve precursors, peptides, hormonal, neural, ganglion, nerve nets, brains.
Thousands of years of human information systems. Hundreds of years of technological information systems. Decades of digital information systems. Now, in just the last few years, progress year to year is unlike any seen before.
Significant innovations being reported virtually every day.
Yes, track records carry weight. Especially when there is no good reason to expect a break, and every tangible reason to believe nothing is slowing down, right up to today.
"Past is not a predictor of future behavior" is about asset gains relative to asset prices in markets where predictable gains have had their profitability removed by the predictive pricing of others. A highly specific feedback situation making predicting asset gains less predictable even when companies do maintain strong predictable trends in fundamentals.
It is a narrow specific second order effect.
It is the worst possible argument for anything outside of those special conditions.
Every single thing you have ever learned was predicated on the past having strong predictive qualities.
You should understand what an argument means, before throwing it into contexts where its preconditions don't exist.
> Right now, if we are talking about AI, we're talking about specific technologies, which may just as well fail and remain inconsequential in the grand scheme of things, like most technologies, most things really, did in the past. Even more so since we don't understand much of anything in either human or artificial cognition. Again and again, we've been wrong about predicting the limits and challenges in computation.
> Your argument [...] is indistinguishable from cynical AI opinions going back decades. It could be made any time. Zero new insight. Zero predictive capacity.
If I need to be clearer, nobody could know when you wrote that by reading it. It isn't an argument it's a free floating opinion. And you have not made it more relevant today, than it would have been all the decades up till now, through all the technological transitions up until now. Your opinion was equally "applicable", and no less wrong.
This is what "Zero new insight. Zero predictive capacity" refers to.
> Substantive negative arguments about AI progress have been made. See "Perceptrons" by Marvin Minsky and Seymour Papert, for an example of what a solid negative argument looks like. It delivered insights. It made some sense at the time.
Here you go:
https://en.wikipedia.org/wiki/Perceptrons_(book)
The cost of the next number in a GPT (3>4>5) seems to come down to 2 things:
1) $$$
2) data
The second (data) also isn't cheap, as it seems we've already gotten through all the 'cheap' data out there. So much so that synthetic data (fart huffing) is a big thing now. People say it's real and useful and passes the glenn-horf theore... blah blah blah.
So it really comes down to just:
1) $$$^2 (but really pick any exponent)
In that, I'm not sure this thing is a true sigmoid curve (see: biology all the time). I think it's more a logarithmic cost here. In that, it never really goes away, but it gets really expensive to carry out for large N.
[To be clear, lots of great shit happens out there in large N. An AI god still may lurk in the long slow slope of $N, the cure for boredom too, or knowing why we yawn, etc.]
I am getting the sense that the 2nd derivative of the curve is already hitting negative territory. Models get updated, and I don't feel I'm getting better answers from the LLMs.
On the application front though, it feels that the advancements from a couple of years ago are just beginning to trickle down to product space. I used to do some video editing as a hobby. Recently I picked it up again, and was blown away by how much AI has chipped away the repetitive stuff, and even made attempts at the more creative aspects of production, with mixed but promising results.
What are some examples of tasks you no longer have to do?
one example is auto generating subtitles -- elements of this tasks, e.g. speech to text with time coding, have been around for a while (openai whisper and others), but they have only recently been integrated into video editors and become easy to use for non-coders. other examples: depth map (estimating object distance from the camera; this is useful when you want to blur the background), auto-generating masks with object tracking.
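To give a sense of how little code the speech-to-text piece needs these days, here is a minimal sketch using the open-source openai-whisper package (the file name, model size, and output path are placeholders, and a real editor integration does much more than this):

    import whisper

    # Transcribe a clip and dump the segments as an .srt subtitle file.
    model = whisper.load_model("base")        # placeholder model size
    result = model.transcribe("clip.mp4")     # placeholder input file

    def srt_time(seconds: float) -> str:
        ms = int(round(seconds * 1000))
        h, ms = divmod(ms, 3_600_000)
        m, ms = divmod(ms, 60_000)
        s, ms = divmod(ms, 1000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"

    with open("clip.srt", "w", encoding="utf-8") as f:
        for i, seg in enumerate(result["segments"], start=1):
            f.write(f"{i}\n{srt_time(seg['start'])} --> {srt_time(seg['end'])}\n")
            f.write(seg["text"].strip() + "\n\n")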
Yes. It's true that we don't know, with any certainty, (1) whether we are hitting limits to growth intrinsic to current hardware and software, (2) whether we will need new hardware or software breakthroughs to continue improving models, and (3) what the timing of any necessary breakthroughs would be, because innovation doesn't happen on a predictable schedule. There are unknown unknowns.[a]
However, there's no doubt that at a global scale, we're sure trying to maintain current rates of improvement in AI. I mean, the scale and breadth of global investment dedicated to improving AI, presently, is truly unprecedented. Whether all this investment is driven by FOMO or by foresight, is irrelevant. The underlying assumption in all cases is the same: We will figure out, somehow, how to overcome all known and unknown challenges along the way. I have no idea what the odds of success may be, but they're not zero. We sure live in interesting times!
---
[a] https://en.wikipedia.org/wiki/There_are_unknown_unknowns
I hope the crash won't be unprecedented as well...
I hope so too. Capital spending on AI appears to be holding up the entire economy:
https://am.jpmorgan.com/us/en/asset-management/adv/insights/...
Each specific technology can be S-shaped, but advancements in achieving goals can still maintain an exponential curve. e.g. Moore's law is dead with the end of Dennard scaling, but computation improvements still happen with parallelism.
Meta's Behemoth shows that scaling the number of parameters has diminishing returns, but we still have many different ways to continue advancements. Those who point at one thing and say "see" aren't really seeing. Of course there are limits, like energy, but with nuclear energy or photon-based computing we're nowhere near the limits.
Agreed!
And, maybe I'm missing something, but to me it seems obvious that the flat top part of the S curve is going to be somewhere below human ability... because, as you say, of the training data. How on earth could we train an LLM to be smarter than us, when 100% of the material we use to teach it how to think is human-style thinking?
Maybe if we do a good job, only a little bit below human ability -- and what an accomplishment that would still be!
But still -- that's a far cry from the ideas espoused in articles like this, where AI is just one or two years away from overtaking us.
Author here.
The standard way to do this is Reinforcement Learning: we do not teach the model how to do the task, we let it discover the _how_ for itself and only grade it based on how well it did, then reinforce the attempts where it did well. This way the model can learn wildly superhuman performance, e.g. it's what we used to train AlphaGo and AlphaZero.
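To make "grade the outcome and reinforce what worked" concrete, here is a toy REINFORCE-style sketch, with a 10-way guessing game standing in for a real task; this only illustrates the principle, not any actual training setup:

    import numpy as np

    # The "policy" is a softmax over 10 possible answers. We never show it the
    # right answer; we only score each attempt and reinforce the good ones.
    rng = np.random.default_rng(0)
    logits = np.zeros(10)   # policy parameters
    target = 7              # hidden from the policy, visible only through reward
    lr = 0.5

    def softmax(x):
        z = np.exp(x - x.max())
        return z / z.sum()

    for step in range(300):
        probs = softmax(logits)
        action = rng.choice(10, p=probs)             # the model makes an attempt
        reward = 1.0 if action == target else 0.0    # grade the outcome only
        grad = -probs                                # REINFORCE: gradient of the
        grad[action] += 1.0                          # log-prob of the chosen action
        logits += lr * reward * grad                 # reinforce good attempts

    print(softmax(logits).round(2))  # probability mass piles onto the target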
It never ceases to amaze me how people consistently mistake the initial phase of a sigmoid curve for an exponential function.
> I'm sure people were saying that about commercial airline speeds in the 1970's too.
Or CPU frequencies in the 1990's. Also we spent quite a few decades at the end of the 19th century thinking that physics was finished.
I'm not sure that explaining it as an "S curve" is really the right metaphor either, though.
You get the "exponential" growth effect when there's a specific technology invented that "just needs to be applied", and the application tricks tend to fall out quickly. For sure generative AI is on that curve right now, with everyone big enough to afford a datacenter training models like there's no tomorrow and feeding a community of a million startups trying to deploy those models.
But nothing about this is modeled correctly as an "exponential", except in the somewhat trivial sense of "the community of innovators grows like a disease as everyone hops on board". Sure, the petri dish ends up saturated pretty quickly and growth levels off, but that's not really saying much about the problem.
> I'm sure people were saying that about commercial airline speeds in the 1970's too.
They were also saying that about CPU clock speeds.
> I'm sure people were saying that about commercial airline speeds in the 1970's too.
They'd be wrong, of course - for not realizing demand is a limiting factor here. Airline speeds plateaued not because we couldn't make planes go faster anymore, but because no one wanted them to go faster.
This is partly an economic and partly a social factor - transit times are bucketed by what they enable people to do. It makes little difference if going from London to New York takes 8 hours instead of 12 - it's still in the "multi-day business trip" bucket (even 6 hours goes into that bucket, once you add airport overhead). Now, if you could drop that to 3 hours, like Concorde did[0], that finally moves it into the "hop over for a meeting, fly back the same day" bucket, and then business customers start paying attention[1].
For various technical, legal and social reasons, we didn't manage to cross that chasm before money for R&D dried out. Still, the trend continued anyway - in military aviation and, later, in supersonic missiles.
With AI, the demand is extreme and only growing, and it shows no sign of being structured into classes with large thresholds between them - in fact, models are improving faster than we're able to put them to any use; even if we suddenly hit a limit now and couldn't train even better models anymore, we have decades of improvements to extract just from learning how to properly apply the models we have. But there's no sign we're about to hit a wall with training any time soon.
Airline speeds are inherently a bad example for the argument you're making, but in general, I don't think pointing out S-curves is all that useful. As you correctly observe:
> But a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.
But, what happens when one technology - or rather, one metric of that technology - stops improving? Something else starts - another metric of that technology, or something built on top of it, or something that was enabled by it. The exponent is S-curves on top of S-curves, all the way down, but how long that exponent is depends on what you consider in scope. So, a matter of accounting. So yeah, AI progress can flatten tomorrow or continue exponentially for the next couple years - depending on how narrowly you define "AI progress".
Ergo, not all that useful.
--
[0] - https://simpleflying.com/concorde-fastest-transatlantic-cros...
[1] - This is why Elon Musk wasn't immediately laughed out of the room after proposing using Starship for moving people and cargo across the Earth, back in 2017. Hopping between cities on an ICBM sounds borderline absurd for many reasons, but it also promised cutting flight time to less than one hour between any two points on Earth, which put it in a completely new bucket, even more interesting for businesses.
> But a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.
S curves are exponential before they start tapering off though. It's hard to predict how long that could continue, so there's an argument to be made that we should remain optimistic and milk that while we can, lest pessimism cut off investment too early.
"I'm sure people were saying that about commercial airline speeds in the 1970's too."
But there are others that keep going also. Moore's law is still going (mostly, slowing), and made it past a few pinch points where people thought it was the end.
The point is that over five or six decades, many people said Moore's law was at an end, and then it wasn't; some breakthrough kept it going. Maybe a new one will happen.
The thing with AI is, maybe the S curve flattens out after all the jobs are gone.
Everyone is hoping the S curve flattens out somewhere just below human level, but what if it flattens out just beyond human level? We're still screwed.
There’s a key way to think about a process that looks exponential and might or might not flatten out into an S curve: reasoning about fundamental limits. For COVID it would obviously flatten out because there are finite humans, and it did when the disease had in fact infected most humans on the planet. For commercial airlines you could reason about the speed of sound or escape velocity and see there is again a natural upper limit- although which of those two would dominate would have very different real world implications.
For computational intelligence, we have one clear example of an upper limit in a biological human brain. It only consumes about 25W and has much more intelligence than today’s LLMs in important ways. Maybe that’s the wrong limit? But Moore’s law has been holding for a very long time. And smart physicists like Feynman in his seminal lecture predicting nanotechnology in 1959 called “there’s plenty of room at the bottom” have been arguing that we are extremely far from running into any fundamental physical limits on the complexity of manufactured objects. The ability to manufacture them we presume is limited by ingenuity, which jokes aside shows no signs of running out.
Training data is a fine argument to consider. Especially since they are training on “the whole internet”, sorta. The key breakthrough of transformers wasn’t in fact autoregressive token processing or attention or anything like that. It was that they can learn from (memorize / interpolate between / generalize) arbitrary quantities of training data. Before that, every kind of ML model hit scaling limits pretty fast. Resnets got CNNs to millions of parameters but they still became quite difficult to train. Transformers train reliably on every size data set we have ever tried, with no end in sight. The attention mechanism shortens the gradient path for extremely large numbers of parameters, completely changing the rules of what’s possible with large networks. But what about the data to feed them?
There are two possible counter arguments there. One is that humans don’t need exabytes of examples to learn the world. You might reasonably conclude from this that NNs have some fundamental difference vs people and that some hard barrier of ML science innovation lies in the way. Smart scientists like Yann LeCun would agree with you there. I can see the other side of that argument too - that once a system is capable of reasoning and learning it doesn’t need exhaustive examples to learn to generalize. I would argue that RL reasoning systems like GRPO or GSPO do exactly this - they let the system try lots of ways to approach a difficult problem until they figure out something that works. And then they cleverly find a gradient towards whatever technique had relative advantage. They don’t need infinite examples of the right answer. They just need a well chosen curriculum of difficult problems to think about for a long time. (Sounds a lot like school.) Sometimes it takes a very long time. But if you can set it up correctly it’s fairly automatic and isn’t limited by training data.
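To illustrate the "relative advantage" idea in isolation, here is a toy sketch of the group-normalization step used in GRPO-style methods (not the full algorithm, and the reward numbers are made up):

    import numpy as np

    # The model samples several attempts at the same hard problem; a grader
    # scores each attempt. No reference solution or per-step supervision needed.
    rewards = np.array([0.0, 0.0, 1.0, 0.0, 0.2, 1.0])  # hypothetical scores

    # Each attempt is compared against its own group: above-average attempts get
    # a positive advantage, below-average ones a negative advantage.
    advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-6)
    print(advantages.round(2))

    # During training, the gradient of each attempt's token log-probabilities is
    # weighted by that attempt's advantage, nudging the policy toward whatever
    # approach happened to work better than its siblings.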
The other argument is what the Silicon Valley types call “self play” - the goal of having an LLM learn from itself or its peers through repeated games or thought experiments. This is how Alpha Go was trained, and big tech has been aggressively pursuing analogs for LLMs. This has not been a runaway success yet. But in the area of coding agents, arguably where AI is having the biggest economic impact right now, self play techniques are an important part of building both the training and evaluation sets. Important public benchmarks here start from human curated examples and algorithmically enhance them to much larger sizes and levels of complexity. I think I might have read about similar tricks in math problems but I’m not sure. Regardless it seems very likely that this has a way to overcome any fundamental limit on availability of training data as well, based on human ingenuity instead.
Also, if the top of the S curve is high enough, it doesn’t matter that it’s not truly exponential. The interesting stuff will happen before it flattens out. E.g. COVID. Consider the y axis “human jobs replaced by AI” instead of “smartness” and yes it’s obviously an S curve.
> For computational intelligence, we have one clear example of an upper limit in a biological human brain. It only consumes about 25W and has much more intelligence than today’s LLMs in important ways. Maybe that’s the wrong limit?
It's a good reference point, but I see no reason for it to be an upper limit - by the very nature of how biological evolution works, human brains are close to the worst possible brains advanced enough to start a technological revolution. We're the first brain on Earth that crossed that threshold, and in evolutionary timescales, all that followed - all human history - happened in an instant. Evolution didn't have time yet to iterate on our brain design.
Just because something exhibits an exponential growth at one point in time, that doesn’t mean that a particular subject is capable of sustaining exponential growth.
Their Covid example is a great counter argument to their point in that covid isn’t still growing exponentially.
Where the AI skeptics (or even just pragmatists, like myself) chime in is saying “yeah AI will improve. But LLMs are a limited technology that cannot fully bridge the gap between what they’re producing now, and what the “hypists” claim they’ll be able to do in the future.”
People like Sam Altman know ChatGPT is a million miles away from AGI. But their primary goal is to make money. So they have to convince VCs that their technology has a longer period of exponential growth than what it actually will have.
Author here.
The argument is not that it will keep growing exponentially forever (obviously that is physically impossible), rather that:
- given a sustained history of growth along a very predictable trajectory, the highest likelihood short term scenario is continued growth along the same trajectory. Sample a random point on an s-curve and look slightly to the right, what’s the most common direction the curve continues?
- exponential progress is very hard to visualize and see, it may appear to hardly make any progress while far away from human capabilities, then move from just below to far above human very quickly
My point is that the limits of LLMs will be hit long before they start to take on human capabilities.
The problem isn’t that exponential growth is hard to visualise. The problem is that LLMs, as advanced and useful a technique as it is, isn’t suited for AGI and thus will never get us even remotely to the stage of AGI.
The human-like capabilities are really just smoke and mirrors.
It's like when people anthropomorphise their car: "she's being temperamental today". Except we know the car is not intelligent and it's just a mechanical problem. Whereas it's in the AI tech firms' best interest to upsell the human-like characteristics of LLMs, because that's how they get VC money. And as we know, building and running models isn't cheap.
There is no particular reason why AI has to stick to language models though. Indeed if you want human like thinking you pretty much have to go beyond language as we do other stuff too if you see what I mean. A recent example: "Google DeepMind unveils its first “thinking” robotics AI" https://arstechnica.com/google/2025/09/google-deepmind-unvei...
> There is no particular reason why AI has to stick to language models though.
There’s no reason at all. But that’s not the technology that’s in the consumer space, growing exponentially, gaining all the current hype.
So at this point in time, it’s just a theoretical future that will happen inevitably but we don’t know when. It could be next year. It could be 10 years. It could be 100 years or more.
My prediction is that current AI tech plateaus long before any AGI-capable technology emerges.
That feels like you're moving the goal posts a bit.
Exponential growth over the short term is very uninteresting. Exponential growth is exciting when it can compound.
E.g. if i offered you an investing opportunity 500% / per year compounded daily - that's amazing. If the fine print is that that rate will only last for the very near term (say a week), then it would be worse than a savings account.
Well, growth has been on this exponential already for 5+ years (for the METR eval), and we are at the point where models are very close to matching human expert capabilities in many domains - only one or two more years of growth would put us well beyond that point.
Personally I think we'll see way more growth than that, but to see profound impacts on our economy you only need to believe the much more conservative assumption of a little extra growth along the same trend.
> we are at the point where models are very close to matching human expert capabilities in many domains
That's a bold claim. I don't think it matches most people's experiences.
If that was really true people wouldn't be talking about exponential growth. You don't need exponential growth if you are already almost at your destination.
Which domains?
What I’ve seen is that LLMs are very good at simulating an extremely well read junior.
Models know all the tricks but not when to use them.
And because of that, you continually have to hand-hold them.
Working with an LLM is really closer to pair programming than it is handing a piece of work to an expert.
The stuff I’ve seen in computer vision is far more impressive in terms of putting people out of a job. But even there, it’s still highly specific models left to churn away at tasks that are ostensibly just long and laborious tasks. Which so much of VFX is.
> we are at the point where models are very close to matching human expert capabilities in many domains
This is not true because experts in these domains don't make the same routine errors LLMs do. You may point to broad benchmarks to prove your point, but actual experts in the benchmarked fields can point to numerous examples of purportedly "expert" LLMs making things up in a way no expert would ever.
Expertise is supposed to mean something -- it's supposed to describe both a level of competency and trustworthiness. Until they can be trusted, calling LLMs experts in anything degrades the meaning of expertise.
The most common part of the S-curve by far is the flat bit before and the flat bit after. We just don't graph it because it's boring. Besides which there is no reason at all to assume that this process will follow that shape. Seems like guesswork backed up by hand waving.
> Just because something exhibits an exponential growth at one point in time, that doesn’t mean that a particular subject is capable of sustaining exponential growth.
Which is pretty ironic given the title of the post
I am constantly astonished that articles like this even pass the smell test. It is not rational to predict exponential growth just because you've seen exponential growth before! Incidentally, that is not what people did during COVID; they predicted exponential growth for reasons. Specific, articulable reasons, that consisted of more than just "look, line go up. line go up more?".
Incidentally, the benchmarks quoted are extremely dubious. They do not even really make sense. "The length of tasks AI can do is doubling every 7 months". Seriously, what does that mean? If the AI suddenly took double the time to answer the same question, that would not be progress. Indeed, that isn't what they did, they just... picked some times at random? You might counter that these are actually human completion times, but then why are we comparing such distinct and unrelated tasks as "count words in a passage" (trivial, any child can do) and "train adversarially robust image model" (expert-level task, could take anywhere between an hour and never-complete).
Honestly, the most hilarious line in the article is probably this one:
> You might object that this plot looks like it might be levelling off, but this is probably mostly an artefact of GPT-5 being very consumer-focused.
This is a plot with three points in it! You might as well be looking at tea leaves!
> but then why are we comparing such distinct and unrelated tasks as ...
Because a few years ago the LLMs could only do trivial tasks that a child could do, and now they're able to do complex research and software development tasks.
If you just have the trivial tasks, the benchmark is saturated within a year. If you just have the very complex tasks, the benchmark has no sensitivity at all for years (just everything scoring a 0) and then abruptly becomes useful for a brief moment.
This seems pretty obvious, and I can't figure out what your actual concern is. You're just implying it is a flawed design without pointing out anything concrete.
The key word is "unrelated"! Being able to count the number of words in a paragraph and being able to train an image classifier are so different as to be unrelated for all practical purposes. The assumption underlying this kind of a "benchmark" is that all tasks have a certain attribute called complexity which is a numerical value we can use to discriminate tasks, presumably so that if you can complete tasks up to a certain "complexity" then you can complete all other tasks of lower complexity. No such attribute exists! I am sure there are "4 hour" tasks an LLM can do and "5 second" tasks that no LLM can do.
The underlying frustration here is that there is so much latitude possible in choosing which tasks to test, which ones to present, and how to quantify "success" that the metrics given are completely meaningless, and do not help anyone to make a prediction. I would bet my entire life savings that by the time the hype bubble bursts, we will still have 10 brainless articles per day coming out saying AGI is round the corner.
Well put, the metric is cherry picked to further the narrative.
"It’s Difficult to Make Predictions, Especially About the Future" - Yogi Berra. It's funny because it's true.
So if you want to try to do this difficult task, because say there's billions of dollars and millions of people's livelihoods on the line, how do you do it? Gather a bunch of data, and see if there's some trend? Then maybe it makes sense to extrapolate. Seems pretty reasonable to me. Definitely passes the sniff test. Not sure why you think "line go up more" is such a stupid concept.
It's a stupid concept because it's behind every ponzi scheme.
In Ponzi schemes the numbers are generally faked.
As they say, every exponential is a sigmoid in disguise. I think the exponential phase of growth for LLM architectures is drawing to a close, and fundamentally new architectures will be necessary for meaningful advances.
I'm also not convinced by the graphs in this article. OpenAI is notoriously deceptive with their graphs, and as Gary Marcus has already noted, that METR study comes with a lot of caveats: [https://garymarcus.substack.com/p/the-latest-ai-scaling-grap...]
Yes that's logistic growth basically
>People notice that while AI can now write programs, design websites, etc, it still often makes mistakes or goes in a wrong direction, and then they somehow jump to the conclusion that AI will never be able to do these tasks at human levels, or will only have a minor impact. When just a few years ago, having AI do these things was complete science fiction!
Both things can be true, since they're orthogonal.
Having AI do these things was complete fiction 10 years ago. And after 5 years of LLM AI, people do start to see serious limits and stunted growth with the current LLM approaches, while also seeing that nobody has proposed another serious contender to that approach.
Similarly, going to the moon was science fiction 100 years ago. And yet, we're now not only not on Mars, but 50+ years without a new manned moon landing. Same for airplanes. Science fiction in 1900. Mostly stale innovation-wise for the last 30 years.
A lot of curves can fit an exponential line plot, without the progress going forward being exponential.
We would have 1-trillion-transistor CPUs by now if we had kept following Moore's "exponential curve".
I agree with all your points, just wanted to say that transistor count is probably a counterexample. We have been keeping up with Moore's Law more or less[1], and the M3 Max, a 2023 consumer-grade CPU, has ~100B transistors, "just" one order of magnitude away from your 1T. I think that shows we haven't stagnated much in transistor density and the progress is just staggering!
[1] https://en.m.wikipedia.org/wiki/Transistor_count
That one order of magnitude is about 7 years behind Moore's law. We're still progressing, but it's slower, more expensive, and we hit way more walls than before.
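Quick sanity check on the "about 7 years" figure, assuming the classic doubling every two years:

    import math

    # One order of magnitude at one doubling every ~2 years:
    doublings = math.log2(10)    # ~3.32 doublings per 10x
    years = 2 * doublings        # ~6.6 years, i.e. roughly "7 years behind"
    print(round(years, 1))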
Except it’s not been five years, it’s been at most three, since approximately no one was using LLMs prior to ChatGPT’s release, which was just under three years ago. We did have Copilot a year before that, but it was quite rudimentary.
And really, we’ve had even less than that. The first large scale reasoning model was o1, which was released 12 months ago. More useful coding agents are even newer than that. This narrative that we’ve been using these tools for many years and are now hitting a wall doesn’t match my experience at all. AI-assisted coding is way better than it was a year ago, let alone five.
>Except it’s not been five years, it’s been at most three,
Why would it be "at most" 3? We had Chat GPT commercially available as a private beta API in 2020. It's only the mass public that got 3.5 three years ago.
But those who'd do the noticing, as per my argument, are not just Joe Public (who could be oblivious), but people who already started in 2020, including people working in the space, who worked with LLM and LLM-like architectures 2-3 years before 2020.
No, we didn’t. We had the GPT-3 API available in 2020, and approximately no one was using it.
It should be noted that the article author is an AI researcher at Anthropic and therefore benefits financially from the bubble: https://www.julian.ac/about/
> The current discourse around AI progress and a supposed “bubble” reminds me a lot of the early weeks of the Covid-19 pandemic. Long after the timing and scale of the coming global pandemic was obvious from extrapolating the exponential trends, politicians, journalists and most public commentators kept treating it as a remote possibility or a localized phenomenon.
That's not what I remember. On the contrary, I remember widespread panic. (For some reason, people thought the world was going to run out of toilet paper, which became a self-fulfilling prophesy.) Of course some people were in denial, especially some politicians, though that had everything to do with politics and nothing to do with math and science.
In any case, the public spread of infectious diseases is a relatively well understood phenomenon. I don't see the analogy with some new tech, although the public spread of hype is also a relatively well understood phenomenon.
Exponential curves don't last for long fortunately, or the universe would have turned into a quark soup. The example of COVID is especially ironic, considering it stopped being a real concern within 3 years of its advent despite the exponential growth in the early years.
Those who understand exponentials should also try to understand stock and flow.
Reminds me a bit of the "ultraviolet catastrophe".
> The ultraviolet catastrophe, also called the Rayleigh–Jeans catastrophe, was the prediction of late 19th century and early 20th century classical physics that an ideal black body at thermal equilibrium would emit an unbounded quantity of energy as wavelength decreased into the ultraviolet range.
[...]
> The phrase refers to the fact that the empirically derived Rayleigh–Jeans law, which accurately predicted experimental results at large wavelengths, failed to do so for short wavelengths.
https://en.wikipedia.org/wiki/Ultraviolet_catastrophe
Right. Nobody believed that the intensity would go to infinity. What they believed was that the theory was incomplete, but they didn't know how or why. And the solution required inventing a completely new theory.
Exponentials exist in their environment. Didn't Covid stop because we ran out of people to infect? Of course it can't keep going exponentially, because there isn't an exponential supply of people to infect.
What is this limit on AI? It is technology, energy, something. All these things can be over-come, to keep the exponential going.
And of course, systems also break under exponentials. Maybe AI is stopped by the world economy collapsing. AI advancement would be stopped, but that is cold comfort to the humans.
> What is this limit on AI?
Gulf money, for one. DoD budget would be another.
Booms are economic phenomena, not technological phenomena. When looking for a limiting factor of a boom, think about the money taps.
>What is this limit on AI?
Data. Think of our LLMs like bacteria in a Petri dish. When first introduced, they achieve exponential growth by rapidly consuming the dish's growth medium. Once the medium is consumed, growth slows and then stops.
The corpus of information on the Internet, produced over several decades, is the LLM's growth medium. And we're not producing new growth medium at an exponential rate.
> What is this limit on AI? It is technology, energy, something. All these things can be over-come, to keep the exponential going.
That's kind of begging the question. Obviously if all the limitations on AI can be overcome growth would be exponential. Even the biggest ai skeptic would agree. The question is, will it?
It's possible to understand both exponential and limiting behavior at the same time. I work in an office full of scientists. Our team scrammed the workplace on March 10, 2020.
To the scientists, it was intuitively obvious that the curve could not surpass 100% of the population. An exponential curve with no turning point is almost always seen as a sure sign that something is wrong with your model. But we didn't have a clue as to the actual limit, and any putative limit below 100% would need a justification, which we didn't have, or some dramatic change to the fundamental conditions, which we couldn't guess.
The typical practice is to watch the curve for any sign of a departure from exponential behavior, and then say: "I told you so." ;-)
The first change may have been social isolation. In fact that was pretty much the only arrow in our quivers. The second change was the vaccine, which changed both the infection rate and the mortality rate, dramatically.
I'm curious as to whether the consensus is that the observed behaviour of COVID waves was ever fully and satisfactorily explained - they tended to grow exponentially but then seemingly saturated at a much lower point than a naïve look at the curve would suggest.
It would probably be hard to do. The really huge factor - vaccination - may be easier to study, since we know where and when every vaccine dose was administered. The behavioral factors are likely to be harder to measure, and would have been masked by the larger effect of vaccination. We don't really know the extent of social isolation over geography, demographics, time, etc.
There's human behavioural factors yes, but I was kinda wondering about the virus itself, the R number seemed to fluctuate quite a bit, with waves peaking fast and early and then receding equally quickly.. I know there were some ideas around asymptomatic spread and superspreaders (both people with highly connected social graphs, and people shedding far more active virus than the median), I just wondered whether anyone had built a model that was considered to have accurately reproduced the observed behaviour of number of positive tests and symptomatic cases, and the way waves would seemingly saturate after infecting a few % of the population.
Long COVID is still a thing, the nAbs immunity is pretty paltry because the virus keeps changing its immunity profile so much. T-cells help but also damage the host because of how COVID overstimulates them. A big reason people aren't dying like they used to is because of the government's strategy of constant infection which boosts immunity regularly* while damaging people each time, that plus how Omicron changed SARS-CoV-2's cell entry mechanism to avoid cell-cell fusion (syncytia) that caused huge over-reaction in lung tissue.
If you think COVID isn't still around: https://www.cdc.gov/nwss/rv/COVID19-national-data.html
* one might call this strategy forced vaccination with a known dangerous live vaccine strain lol
> By the end of 2027, models will frequently outperform experts on many tasks.
In passing the quizzes.
> Models will be able to autonomously work for full days (8 working hours) by mid-2026.
Who will carry responsibility for the consequences of these models' errors? What tools will be available to that responsible _person_?
--
Techno-optimists will be optimistic. Techno-pessimists will be pessimistic.
The processes we're discussing have their own limiting factors, which no one mentions. Why mention what exactly makes the graph go up, or what holds it back from going exponential? Why mention or discuss the inherent limitations of the LLM architecture? Or the legal perspective on AI agency?
Thus we're discussing the results of AI models passing tests, and people's perceptions of other people's opinions.
You don't actually need to have a "responsible person"; you can just have an AI do stuff. It might make a mistake; the only difference between that and an employee is that you can't punish an AI. If you're any good at management and not a psychopath, the ability to have someone to punish for mistakes isn't actually important
The importance of having a human be responsible is about alignment. We have a fundamental belief that human beings are comprehensible and have goals that are not completely opaque. That is not true of any piece of software. In the case of deterministic software, you can’t argue with a bug. It doesn’t matter how many times you tell it that no, that’s not what either the company or the user intended, the result will be the same.
With an AI, the problem is more subtle. The AI may absolutely be able to understand what you’re saying, and may not care at all, because its goals are not your goals, and you can’t tell what its goals are. Having a human be responsible bypasses that. The point is not to punish the AI, the point is to have a hope to stop it from doing things that are harmful.
I will worry when I see startups competing on products with companies 10x, 100x, or 1000x their size. Like a small team producing a Photoshop replacement. So far I haven't seen anything like that. Big companies don't seem to be launching new products faster either, or fixing some of their products that have been broken for a long time (MS Teams...)
AI obviously makes some easy things much faster, maybe helps with boilerplate, we still have to see this translate into real productivity.
I think the real turning point is when there isn’t the need for something like photoshop. Creatives that I speak to yearn for the day when they can stop paying the adobe tax.
If they don’t like it, they can stop now. It may have consequences, however.
It's interesting that he brings up the example of "exponential" growth in the case of COVID infections even though it was actually logistic growth[1] that saturates once resources get exhausted. What makes AI different?
[1] https://en.wikipedia.org/wiki/Logistic_function#Modeling_ear...
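Part of the trouble is that the two curves are nearly indistinguishable early on. A small illustration with arbitrary parameters:

    import numpy as np

    # Logistic ("S-curve") growth with rate r and carrying capacity K tracks a
    # pure exponential with the same r almost exactly until it nears K.
    t = np.arange(0, 13)
    r, K, x0 = 1.0, 1000.0, 1.0

    exponential = x0 * np.exp(r * t)
    logistic = K / (1 + (K / x0 - 1) * np.exp(-r * t))

    for ti, e, l in zip(t, exponential, logistic):
        print(f"t={ti:2d}  exp={e:12.1f}  logistic={l:7.1f}")

Looking only at the early points, you cannot tell which curve you are on; the question is always where the saturation term starts to bite.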
You'd think that boosters for a technology whose very foundations rely on the sigmoid and tanh functions used as neuron activation functions would intuitively get this...
It's all relu these days
When people want a smooth function so they can do calculus they often use something like gelu or the swish function rather than relu. And the swish function involves a sigmoid. https://en.wikipedia.org/wiki/Swish_function
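For reference, the functions in question, with their standard definitions (swish shown with the usual beta = 1, i.e. SiLU, and GELU via its common tanh approximation):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def relu(x):
        return np.maximum(0.0, x)

    def swish(x, beta=1.0):
        # a.k.a. SiLU when beta == 1; note the sigmoid inside
        return x * sigmoid(beta * x)

    def gelu(x):
        # common tanh approximation of GELU
        return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

    xs = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    print(relu(xs), swish(xs).round(3), gelu(xs).round(3))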
> Again we can observe a similar trend, with the latest GPT-5 already astonishingly close to human performance:
I have issues with "human performance" as a single data point, in times when education keeps improving in some countries and degrading in others.
How far away are we from saying, better than "X percent of humans" ?
This reminds me -- very tenuously -- of how the shorthand for very good performance in the Python community is "like C". In the C community, we know that programs have different performance depending on algorithms chosen..
> In the C community, we know that programs have different performance depending on algorithms chosen..
Yes. Only the C community knows this. What a silly remark.
Regarding the "Python community" remark, benchmarks against C and Fortran go back decades now. It's not just a Python thing. C people push it a lot, too.
Nah, that part is ok. Wherever you set the human bar, human competence takes decades to really change, and these things have visible changes every year or so.
The problem with all of the article's metrics is that they are absolute bullshit. It just throws in claims like "AI can write full programs by itself 50% of the time" and moves on, as if that had any resemblance to what happens in the real world.
"Models will be able to autonomously work for full days (8 working hours)" does not make them equivalent to a human employee. My employees go home and come back retaining context from the previous day; they get smarter every month. With Claude Code I have to reset the context between bite-sized tasks.
To replace humans in my workplace, LLMs need some equivalent of neuroplasticity. Maybe it's possible, but it would require some sort of shift in the approach that may or may not be coming.
Maybe when we get updating models. Right now, they are trained, and released, and we are using that static model with a context window. At some point when we have enough processing to have models that are always updating, then that would be plastic. I'm supposing.
> When just a few years ago, having AI do these things was complete science fiction!
This is only because these projects only became consumer facing fairly recently. There was a lot of incremental progress in the academic language model space leading up to this. It wasn't as sudden as this makes it sound.
The deeper issue is that this future-looking analysis goes no deeper than drawing a line connecting a few points. COVID is a really interesting comparison, because in epidemiology the exponential model comes from us understanding disease transmission. It is also not actually exponential, as the population becomes saturated the transmission rate slows (it is worth noting that unbounded exponential growth doesn't really seem to exist in nature). Drawing an exponential line like this doesn't really add anything interesting. When you do a regression you need to pick the model that best represents your system.
This is made even worse because this uses benchmarks and coming up with good benchmarks is actually an important part of the AI problem. AI is really good at improving things we can measure so it makes total sense that it will crush any benchmark we throw at it eventually, but there will always be some difference between benchmarks and reality. I would argue that as you are trying to benchmark more subtle things it becomes much harder to make a benchmark. This is just a conjecture on my end but if something like this is possible it means you need to rule it out when modeling AI progress.
There are also economic incentives to always declare percent increases in progress at a regular schedule.
Will AI ever get this advanced? Maybe, maybe even as fast as the author says, but this just isn't a compelling case for it.
“Guy who personally benefits from AI hype says we aren’t in a bubble” - don’t we have enough of these already ??
(I'm the one who posted the URL, not the author of the post)
Julian Schrittwieser (author of this post) has been in AI for a long time, he was in the core team who worked on AlphaGo, AlphaZero and MuZero at DeepMind, you can see him in the AlphaGo movie. While it doesn't make his opinion automatically true, I think it makes it worth considering, especially since he's a technical person, not a CEO trying to raise money
"extrapolating an exponential" seems dubious, but I think the point is more that there is no clear sign of slowing down in models capabilities from the benchmarks, so we can still expect improvements
Benchmarks are notoriously easy to fake. Also, he doesn't need to be a CEO trying to raise money to have an incentive to push this agenda/narrative; he has a huge stock grant from Anthropic that will go to $0 when the bubble pops.
I think the author of this blog is not a heavy user of AI in real life. If you are, you know there are things AI is very good at, and things AI is bad at. AI may see exponential improvements in some aspects, but not in others. In the end, those "laggard" aspects will put a ceiling on its real-world performance.
I use AI in my coding for many hours each day. AI is great. But AI will not replace me in 2026 or in 2027. I have to admit I can't make projections many years in the future, because the pace of progress in AI is indeed breathtaking. But, while I am really bullish on AI, I am skeptical of claims that AI will be able to fully replace a human any time soon.
The author is an AI researcher at Anthropic: https://www.julian.ac/about/
He likely has substantial experience using AI in real life (particularly when it comes to coding).
How much better is AI-assisted coding than it was in September 2023?
This all depends on how you define "better".
I am an amateur programmer and tried to port a python 2.7 library to python 3 with GPT5 a few weeks ago.
After a few tries, I realized both myself and the model missed that a large part of the library is based on another library that was never ported to 3 either.
That doesn't stop GPT5 from trying to write the code as best it can with a library that doesn't exist for python 3.
That is the part we have made absolutely no progress on.
Of course, it can do a much better react crud app than in Sept 2023.
LLMs are amazing and impressive in one sense, and quite fugazi in another.
And today, COVID has infected 5000 quadrillion people!
Failing to understand the sigmoid, again
The sentiment of the comments here seems rather pessimistic. A perspective that balances both sides might be that mass adoption of a technology often lags the frontier capabilities, so I wouldn't expect AI to take over a majority of the jobs in GDPval within a couple of years, but it'll probably happen eventually.
There are still fundamental limitations in both the model and products using the model that restrict what AI is capable of, so it’s simultaneously true that AI can do cutting edge work in certain domains for hours while vastly underperforming in other domains for very small tasks. The trajectory of improvement of AI capabilities is also an unknown, where it’s easy to overestimate exponential trends due to unexpected issues arising but also easy to underestimate future innovations.
I don’t see the trajectory slowing down just yet, with more compute and larger models being used, and I can imagine AI agents will increasingly generate data that feeds back into improving larger models.
Failing to Understand Sigmoid functions, again?
Aside from the S-versus-exp issue, this is one of those areas where there's a disconnect between my personal professional experience with LLMs and the measures he's talking about. To me LLMs have a superficially impressive feel: they seem capable, but when they fail, they fail dramatically, in a way humans never would, and they never get anywhere near what's necessary to actually be helpful in finishing tasks, beyond providing some kind of gestalt template or prototype.
I feel as if there needs to be a lot more scrutiny of the types of evaluation tasks being used: whether they are actually representative of real-world demands or chosen to make the models look good, and also more focus on the types of failures. Looking through some of the evaluation tasks he links to that I'm more familiar with, they seem kind of basic? So not achieving parity with human performance is more significant than it seems. I also wonder, in a maximin sense, whether we need to start focusing more on worst-case failure performance rather than best-case goal performance.
LLMs are really amazing in some sense, and maybe this essay makes some points that are important to keep in mind as possibilities, but my general impression after reading it is it's kind of missing the core substance of AI bubble claims at the moment.
Where in nature/reality do we actually see exponential trends continue for long? It seems like they typically encounter a governing effect quite quickly.
The issue is that "quite quickly" can still be really slow on human timescales. Moore’s law has been brought up multiple times.
A lot of this post relies on the recent OpenAI result they call GDPval (link below). They note some limitations (lack of iteration in the tasks, among others) which are key complaints and possibly fundamental limitations of current models.
But more interesting is the 50% win rate stat that represents expert human performance in the paper.
That seems absurdly low, most employees don’t have a 50% success rate on self contained tasks that take ~1 day of work. That means at least one of a few things could be true:
1. The tasks aren’t defined in a way that makes real world sense
2. The tasks require iteration, which wasn’t tested, for real world success (as many tasks do)
While interesting and a very worthy research avenue, I think this paper is only the first in a still-early area of understanding how AI will interact with the real world, and it’s hard to project well from this one paper.
https://cdn.openai.com/pdf/d5eb7428-c4e9-4a33-bd86-86dd4bcf1...
That's not 50% success rate at completing the task, that's the win rate of a head-to-head comparison of an algorithm and an expert. 50% means the expert and the algorithm each "win" half the time.
For the METR rating (first half of the article), it is indeed 50% success rate at completing the task. The win rate only applies to the GDPval rating (second half of the article).
To the people who claim that we’re running out of data, I would just say: the world is largely undigitized. The Internet digitized a bunch of words but not even a tiny fraction of all that humans express every day. Same goes for sound in general. CCTV captures a lot of images, far more than social media, but it is poorly processed and also just a fraction of the photons bouncing off objects on earth. The data part of this equation has room to grow.
The 50% success rate is the problem. It means you can’t reliably automate tasks unattended. That seems to be where it becomes non-exponential. It’s like having cars that go twice as far as the last year but will only get you to your destination 50% of the time.
> It’s like having cars that go twice as far as the last year but will only get you to your destination 50% of the time
Nice analogy. All human progress is built on tight abstractions describing a well-defined machine model. Leaky abstractions over an undefined machine are useful too, but only as recommendations or for communication; it is harder to build on top of them. That is precisely why programming in English is a non-starter, and why math and science rely on formalism instead of plain English.
> Given consistent trends of exponential performance improvements over many years and across many industries, it would be extremely surprising if these improvements suddenly stopped.
The difference between exponential and sigmoid is often a surprise to the believers, indeed.
The model (of the world) is not the world.
Just because the model fits so far does not mean it will continue to fit.
These takes (both bears and bulls) are all misguided.
AI agents' performance depends heavily on the context / data / environment provided, and how that fits into the overall business process.
Thus, "agent performance" itself will be very unevenly distributed.
>they somehow jump to the conclusion that AI will never be able to do these tasks at human level
I don’t see that; I mostly see criticism that AI isn't up to the hype today. I think most people accept it will approach human ability; we just don’t believe the hype that it will be here tomorrow.
I’ve lived through enough AI winter in the past to know that the problem is hard, progress is real and steady, but we could see a big contraction in AI spending in a few years if the bets don’t pay off well in the near term.
The money going into AI right now is huge, but it carries real risks because people want returns on that investment soon, not down the road eventually.
> Instead, even a relatively conservative extrapolation of these trends suggests that 2026 will be a pivotal year for the widespread integration of AI into the economy
Integration into the economy takes time and investment. Unfortunately, AI applications don't have an easy adoption curve, except for the chatbot. Every other use case requires an expensive and risky integration into an existing workflow.
> By the end of 2027, models will frequently outperform experts on many tasks
Fixed tasks like tests, maybe. But the real world is not a fixed model; it requires constant learning through feedback.
Somewhat missed by many of the comments proclaiming that it's sigmoidal is that sigmoid curves exhibit significant growth after they stop looking exponential. Unless you think things have already hit a dramatic wall, you should probably assume further growth.
We should also expect compute to get cheaper at the same time, so that's performance increasing while costs fall. Even after performance flatlines, you would expect inference costs to keep dropping.
Without specific evidence, it's also unlikely that you have happened to pick the exact point on the sigmoid where things change.
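To put a rough number on "significant growth after it stops looking exponential", here is a tiny sketch (toy logistic, made-up constants) checking how much of the total rise is still ahead once the relative growth rate has already halved, i.e. at the inflection point:

    # Toy logistic with ceiling L = 100 and rate k = 1 (arbitrary constants).
    # The relative growth rate of a logistic is d(log y)/dt = k * (1 - y/L);
    # find where it has dropped to half its early value k, and check y there.
    import numpy as np

    k, L = 1.0, 100.0
    t = np.linspace(-10, 10, 2001)
    y = L / (1 + np.exp(-k * t))

    rel_growth = k * (1 - y / L)
    idx = np.argmin(np.abs(rel_growth - k / 2))
    print(f"share of ceiling reached when growth rate has halved: {y[idx] / L:.0%}")

It prints 50%: even at the point where the curve has clearly stopped behaving exponentially, half of the absolute growth is still to come.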
There’s no exponential improvement in Go or chess agents, or in self-driving agents. Not even in tiny mouse racing.
If there were, it would be such nice low-hanging fruit.
Maybe all of that happens all at once.
I’d just be honest and say most of it is completely fuzzy tinkering disguised as intellectual activity (yes, some of it is actual intellectual activity and yes we should continue tinkering)
There are rare individuals who have spent decades building up good intuition, and even that does not help much.
> Instead, even a relatively conservative extrapolation of these trends suggests that 2026 will be a pivotal year for the widespread integration of AI into the economy:
> Models will be able to autonomously work for full days (8 working hours) by mid-2026.
> At least one model will match the performance of human experts across many industries before the end of 2026.
> By the end of 2027, models will frequently outperform experts on many tasks.
First commandment of tech hype: the pivotal, groundbreaking singularity is always just 1-2 years away.
I mean seriously, why is that? Even when people like OP try to be principled and use seemingly objective evaluation data, they find that the BIG big thing is 1-2 years away.
Self driving cars? 1-2 years away.
AR glasses replacing phones? 1-2 years away.
All of us living our life in the metaverse? 1-2 years away.
Again, I have to commend OP on putting in the work with the serious graphs, but there’s something more at play here.
Is it purely a matter of data cherry-picking? Is it the unknown unknowns leaving the data-driven approaches completely blind to their medium/long-term limitations?
Many people seem to assert that "constant relative growth in capabilities/sales/whatever" is a totally reasonable (or even obvious or inevitable) prior assumption, and then point to "OMG relative growth produces an exponential curve!" as the rest of their argument. And at least the AI 2027 people tried to one-up that by asserting an increasing relative growth rate to produce a superexponential curve.
I'd be a fool to say that we'll ever hit a hard plateau in AI capabilities, but I'll have a hard time believing any projected exponential-growth-to-infinity until I see it with my own eyes.
Self-driving cars have existed for at least a year now. It only took a decade of "1 year away", but they exist now, and will likely require another decade of scaling up the hardware.
I think AGI is going to follow a similar trend: a decade of being "1 year away". Meanwhile, unlike self-driving, the industry is preemptively working on scaling up the hardware in parallel.
Because I need to specify an amount of time short enough that big investors will hand over a lot of money, yet long enough that I can extract a big chunk of it for myself before it all comes crashing down.
A couple of years is probably a bit tight, really, but I'm competing for that cash with other people, so the timeframe we make up is going to be about the lowest we think we can get away with.
I feel like there should be some takeaway from the fact that we have to come up with new and interesting metrics like “Length of a Task That Can Be Automated” in order to show that exponential growth is still happening. FWIW, it does seem like a good metric, but it also feels like you can often find some metric that’s improving exponentially even when the underlying function is leveling out.
It's the only benchmark I know of with a well-behaved scale. Benchmarks with, say, a score from 0-100% get saturated quite quickly, and further improvements on the metric become literally impossible. Even short of saturation, they behave very oddly at the extremes. To use them to show long-term exponential growth you need to chain benchmarks together, which is hard to make look credible.
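To make that concrete, here's a toy sketch (all numbers made up): assume a latent capability that doubles every year, a task-horizon metric proportional to that capability, and a bounded benchmark score that saturates as capability outgrows a fixed difficulty.

    # Hypothetical numbers only: contrast a bounded 0-100% score with a
    # task-horizon metric when the underlying capability doubles each year.
    for year in range(8):
        capability = 2.0 ** year                   # doubles yearly (assumption)
        horizon_minutes = 10 * capability          # unbounded horizon metric
        score = capability / (capability + 4.0)    # bounded score, difficulty = 4
        print(f"year {year}: horizon {horizon_minutes:6.0f} min, score {score:5.1%}")

The bounded score pins near 100% after a few years and stops carrying signal, while the horizon metric keeps doubling, which is roughly why a METR-style scale behaves better for long-run comparisons.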
From Nassim Taleb
"Unless you have confidence in the ruler’s reliability, if you use a ruler to measure a table you may also be using the table to measure the ruler."
Seems like that is exactly what we are doing.
This doesn't feel at all credible, because we're already well into the sigmoid part of the curve. I thought the GPT-5 release made that pretty obvious to everyone.
I'm bullish on AI, and I don't think we've even begun to understand the product implications, but the "large language models are in-context learners" phase has, for now, basically played out.
This extrapolates based on a good set of data points to predict when AI will reach significant milestones like being able to “work on tasks for a full 8 hours” (estimates by 2026). Which is ok - but it bears keeping https://xkcd.com/605/ in mind when doing extrapolation.
117 comments so far, and the word economics does not appear.
Any technology which produces more results for more inputs but does not get more efficient at larger scale runs into a money problem if it does not get hit by a physics problem first.
It is quite possible that we have already hit the money problem.
So the author has a clear conflict of interest with the contents of the blog, because he's an employee of Anthropic. But as for the post itself, showing the graph where OpenAI compares "frontier" models by pitting gpt-4o against o3-high is just disingenuous; o1 vs o3 would have been a closer fight between "frontier" models. Also, today I learned that there are people paid to benchmark AI models in terms of how close they are to "human" level, apparently even "expert" level, whatever that means. I'm not an LLM hater by any means, but I can confidently say they aren't experts in any field.
Even if computational power grows exponentially, we need to evaluate the utility of the additional computation. If that utility happens to increase only logarithmically with compute spend, then in the end we will observe just a linear increase in utility.
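A minimal sketch of that arithmetic (made-up numbers, assuming compute doubles yearly and utility scales with the log of compute spend):

    import numpy as np

    years = np.arange(0, 10)
    compute = 1e21 * 2.0 ** years      # hypothetical: compute doubles every year
    utility = np.log10(compute)        # hypothetical: utility ~ log of compute

    # Year-over-year utility gains are constant, i.e. utility grows only linearly.
    print(np.diff(utility))            # ~0.30 per year, every year

Exponential input times logarithmic returns comes out linear, which would look like steady but unspectacular progress from the outside.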
AI company employee whose livelihood depends on people continuing to pump money into AI writes a blog post trying to convince people to keep pumping more money into AI. Seems solid.
The "exponential" metric/study they include is pretty atrocious. Measuring AI capability by how long humans would take to do the task. By that definition existing computers are already super AGI - how long would it take humans to sort a list of a million numbers? Computers can do it in a fraction of a second. I guess that proves they're already AGI, right? You could probably fit an exponential curve to that as well, before LLMs even existed.
I'm less concerned about "parity with industry expert" and more concerned about "Error/hallucination rate compared to industry expert".
Without some guarantee of correctness, just posting the # of wins seems vacuous.
I don't think I have ever seen a page on HN where so many people missed the main point.
The phenomenon of people having trouble understanding the implications of exponential progress is really well known. Well known, I think, by many people here.
And yet an alarming number of comments here interpret small pauses as serious trend breakers. False assumptions that we are anywhere near the limits of computing power relative to fundamental physics limits. Etc.
Recent progress, which is unprecedented in speed looking backward, is dismissed because people have acclimatized to change so quickly.
The title of the article "Failing to Understand the Exponential, Again" is far more apt than I could have imagined, on HN.
See my other comments here for specific arguments. See lots of comments here for examples of those who are skeptical of a strong inevitability here.
The "information revolution" started the first time design information was separated from the thing it could construct. I.e. the first DNA or perhaps RNA life. And it has unrelentingly accelerated from there for over 4.5 billion years.
The known physics limits of computation per gram are astronomical. We are nowhere near any hard limit. And that is before any speculation of what could be done with the components of spacetime fragments we don't understand yet. Or physics beyond that.
The information revolution has hardly begun.
With all humor, this was the last place I expected people to not understand how different information technology progresses vs. any other kind. Or to revert to linear based arguments, in an exponentially relevant situation.
If there is any S-curve for information technology in general, it won't be apparent until long after humans are a distant memory.
I'm a little surprised too. A lot of the arguments are along the lines of "but LLMs aren't very good". But LLMs are really just a brief phase in the information revolution you mention, and they will be superseded.
To me, saying we won't get AGI because LLMs aren't suitable is like saying we weren't going to get powered flight because steam engines weren't suitable. Fair enough, they weren't, but they evolved into internal combustion engines, which were. Something like that will happen.
I didn't plot it, but I had the impression the Aider benchmark success rates for SOTA over time were a hockey curve.
Like the improvements between 60 and 70 felt much faster than those between 80 and 90.
OP failing to understand S-curves again...
I think the first comment on the article put it best: With COVID, researchers could be certain that exponential growth was taking place because they knew the underlying mechanisms of the growth. The virus was self-replicating, so the more people were already infected, the faster would new infections happen.
(Even this dynamic would only go on for a certain time and eventual slow down, forming an S-curve, when the virus could not find any more vulnerable persons to continue the rate of spread. The critical question was of course if this would happen because everyone was vaccinated or isolated enough to prevent infection - or because everyone was already infected or dead)
With AI, there is no such underlying mechanism. There is the dream of the "self-improving AI" where either humans can make use of the current-generation AI to develop the next-generation AI in a fraction of the time - or where the AI simply creates the next generation on its own.
If this dream were reality, it could be genuine exponential growth, but from all I know, it isn't. Coding agents speed up a number of bespoke programming tasks, but they do not exponentially speed up development of new AI models. Yes, we can now quickly generate large corpora of synthetic training data and use them for distillation. We couldn't do that before - but a large part of the training data discussion is about the observation that synthetic data can not replace real data, so data collection remains a bottleneck.
There is one point where a feedback loop does happen, and that is the hype curve: initial models produced extremely impressive results compared to everything we had before, which caused enormous hype and unlocked investments that provided more resources for the development of the next model, which then delivered even better results. But it's obvious that this kind of feedback loop will eventually end when no more additional capital is available and diminishing returns set in.
Then we will once again be in the upper part of the S-curve.
Another 'number go up' analyst. Yes, models are objectively better at tasks. Please also account for the fact that hundreds of billions of dollars are being poured into making them better; you could even call it a technology race. Once the money avalanche runs its course, I and many others expect 'the exponential' to be followed by an implosion or a correction in growth. Data and training are not what LLMs crave. Piles of cash are what LLMs crave.
Good article, the METR metric is very interesting. See also Leopold Aschenbrenner's work in the same vein:
https://situational-awareness.ai/from-gpt-4-to-agi/
IMO this approach ultimately asks the wrong question. Every exponential trend in history has eventually flattened out. Every. single. one. Two rabbits would create a population with a mass greater than the Earth in a couple of years if that trend continues indefinitely. The left hand side of a sigmoid curve looks exactly like exponential growth to the naked eye... until it nears the inflection point at t=0. The two curves can't be distinguished when you only have noisy data from t<0.
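One way to make that indistinguishability concrete is a little sketch like this (arbitrary constants; the fit only sees data from before the inflection at t=0):

    import numpy as np

    t = np.arange(-8, -1)                        # observations only from t < 0
    logistic = 100 / (1 + np.exp(-t))            # the "true" S-curve, ceiling 100

    k, log_a = np.polyfit(t, np.log(logistic), 1)  # exponential fit in log space
    exponential = np.exp(log_a + k * t)

    print("max relative error on the observed window:",
          f"{np.max(np.abs(exponential / logistic - 1)):.1%}")
    t_future = 4
    print("logistic at t=4:   ", 100 / (1 + np.exp(-t_future)))
    print("exponential at t=4:", np.exp(log_a + k * t_future))

Within the observed window the exponential tracks the logistic to within about five percent, comfortably inside typical measurement noise, yet the two extrapolations differ by more than an order of magnitude just a few time steps past the inflection.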
A better question is, "When will the curve flatten out?", and that can only be addressed by looking outside the dataset at which constraints will eventually make growth impossible. For example, for Moore's law, we could examine the quantum limits on how small a single transistor can be. You have to analyze the context, not just do the line-fitting exercise.
The only really interesting question in the long term is if it will level off at a level near, below, or above human intelligence. It doesn't matter much if that takes five years or fifty. Simply looking at lines that are currently going up and extending them off the right side of the page doesn't really get us any closer to answering that. We have to look at the fundamental constraints of our understanding and algorithms, independent of hardware. For example, hallucinations may be unsolvable with the current approach and require a genuine paradigm shift to solve, and paradigm shifts don't show up on trend lines, more or less by definition.
There are no exponentials in nature. Everything is finite.
> - Models will be able to autonomously work for full days (8 working hours) by mid-2026.
> - At least one model will match the performance of human experts across many industries before the end of 2026.
> - By the end of 2027, models will frequently outperform experts on many tasks.
I’ve seen a lot of people make predictions like this, and it will be interesting to see how they turn out. But my question is: what should happen to a person’s credibility if their prediction turns out to be wrong? Should they lose credibility for future predictions, so that we no longer take them seriously? Or is that way too harsh? Should there be reputational consequences for making bad predictions? I guess this is more of a general question, not strictly AI-related.
> Should the person lose credibility for future predictions and we no longer take them seriously
If this were the case, almost every sell-side analyst would have been blacklisted by now. It's more about entertainment than facts, sort of like astrology.
Pure Cope from a partner at Anthropic. However, I _do_ agree AI is comparable to COVID, but not in the way our author intends.
Seems like the right place to ask, with ML enthusiasts gathered in one place discussing curves and the things that bend them: what has the potential to obsolete transformers and diffusion models? Is it something old that people took a second look at once LLMs blew up? Something new? Something in between?
> The evaluation tasks are sourced from experienced industry professionals (avg. 14 years' experience), 30 tasks per occupation for a total of 1320 tasks. Grading is performed by blinded comparison of human and model-generated solutions, allowing for both clear preferences and ties.
It's important to carefully scrutinize the tasks to check that they actually reflect the work that is unique to industry professionals. I just looked quickly at the nursing ones (my wife is a nurse) and half of them were creating presentations, drafting reports, and the like, which is the primary strength of LLMs but a very small portion of nursing duties.
The computer programming tests are more straightforward. I'd take the other ones with a grain of salt for now.
Measuring "how long" an AI can work for seems bizarre to me.
It's a computer program. What does it even mean that soon it "will be able to work 8-hour days"?
Many of the "people don't understand Exponential functions" posts are ultimately about people not understanding logistic functions. Because most things in reality that seemingly grow exponentially will eventually, unevitably taper off at some point when the cost for continued growth gets so high, accelerated growth can't be supported anymore.
Viruses can only infect so many people, for example. For the growth to be truly exponential, you would need an infinite supply of people.
> Again we can observe a similar trend, with the latest GPT-5 already astonishingly close to human performance:
Yes, but only if you measure "performance" as "preferred over the other option more than 50% of the time", which is a terrible way to measure performance, especially for bullshitting AI.
Imagine comparing chocolate brands. One is tastier than the other one 60% of the time. Clear winner right? Yeah except it's also deadly poisonous 5% of the time. Still tastier on average though!
"Failing to Understand the Sigmoid, Again"
This guy isn’t even wrong. Sure, these models are getting faster, but they are barely getting better at actual reasoning, if at all. Who cares if a model can give me a bullshit answer in five minutes instead of ten? It’s still bullshit.
Ah, employee of an AI company is telling us the technology he's working on and is directly financially interested in hyping will... grow forever and be amazing and exponential and take over the world. And everyone who doesn't believe this employee of AI company hyping AI is WRONG about basics of math.
I absolutely would NOT ever expect such a blog post.
/s.
That is a complete strawman: you made up "forever growth" and then argued against it. The OP is saying that in the short term it makes more sense to assume exponential growth continues than to assume it will flatten out any moment now.
Failing to acknowledge we are in a bigger and more dangerous bubble, again.
If AI were so great, why have all the AI-generated curl HackerOne submissions been rejected? Slop is not a substitute for skill.