Failing to Understand the Exponential, Again

julian.ac

154 points by lairv 4 days ago


hnlmorg - 4 days ago

Just because something exhibits exponential growth at one point in time, that doesn't mean the subject is capable of sustaining exponential growth.

Their Covid example is a great counterargument to their own point, in that Covid isn't still growing exponentially.

Where the AI skeptics (or even just pragmatists, like myself) chime in is saying "yeah, AI will improve. But LLMs are a limited technology that cannot fully bridge the gap between what they're producing now and what the 'hypists' claim they'll be able to do in the future."

People like Sam Altman know ChatGPT is a million miles away from AGI. But their primary goal is to make money. So they have to convince VCs that their technology has a longer period of exponential growth than it actually will have.

coldtea - 4 days ago

>People notice that while AI can now write programs, design websites, etc, it still often makes mistakes or goes in a wrong direction, and then they somehow jump to the conclusion that AI will never be able to do these tasks at human levels, or will only have a minor impact. When just a few years ago, having AI do these things was complete science fiction!

Both things can be true, since they're orthogonal.

Having AI do these things was complete fiction 10 years ago. And after 5 years of LLM AI, people are starting to see serious limits and stunted growth with the current LLM approaches, while also seeing that nobody has proposed a serious contender to that approach.

Similarly, going to the moon was science fiction 100 years ago. And yet we're now not only not on Mars, but 50+ years without a new manned moon landing. Same for airplanes: science fiction in 1900, mostly stale innovation-wise for the last 30 years.

A lot of curves can fit an exponential line plot, without the progress going forward being exponential.

We would have 1-trillion-transistor CPUs by now if we'd kept following Moore's "exponential curve".

crazygringo - 4 days ago

> Given consistent trends of exponential performance improvements over many years and across many industries, it would be extremely surprising if these improvements suddenly stopped.

I'm sure people were saying that about commercial airline speeds in the 1970s too.

But a lot of technologies turn out to be S-shaped, not purely exponential, because there are limiting factors.

With LLMs at the moment, the limiting factors might turn out to be training data, cost, or inherent limits of the transformer approach and the fact that LLMs fundamentally cannot learn outside of their context window. Or a combination of all of these.

The tricky thing about S-curves is that you never know where you are on one until the slowdown actually happens. Are we still only in the beginning of the growth part? Or the middle, where improvement is linear rather than exponential? And then the growth starts slowing...

konmok - 4 days ago

As they say, every exponential is a sigmoid in disguise. I think the exponential phase of growth for LLM architectures is drawing to a close, and fundamentally new architectures will be necessary for meaningful advances.

I'm also not convinced by the graphs in this article. OpenAI is notoriously deceptive with their graphs, and as Gary Marcus has already noted, that METR study comes with a lot of caveats: [https://garymarcus.substack.com/p/the-latest-ai-scaling-grap...]

HexDecOctBin - 4 days ago

Exponential curves don't last for long, fortunately, or the universe would have turned into a quark soup. The example of COVID is especially ironic, considering it stopped being a real concern within 3 years of its advent despite the exponential growth early on.

Those who understand exponentials should also try to understand stock and flow.

podgorniy - 4 days ago

> By the end of 2027, models will frequently outperform experts on many tasks.

In passing the quizzes.

> Models will be able to autonomously work for full days (8 working hours) by mid-2026.

Who will carry responsibility for the consequences of these models' errors? What tools will be available to that responsible _person_?

--

Techno-optimists will be optimistic. Techno-pessimists will be pessimistic.

The processes we're discussing have their own limiting factors, which no one mentions. Why not discuss what exactly makes the graph go up, and what holds it back from going exponential? Why not discuss the inherent limitations of the LLM architecture? Or the legal perspective on AI agency?

Thus we're left discussing the results of AI models passing tests, and people's perceptions of other people's opinions.

Ambolia - 4 days ago

I will worry when I see startups competing on products with companies 10x, 100x, or 1000x their size. Like a small team producing a Photoshop replacement. So far I haven't seen anything like that. Big companies don't seem to be launching new products faster either, or fixing some of their products that have been broken for a long time (MS Teams...).

AI obviously makes some easy things much faster and maybe helps with boilerplate; we have yet to see this translate into real productivity.

Ravus - 3 days ago

Exponential curves happen when a quantity's growth rate is a linear function of its own value. In practice they're all going to be logistic, but you can ignore that as long as you're far away from the cap of whatever factor limits growth.
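
To make that concrete, here is a minimal sketch (toy numbers, nothing fitted to real AI data) comparing the closed-form solutions of dN/dt = r*N and dN/dt = r*N*(1 - N/K): while N is far below the cap K, the two are numerically almost indistinguishable.

    import numpy as np

    r, K, N0 = 1.0, 1000.0, 1.0   # growth rate, cap, starting value (all invented)
    t = np.linspace(0, 5, 6)

    exponential = N0 * np.exp(r * t)                    # solves dN/dt = r*N
    logistic = K / (1 + (K / N0 - 1) * np.exp(-r * t))  # solves dN/dt = r*N*(1 - N/K)

    print(np.round(exponential, 1))  # [  1.    2.7   7.4  20.1  54.6 148.4]
    print(np.round(logistic, 1))     # [  1.    2.7   7.3  19.7  51.8 129.3]

Only near the cap do the two curves separate; before that, the data alone can't tell you which one you're on.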

So what are the things that could cause "AI growth" (for some suitable definition of it) to feed back on its own value? The plausible ones I see are:

- growing AI capabilities spur additional AI capex

- AI could be used to develop better AIs

The first one rings true, but it is most definitely hitting its limit, since US capex into the sector cannot grow 100-fold (and probably cannot grow 4-fold either).

The second one is, to my knowledge, not really a thing.

So unless AI can start improving itself, or there is a self-feeding mechanism that I have missed, we're near the fun phase of the logistic.

j_maffe - 4 days ago

It's interesting that he brings up the example of "exponential" growth in the case of COVID infections even though it was actually logistic growth[1] that saturates once resources get exhausted. What makes AI different?

[1] https://en.wikipedia.org/wiki/Logistic_function#Modeling_ear...

noiv - 4 days ago

> Again we can observe a similar trend, with the latest GPT-5 already astonishingly close to human performance:

I have issues with "human performance" as a single data point at a time when education keeps excelling in some countries and degrading in others.

How far away are we from saying "better than X percent of humans"?

entee - 4 days ago

A lot of this post relies on the recent OpenAI result they call GDPval (link below). They note some limitations (lack of iteration in the tasks, among others), which are key complaints and possibly fundamental limitations of current models.

But more interesting is the 50% win rate stat that represents expert human performance in the paper.

That seems absurdly low: most employees don't have a 50% success rate on self-contained tasks that take ~1 day of work. That means at least one of a few things could be true:

1. The tasks aren’t defined in a way that makes real world sense

2. The tasks require iteration, which wasn’t tested, for real world success (as many tasks do)

I think that, while interesting and a very worthy research avenue, this paper is only the first in a still-early area of understanding how AI will interact with the real world, and it's hard to project well from this one paper.

https://cdn.openai.com/pdf/d5eb7428-c4e9-4a33-bd86-86dd4bcf1...

Earw0rm - 4 days ago

You'd think that boosters for a technology whose very foundations rely on the sigmoid and tanh functions used as neuron activation functions would intuitively get this...

poopiokaka - 4 days ago

“Guy who personally benefits from AI hype says we aren’t in a bubble” - don’t we have enough of these already ??

stickfigure - 4 days ago

"Models will be able to autonomously work for full days (8 working hours)" does not make them equivalent to a human employee. My employees go home and come back retaining context from the previous day; they get smarter every month. With Claude Code I have to reset the context between bite-sized tasks.

To replace humans in my workplace, LLMs need some equivalent of neuroplasticity. Maybe it's possible, but it would require some sort of shift in the approach that may or may not be coming.

morsecodist - 4 days ago

> When just a few years ago, having AI do these things was complete science fiction!

This is only because these projects became consumer-facing fairly recently. There was a lot of incremental progress in the academic language-model space leading up to this. It wasn't as sudden as this makes it sound.

The deeper issue is that this future-looking analysis goes no deeper than drawing a line connecting a few points. COVID is a really interesting comparison, because in epidemiology the exponential model comes from our understanding of disease transmission. It is also not actually exponential: as the population becomes saturated, the transmission rate slows (it is worth noting that unbounded exponential growth doesn't really seem to exist in nature). Drawing an exponential line like this doesn't really add anything interesting. When you do a regression you need to pick the model that best represents your system.
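
To illustrate that point (a toy SIR model with invented parameters, not a real COVID fit): infections multiply at a near-constant factor while almost everyone is susceptible, then the curve rolls over on its own as the susceptible pool drains.

    # Toy discrete-time SIR model; population and rates are invented
    N, beta, gamma = 1_000_000, 0.4, 0.1  # population, transmission, recovery
    S, I, R = N - 1.0, 1.0, 0.0
    for day in range(161):
        new_inf = beta * S * I / N        # transmission requires susceptible hosts
        new_rec = gamma * I
        S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec
        if day % 40 == 0:
            print(day, round(I))
    # early on I grows by ~1.3x per day (looks exponential); once S/N
    # approaches gamma/beta the epidemic peaks and collapses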

This is made even worse because it relies on benchmarks, and coming up with good benchmarks is actually an important part of the AI problem. AI is really good at improving on things we can measure, so it makes total sense that it will eventually crush any benchmark we throw at it, but there will always be some difference between benchmarks and reality. I would argue that as you try to benchmark more subtle things, it becomes much harder to make a benchmark. This is just a conjecture on my end, but if something like this is possible, you need to rule it out when modeling AI progress.

There are also economic incentives to declare percent increases in progress on a regular schedule.

Will AI ever get this advanced? Maybe, maybe even as fast as the author says, but this just isn't a compelling case for it.

arthurofbabylon - 3 days ago

I am flabbergasted by the naivety around predicting the future. While we have hints and suggestions, our predictions are best expressed as ranges of possibilities with varying weights. The hyperbolic among us like to pretend that predictions come in the form of precise lines of predetermined direction and curve; how foolish!

Predicting exponential growth is exceptionally difficult. Asymptotes are ordinary, and they often are not obvious until circumstances make them appear (in other words, they are commonly unpredictable).

(I do agree with the author that the potential of LLMs remains underestimated by much of the public; however, I cannot hang around such abysmal reasoning.)

derbOac - 4 days ago

Aside from the S-versus-exp issue, this is one of those areas where there's a disconnect between my personal professional experience with LLMs and the criterion measures he's talking about. LLMs to me have a superficially impressive feel: they seem impressive in their capabilities, but when they fail, they fail dramatically, in a way humans never would, and they never get anywhere near what's necessary to actually be helpful in finishing tasks, beyond serving as some kind of gestalt template or prototype.

I feel as if there needs to be a lot more scrutiny of the types of evaluation tasks being used: whether they are actually representative of real-world demands, or whether they are made easy so the models look good, and also more focus on the types of failures. Looking through some of the evaluation tasks he links to that I'm more familiar with, they seem kind of basic? So not achieving parity with human performance is more significant than it seems. I also wonder, in some kind of maxmin sense, whether we need to start focusing more on worst-case failure performance rather than best-case goal performance.

LLMs are really amazing in some sense, and maybe this essay makes some points that are important to keep in mind as possibilities, but my general impression after reading it is that it's kind of missing the core substance of the AI-bubble claims at the moment.

airstrike - 4 days ago

Failing to understand the sigmoid, again

doug_durham - 4 days ago

The 50% success rate is the problem. It means you can't reliably automate tasks unattended. That seems to be where it becomes non-exponential. It's like having cars that go twice as far as last year's but only get you to your destination 50% of the time.

credit_guy - 4 days ago

I think the author of this blog is not a heavy user of AI in real life. If you are, you know there are things AI is very good at, and things AI is bad at. AI may see exponential improvements in some aspects, but not in others. In the end, those "laggard" aspects of AI will put a ceiling on its real-world performance.

I use AI in my coding for many hours each day. AI is great. But AI will not replace me in 2026 or in 2027. I have to admit I can't make projections many years in the future, because the pace of progress in AI is indeed breathtaking. But, while I am really bullish on AI, I am skeptical of claims that AI will be able to fully replace a human any time soon.

Isamu - 4 days ago

>they somehow jump to the conclusion that AI will never be able to do these tasks at human level

I don't see that. I mostly see criticism that AI is not up to the hype today. I think most people know it will approach human ability; we just don't believe the hype that it will be here tomorrow.

I've lived through enough AI winters in the past to know that the problem is hard and progress is real and steady, but we could see a big contraction in AI spending in a few years if the bets don't pay off well in the near term.

The money going into AI right now is huge, but it carries real risks because people want returns on that investment soon, not down the road eventually.

littlestymaar - 3 days ago

It's funny because the author doesn't realize that this sentence at the beginning undermines his entire argument:

> Or they see two consecutive model releases and don’t notice much difference in their conversations, and they conclude that AI is plateauing and scaling is over.

The reason we fail to notice the difference between consecutive models now is that the progress isn't in fact exponential. Humans tend to have logarithmic perception, which means we only appreciate progress when it is exponential (for instance, you'd be very happy to get a $500 raise if you are living on the minimum wage, but you wouldn't even call that "a raise" on an SV engineer's salary).
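
In toy numbers (a log-ratio model of perception; the salary figures are invented): the same $500 reads as more than ten times "bigger" on the smaller salary.

    import math

    # Weber-Fechner-style toy: perceived change ~ log of the ratio
    print(math.log(15_500 / 15_000))    # ~0.033:  $500 raise on minimum wage
    print(math.log(200_500 / 200_000))  # ~0.0025: same raise on an SV salary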

AI models have been improving a ton for the past three years, in many directions, but the rate of progress is definitely not exponential. It's not emergent either, as the focus is now specifically directed at solving particular problems (both riddles and real-world problems) thanks to trillions of tokens of high-quality synthetic data.

On topics that aren't explicitly being worked on, progress has been minimal or even negative (for instance, many people still use the one-year-old Mistral Nemo for creative writing because the more recent models have all been STEMmaxxed).

bwfan123 - 4 days ago

> Instead, even a relatively conservative extrapolation of these trends suggests that 2026 will be a pivotal year for the widespread integration of AI into the economy

Integration into the economy takes time and investment. Unfortunately, AI applications don't have an easy adoption curve - except for the chatbot. Every other use case requires an expensive and risky integration into an existing workflow.

> By the end of 2027, models will frequently outperform experts on many tasks

Fixed tasks like tests, maybe. But the real world is not a fixed model. It requires constant learning through feedback.

kaashif - 4 days ago

And today, COVID has infected 5000 quadrillion people!

conartist6 - 3 days ago

Wow, an exponential trendline! I guess billions of years of evolution can just give up and go home, because we have rigged the game, my friends. At this rate we will create an AI that can do a task 10 years long! And then soon after that, 100 years long! And that's that. Humans will be kept as pets because that's all we will be good for. QED

atoav - 4 days ago

Many of the "people don't understand exponential functions" posts are ultimately about people not understanding logistic functions. Most things in reality that seemingly grow exponentially will eventually, inevitably, taper off at some point, when the cost of continued growth gets so high that accelerated growth can't be supported anymore.

Viruses can only infect so many people, for example. For the growth to be truly exponential, you would need infinitely many people.

IshKebab - 4 days ago

> Again we can observe a similar trend, with the latest GPT-5 already astonishingly close to human performance:

Yes, but only if you measure "performance" as "better than the other option more than 50% of the time", which is a terrible way to measure performance, especially for bullshitting AI.

Imagine comparing chocolate brands. One is tastier than the other one 60% of the time. Clear winner right? Yeah except it's also deadly poisonous 5% of the time. Still tastier on average though!

_fizz_buzz_ - 4 days ago

Failing to Understand Sigmoid functions, again?

gyomu - 4 days ago

> Instead, even a relatively conservative extrapolation of these trends suggests that 2026 will be a pivotal year for the widespread integration of AI into the economy:

> Models will be able to autonomously work for full days (8 working hours) by mid-2026. At least one model will match the performance of human experts across many industries before the end of 2026.

> By the end of 2027, models will frequently outperform experts on many tasks.

First commandment of tech hype: the pivotal, groundbreaking singularity is always just 1-2 years away.

I mean seriously, why is that? Even when people like OP try to be principled and use seemingly objective evaluation data, they find that the BIG big thing is 1-2 years away.

Self driving cars? 1-2 years away.

AR glasses replacing phones? 1-2 years away.

All of us living our life in the metaverse? 1-2 years away.

Again, I have to commend OP on putting in the work with the serious graphs, but there’s something more at play here.

Is it purely a matter of data cherry-picking? Is it unknown unknowns leaving the data-driven approaches completely blind to their medium/long-term limitations?

ctrlp - 4 days ago

Where in nature/reality do we actually see exponential trends continue for long? It seems like they typically encounter a governing effect quite quickly.

slaucon - 4 days ago

I feel like there should be some takeaway from the fact that we have to come up with new and interesting metrics like "length of a task that can be automated" in order to show that exponential growth is still happening. Fwiw, it does seem like a good metric, but it also feels like you can often find some metric that's improving exponentially even when the base function is leveling out.
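
That last point is easy to make concrete (a toy construction, not what METR actually does): take a benchmark score that visibly saturates at 100% and derive its odds ratio, and the derived metric grows exponentially forever.

    import numpy as np

    t = np.arange(0, 12, 2)        # "time" in arbitrary units
    score = 1 / (1 + np.exp(-t))   # base metric levels off: 0.5, 0.88, 0.98, ...
    odds = score / (1 - score)     # derived metric is exactly e^t: 1, 7.4, 54.6, ...
    print(np.round(score, 3))
    print(np.round(odds, 1))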

alyxya - 4 days ago

The sentiment of the comments here seems rather pessimistic. A perspective that balances both sides might be that mass adoption of a technology often lags behind frontier capabilities, so I wouldn't expect AI to take over a majority of those jobs in GDPval within a couple of years, but it'll probably happen eventually.

There are still fundamental limitations in both the model and the products using the model that restrict what AI is capable of, so it's simultaneously true that AI can do cutting-edge work in certain domains for hours while vastly underperforming in other domains on very small tasks. The trajectory of improvement of AI capabilities is also an unknown, where it's easy to overestimate exponential trends due to unexpected issues arising, but also easy to underestimate future innovations.

I don’t see the trajectory slowing down just yet with more compute and larger models being used, and I can imagine AI agents will increasingly give their data to further improve larger models.

andy99 - 4 days ago

This doesn't feel at all credible, because we're already well into the sigmoid part of the curve. I thought the GPT-5 release made that pretty obvious to everyone.

I'm bullish on AI - I don't think we've even begun to understand the product implications - but the "large language models are in-context learners" phase has, for now, basically played out.

justlikereddit - 3 days ago

The exponential progress argument is frequently also misconstrued as a

>"we will get there by monotonously doing more of what we did previously"

Take the article's metric of time spent as an independent SWE. This is a rather new metric for measuring AI capabilities; it's also a good metric, since it is directly measurable in a quantified way, unlike nebulous goalposts such as "AGI/ASI".

It also doesn't necessarily predict any upheaval, which I also think is a good trait in a metric: we know a model will be better when it hits 8 or 16 hours, but we can skip the hype and the prophecies of civilizational transformation that are attached to terminology like "AGI/ASI".

Now, the caveat is that an SWE-time metric is useful at the moment because it's on an intra-day timescale. But once we're comparing 48-hour vs 54-hour SWE-time models, we can easily end up chasing abstractions that have little to no explanatory power as to how good the AI really is, what counts as a proper incremental improvement, and what is just a benchmark number that may or may not be artificial.

The same can be said of math-olympiad scores and many of the existing AI benchmarks.

In the past there existed a concept of narrow AI. We could take task A, make a narrow AI become good at it. But we would expect a different application to be needed for task B.

Now we have generalist AI, and we take the generalist AI and make it good at task A because that is the flavor-of-the-month metric. But maybe that doesn't translate into improving task B, which someone will come around to improving when it becomes the flavor of the month.

The conclusion? There's probably no good singular metric to get stuck on and say

"this is it, this graph is the one, watch it go exponential and bring forth God"

We will instead skip, hop, and jump between task- or category-specific metrics that are deemed significant at the moment and, arms-race style, pump them up until their relevance fades.

insane_dreamer - 3 days ago

> The evaluation tasks are sourced from experienced industry professionals (avg. 14 years' experience), 30 tasks per occupation for a total of 1320 tasks. Grading is performed by blinded comparison of human and model-generated solutions, allowing for both clear preferences and ties.

It's important to carefully scrutinize the tasks to understand whether they actually reflect work that is unique to industry professionals. I just looked quickly at the nursing ones (my wife is a nurse) and half of them were creating presentations, drafting reports, and the like, which is the primary strength of LLMs but a very small portion of nursing duties.

The computer programming tests are more straightforward. I'd take the other ones with a grain of salt for now.

thegrim33 - 4 days ago

AI company employee whose livelihood depends on people continuing to pump money into AI writes a blog post trying to convince people to keep pumping more money into AI. Seems solid.

The "exponential" metric/study they include is pretty atrocious. Measuring AI capability by how long humans would take to do the task. By that definition existing computers are already super AGI - how long would it take humans to sort a list of a million numbers? Computers can do it in a fraction of a second. I guess that proves they're already AGI, right? You could probably fit an exponential curve to that as well, before LLMs even existed.

pyrale - 4 days ago

> Given consistent trends of exponential performance improvements over many years and across many industries, it would be extremely surprising if these improvements suddenly stopped.

The difference between exponential and sigmoid is often a surprise to the believers, indeed.

silvestrov - 4 days ago

The model (of the world) is not the world.

Just because the model fits so far does not mean it will continue to fit.

nextworddev - 4 days ago

These takes (both bears and bulls) are all misguided.

AI agents' performance depends heavily on the context / data / environment provided, and how that fits into the overall business process.

Thus, "agent performance" itself will be very unevenly distributed.

zdw - 4 days ago

I'm less concerned about "parity with industry expert" and more concerned about "Error/hallucination rate compared to industry expert".

Without some guarantee of correctness, just posting the # of wins seems vacuous.

IanCal - 4 days ago

Somewhat missed by the many comments proclaiming that it's sigmoidal is that sigmoid curves exhibit significant growth after they stop looking exponential. Unless you think things have already hit a dramatic wall, you should probably assume further growth.
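
Rough numbers on a toy logistic (nothing calibrated to any AI benchmark) make the point: the spot where the curve stops looking exponential is the inflection point, and half of all the eventual growth is still ahead of it.

    import numpy as np

    cap = 100.0
    f = lambda t: cap / (1 + np.exp(-t))  # toy logistic, inflection at t = 0
    print(f(0))                           # 50.0 -- the "exponential look" ends here
    print(f(2), f(4))                     # ~88.1, ~98.2 -- plenty of growth left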

We should probably expect compute to get cheaper at the same time, so that's performance increasing alongside falling costs. Even after performance flatlines, you would expect inference costs to keep falling.

Without specific evidence, it's also unlikely that you've happened to pick the exact point on the sigmoid where things change.

vonnik - 3 days ago

To the people who claim that we’re running out of data, I would just say: the world is largely undigitized. The Internet digitized a bunch of words but not even a tiny fraction of all that humans express every day. Same goes for sound in general. CCTV captures a lot of images, far more than social media, but it is poorly processed and also just a fraction of the photons bouncing off objects on earth. The data part of this equation has room to grow.

vjerancrnjak - 4 days ago

There’s no exponential improvement in go or chess agents, or car driving agents. Even tiny mouse racing.

If there is, it would be such nice low hanging fruit.

Maybe all of that happens all at once.

I’d just be honest and say most of it is completely fuzzy tinkering disguised as intellectual activity (yes, some of it is actual intellectual activity and yes we should continue tinkering)

There are rare individuals that spent decades building up good intuition and even that does not help much.

loloquwowndueo - 4 days ago

This extrapolates from a good set of data points to predict when AI will reach significant milestones, like being able to "work on tasks for a full 8 hours" (estimated by 2026). Which is OK - but it bears keeping https://xkcd.com/605/ in mind when doing extrapolation.

Kye - 3 days ago

Seems like the right place to ask, with ML enthusiasts gathered in one place discussing curves and the things that bend them: what's the thing with the potential to obsolete transformers and diffusion models? Is it something old that people noticed once LLMs blew up? Something new? Something in between?

chad1n - 4 days ago

So the author has a clear conflict of interest with the contents of the blog, because he's an employee of Anthropic. But regarding the "blog" itself: showing the graph where OpenAI compares "frontier" models and pits gpt-4o against o3-high is just disingenuous; o1 vs o3 would have been a closer fight between "frontier" models. Also, today I learned that there are people paid to benchmark AI models by how close they are to "human" level, apparently even "expert" level, whatever that means. I'm not an LLM hater by any means, but I can confidently say that they aren't experts in any field.

Rexxar - 4 days ago

Even if computational power evolves exponentially, we need to evaluate the utility of the additional computation. And if utility happens to increase logarithmically with compute spend, it's possible that in the end we will observe just a linear increase in utility.
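
As a toy version of that arithmetic (all numbers invented): exponential growth in compute composed with logarithmic utility gives a utility curve that climbs by a constant amount per year, i.e. linearly.

    import math

    compute = [10 ** year for year in range(1, 6)]  # 10x more compute every year
    utility = [math.log10(c) for c in compute]      # log-shaped returns on compute
    print(utility)                                  # [1.0, 2.0, 3.0, 4.0, 5.0]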

dsr_ - 4 days ago

117 comments so far, and the word economics does not appear.

Any technology which produces more results for more inputs but does not get more efficient at larger scale runs into a money problem if it does not get hit by a physics problem first.

It is quite possible that we have already hit the money problem.

11101010001100 - 3 days ago

Well, the article is not off to a good start. COVID-19 is modeled by an SIR dynamical system, which, at times, is approximately exponential.

At times.

k__ - 4 days ago

I didn't plot it, but I had the impression the Aider benchmark success rates for SOTA over time were a hockey curve.

Like the improvements between 60 and 70 felt much faster than those between 80 and 90.

YeGoblynQueenne - 2 days ago

>> "Train adversarially robust image model".

Should be easy to check a couple years down the line.

lapcat - 4 days ago

It should be noted that the article author is an AI researcher at Anthropic and therefore benefits financially from the bubble: https://www.julian.ac/about/

> The current discourse around AI progress and a supposed “bubble” reminds me a lot of the early weeks of the Covid-19 pandemic. Long after the timing and scale of the coming global pandemic was obvious from extrapolating the exponential trends, politicians, journalists and most public commentators kept treating it as a remote possibility or a localized phenomenon.

That's not what I remember. On the contrary, I remember widespread panic. (For some reason, people thought the world was going to run out of toilet paper, which became a self-fulfilling prophecy.) Of course some people were in denial, especially some politicians, though that had everything to do with politics and nothing to do with math and science.

In any case, the public spread of infectious diseases is a relatively well understood phenomenon. I don't see the analogy with some new tech, although the public spread of hype is also a relatively well understood phenomenon.

Nevermark - 4 days ago

I don't think I have ever seen a page on HN where so many people missed the main point.

The phenomenon of people having trouble understanding the implications of exponential progress is really well known. Well known, I think, by many people here.

And yet an alarming number of comments here interpret small pauses as serious trend breakers. False assumptions that we are anywhere near the limits of computing power relative to fundamental physics limits. Etc.

Recent progress, which is unprecedented in speed looking backward, is dismissed because people have acclimatized to change so quickly.

The title of the article "Failing to Understand the Exponential, Again" is far more apt than I could have imagined, on HN.

See my other comments here for specific arguments, and lots of comments from others for examples of skepticism about a strong inevitability.

The "information revolution" started the first time design information was separated from the thing it could construct. I.e. the first DNA or perhaps RNA life. And it has unrelentingly accelerated from there for over 4.5 billion years.

The known physics limits of computation per gram are astronomical. We are nowhere near any hard limit. And that is before any speculation of what could be done with the components of spacetime fragments we don't understand yet. Or physics beyond that.

The information revolution has hardly begun.

With all humor, this was the last place I expected people to not understand how differently information technology progresses vs. any other kind, or to revert to linear arguments in an exponentially relevant situation.

If there is any S-curve for information technology in general, it won't be apparent until long after humans are a distant memory.

xg15 - 4 days ago

OP failing to understand S-curves again...

I think the first comment on the article put it best: With COVID, researchers could be certain that exponential growth was taking place because they knew the underlying mechanisms of the growth. The virus was self-replicating, so the more people were already infected, the faster would new infections happen.

(Even this dynamic would only go on for a certain time and eventual slow down, forming an S-curve, when the virus could not find any more vulnerable persons to continue the rate of spread. The critical question was of course if this would happen because everyone was vaccinated or isolated enough to prevent infection - or because everyone was already infected or dead)

With AI, there is no such underlying mechanism. There is the dream of the "self-improving AI" where either humans can make use of the current-generation AI to develop the next-generation AI in a fraction of the time - or where the AI simply creates the next generation on its own.

If this dream were reality, it could be genuine exponential growth, but from all I know, it isn't. Coding agents speed up a number of bespoke programming tasks, but they do not exponentially speed up development of new AI models. Yes, we can now quickly generate large corpora of synthetic training data and use them for distillation. We couldn't do that before - but a large part of the training data discussion is about the observation that synthetic data can not replace real data, so data collection remains a bottleneck.

There is one point where a feedback loop does happen, and that is the hype curve: initial models produced extremely impressive results compared to everything we had before - this caused enormous hype and unlocked investments that allowed more resources for the development of the next model - which then delivered even better results. But it's obvious that this kind of feedback loop will end when no more additional capital is available and diminishing returns set in.

Then we will once again be in the upper part of the S-curve.

joshdavham - 4 days ago

> - Models will be able to autonomously work for full days (8 working hours) by mid-2026.

> - At least one model will match the performance of human experts across many industries before the end of 2026.

> - By the end of 2027, models will frequently outperform experts on many tasks.

I've seen a lot of people make predictions like this, and it will be interesting to see how they turn out. But my question is: what should happen to a person's credibility if their prediction turns out to be wrong? Should the person lose credibility for future predictions, such that we no longer take them seriously? Or is that way too harsh? Should there be reputational consequences for making bad predictions? I guess this is more of a general question, not strictly AI-related.

gettingoverit - 2 days ago

On top of other criticism here, I'd like to add that the article optimistically assumes that actors are completely honest with their benchmarks when billions of dollars and national security are at stake.

I'm only an "expert" in computer science and software engineering, and can say that:

- none of the widely available LLMs can produce answers at the level of a first-year CS student;

- students using LLMs can easily be distinguished by being wrong in all the ways a human would otherwise never be.

So to me it's not really a question of whether the CS-related benchmarks are false; it's a question of how exactly this BS even flew.

Obviously LLMs show a similar lack of performance in other disciplines, but I can't call myself an "expert" there, and someone might argue I tend to use the wrong prompts.

Until we see a website where we can put in an intermediate problem and get a working solution, "benchmarks show that our AI solves problems at gold-medalist level" will remain obvious BS.

monkeyelite - 3 days ago

All of those never-ending exponential graphs about Covid were wrong though.

1970-01-01 - 4 days ago

Another 'number go up' analyst. Yes, models are objectively better at tasks. Please include the fact that hundreds of billions of dollars are being poured into making them better. You could even call it a technology race. Once the money avalanche runs its course, I and many others expect 'the exponential' to be followed by an implosion or correction in growth. Data and training are not what LLMs crave. Piles of cash are what LLMs crave.

olooney - 4 days ago

Good article, the METR metric is very interesting. See also Leopold Aschenbrenner's work in the same vein:

https://situational-awareness.ai/from-gpt-4-to-agi/

IMO this approach ultimately asks the wrong question. Every exponential trend in history has eventually flattened out. Every. Single. One. Two rabbits would create a population with a mass greater than the Earth's within a couple of years if the trend continued indefinitely. The left-hand side of a sigmoid curve looks exactly like exponential growth to the naked eye... until it nears the inflection point at t=0. The two curves can't be distinguished when you only have noisy data from t<0.
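
This is easy to demonstrate on synthetic data (a toy sketch, not real benchmark numbers): generate noisy samples from the left side of a sigmoid, fit both models, and the fits come out equally good.

    import numpy as np
    from scipy.optimize import curve_fit

    def expo(t, a, r):        # pure exponential candidate
        return a * np.exp(r * t)

    def sigmoid(t, k, r):     # logistic candidate with unknown ceiling k
        return k / (1 + np.exp(-r * t))

    rng = np.random.default_rng(0)
    t = np.linspace(-8, -3, 40)  # noisy samples, all taken before the inflection
    y = sigmoid(t, 1.0, 1.0) + rng.normal(0, 0.005, t.size)

    for model in (expo, sigmoid):
        p, _ = curve_fit(model, t, y, p0=(1.0, 1.0), maxfev=10000)
        rmse = float(np.sqrt(np.mean((model(t, *p) - y) ** 2)))
        print(model.__name__, round(rmse, 4))
    # both RMSEs sit at the noise floor (~0.005): this stretch of data cannot
    # distinguish the models, let alone pin down the ceiling k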

A better question is, "When will the curve flatten out?", and that can only be addressed by looking outside the dataset for the constraints that will eventually make growth impossible. For example, for Moore's law, we could examine the quantum limits on how small a single transistor can be. You have to analyze the context, not just do the line-fitting exercise.

The only really interesting question in the long term is whether it will level off near, below, or above human intelligence. It doesn't matter much if that takes five years or fifty. Simply looking at lines that are currently going up and extending them off the right side of the page doesn't get us any closer to answering that. We have to look at the fundamental constraints of our understanding and algorithms, independent of hardware. For example, hallucinations may be unsolvable with the current approach and require a genuine paradigm shift to solve, and paradigm shifts don't show up on trend lines, more or less by definition.

whatever1 - 4 days ago

There are no exponentials in nature. Everything is finite.

danlitt - 4 days ago

I am constantly astonished that articles like this even pass the smell test. It is not rational to predict exponential growth just because you've seen exponential growth before! Incidentally, that is not what people did during COVID: they predicted exponential growth for reasons. Specific, articulable reasons that consisted of more than just "look, line go up. line go up more?".

Incidentally, the benchmarks quoted are extremely dubious. They do not even really make sense. "The length of tasks AI can do is doubling every 7 months." Seriously, what does that mean? If the AI suddenly took double the time to answer the same question, that would not be progress. Indeed, that isn't what they measured; they just... picked some times at random? You might counter that these are actually human completion times, but then why are we comparing such distinct and unrelated tasks as "count words in a passage" (trivial, any child can do it) and "train adversarially robust image model" (an expert-level task that could take anywhere from an hour to never completing)?

Honestly, the most hilarious line in the article is probably this one:

> You might object that this plot looks like it might be levelling off, but this is probably mostly an artefact of GPT-5 being very consumer-focused.

This is a plot with three points in it! You might as well be looking at tea leaves!

mwkaufma - 4 days ago

Pure Cope from a partner at Anthropic. However, I _do_ agree AI is comparable to COVID, but not in the way our author intends.

BenFranklin100 - 4 days ago

This guy isn't even wrong. Sure, these models are getting faster, but they are barely getting better at actual reasoning, if at all. Who cares if a model can give me a bullshit answer in five minutes instead of ten? It's still bullshit.

bawolff - 4 days ago

Measuring "how long" an AI can work for seems bizarre to me.

It's a computer program. What does it even mean that soon it "will be able to work 8-hour days"?

phyzome - 4 days ago

"Failing to Understand the Sigmoid, Again"


ath3nd - 4 days ago

Failing to acknowledge we are in a bigger and more dangerous bubble, again.

If AI is so great, why have all the curl HackerOne submissions been rejected? Slop is not a substitute for skill.

izacus - 4 days ago

Ah, an employee of an AI company is telling us that the technology he's working on, and is directly financially interested in hyping, will... grow forever and be amazing and exponential and take over the world. And everyone who doesn't believe this AI-company employee hyping AI is WRONG about the basics of math.

I absolutely would NOT ever expect such a blog post.

/s.