The wall confronting large language models
- URL: http://arxiv.org/abs/2507.19703v2
- Date: Wed, 30 Jul 2025 07:58:56 GMT
- Title: The wall confronting large language models
- Authors: Peter V. Coveney, Sauro Succi
- Abstract summary: We show that the scaling laws which determine the performance of large language models severely limit their ability to improve the uncertainty of their predictions. We argue that the very mechanism which fuels much of the learning power of LLMs might well be at the roots of their propensity to produce error pileup.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We show that the scaling laws which determine the performance of large language models (LLMs) severely limit their ability to improve the uncertainty of their predictions. As a result, raising their reliability to meet the standards of scientific inquiry is intractable by any reasonable measure. We argue that the very mechanism which fuels much of the learning power of LLMs, namely the ability to generate non-Gaussian output distributions from Gaussian input ones, might well be at the roots of their propensity to produce error pileup, ensuing information catastrophes and degenerative AI behaviour. This tension between learning and accuracy is a likely candidate mechanism underlying the observed low values of the scaling exponents. It is substantially compounded by the deluge of spurious correlations pointed out by Calude and Longo, which rapidly increase in any data set merely as a function of its size, regardless of its nature. The fact that a degenerative AI pathway is a very probable feature of the LLM landscape does not mean that it must inevitably arise in all future AI research. Its avoidance, which we also discuss in this paper, necessitates putting a much higher premium on insight and understanding of the structural characteristics of the problems being investigated.
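To make the abstract's central quantitative point concrete, here is a minimal worked illustration assuming the standard power-law form of neural scaling laws; the symbols and the numerical example are illustrative, not taken from the paper.

```latex
% Illustrative power-law scaling law: prediction error versus data/model scale N,
% with a small exponent alpha, as typically reported empirically.
\[
  \varepsilon(N) \;\approx\; A\,N^{-\alpha}, \qquad 0 < \alpha \ll 1 .
\]
% Reducing the error by a factor k then requires scaling N by
\[
  \frac{N'}{N} \;=\; k^{1/\alpha} .
\]
% Example: for alpha = 0.1, halving the error (k = 2) needs N'/N = 2^{10} = 1024,
% i.e. roughly three orders of magnitude more data or compute, which is why
% reaching scientific-grade reliability by scaling alone becomes intractable.
```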
Related papers
- Estimating Interventional Distributions with Uncertain Causal Graphs through Meta-Learning [26.3914014514629]
In scientific domains -- from biology to the social sciences -- many questions boil down to: *What effect will we observe if we intervene on a particular variable?* We propose using meta-learning to create an end-to-end model: the Model-Averaged Causal Estimation Transformer Neural Process (MACE-TNP). Our work establishes meta-learning as a flexible and scalable paradigm for approximating complex Bayesian causal inference.
arXiv Detail & Related papers (2025-07-07T22:48:32Z) - Do Larger Language Models Imply Better Generalization? A Pretraining Scaling Law for Implicit Reasoning [89.17086632436363]
We introduce a synthetic multi-hop reasoning environment designed to replicate the structure and distribution of real-world large-scale knowledge graphs. Our reasoning task involves completing missing edges in the graph, which requires advanced multi-hop reasoning and mimics real-world reasoning scenarios. To predict the optimal model size for a specific knowledge graph, we find an empirical scaling law that linearly maps the knowledge graph search entropy to the optimal model size.
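Read literally, the reported relation can be sketched as a simple linear law; the coefficients below are placeholders, not values from the paper.

```latex
% Hypothetical form of the empirical relation between knowledge-graph
% search entropy H and optimal model size N_opt (a, b are fitted constants).
\[
  N_{\mathrm{opt}} \;\approx\; a\,H_{\mathrm{search}} + b .
\]
```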
arXiv Detail & Related papers (2025-04-04T17:57:22Z) - Emergent Abilities in Large Language Models: A Survey [9.50669909278749]
Large Language Models (LLMs) are leading a new technological revolution as one of the most promising research streams toward artificial general intelligence. The scaling of these models has been linked to various so-called emergent abilities that were previously unobserved. These abilities range from advanced reasoning and in-context learning to coding and problem-solving. Despite their transformative potential, emergent abilities remain poorly understood, leading to misconceptions about their definition, nature, predictability, and implications.
arXiv Detail & Related papers (2025-02-28T01:20:01Z) - Mechanism learning: Reverse causal inference in the presence of multiple unknown confounding through front-door causal bootstrapping [0.8901073744693314]
A major limitation of machine learning (ML) prediction models is that they recover associational, rather than causal, predictive relationships between variables.
This paper proposes mechanism learning, a simple method which uses front-door causal bootstrapping to deconfound observational data.
We test our method on fully synthetic, semi-synthetic and real-world datasets, demonstrating that it can discover reliable, unbiased, causal ML predictors.
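For orientation, the classical front-door adjustment that "front-door causal bootstrapping" builds on is sketched below; this is the textbook identity, not the paper's specific bootstrapping procedure.

```latex
% Front-door adjustment (Pearl): with mediator M intercepting every directed path
% from X to Y, no unblocked back-door path from X to M, and all back-door paths
% from M to Y blocked by X,
\[
  P\!\left(y \mid \mathrm{do}(x)\right)
  \;=\; \sum_{m} P(m \mid x) \sum_{x'} P(y \mid m, x')\, P(x') .
\]
```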
arXiv Detail & Related papers (2024-10-26T03:34:55Z) - Language Model Cascades: Token-level uncertainty and beyond [65.38515344964647]
Recent advances in language models (LMs) have led to significant improvements in quality on complex NLP tasks.
Cascading offers a simple strategy to achieve more favorable cost-quality tradeoffs.
We show that incorporating token-level uncertainty through learned post-hoc deferral rules can significantly outperform simple aggregation strategies.
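As a rough illustration of a token-level deferral rule in a two-model cascade, here is a hedged sketch; the quantile-based aggregation, the threshold, and all function names are illustrative assumptions, not the learned rule from the paper.

```python
from typing import Callable, List, Tuple

def token_uncertainty(token_logprobs: List[float], quantile: float = 0.9) -> float:
    """Aggregate per-token surprisal (-log p); a high quantile focuses on the
    least confident tokens rather than averaging them away."""
    if not token_logprobs:
        return float("inf")
    surprisals = sorted(-lp for lp in token_logprobs)
    idx = min(len(surprisals) - 1, int(quantile * len(surprisals)))
    return surprisals[idx]

def cascade(prompt: str,
            small_model: Callable[[str], Tuple[str, List[float]]],
            large_model: Callable[[str], str],
            threshold: float = 2.5) -> str:
    """Answer with the small model unless its token-level uncertainty is high,
    in which case defer to the larger (more expensive) model."""
    answer, token_logprobs = small_model(prompt)   # (text, per-token log-probs)
    if token_uncertainty(token_logprobs) > threshold:
        return large_model(prompt)                 # defer on high uncertainty
    return answer
```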
arXiv Detail & Related papers (2024-04-15T21:02:48Z) - Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification [116.77055746066375]
Large language models (LLMs) are notorious for hallucinating, i.e., producing erroneous claims in their output.
We propose a novel fact-checking and hallucination detection pipeline based on token-level uncertainty quantification.
arXiv Detail & Related papers (2024-03-07T17:44:17Z) - Mitigating Prior Errors in Causal Structure Learning: Towards LLM driven Prior Knowledge [17.634793921251777]
We aim to tackle erroneous prior causal statements from Large Language Models (LLMs).
As a pioneering attempt, we propose a Bayesian network (BN) learning strategy resilient to prior errors without the need for human intervention.
Specifically, we highlight its substantial ability to resist order-reversed errors while maintaining the majority of correct prior knowledge.
arXiv Detail & Related papers (2023-06-12T11:24:48Z) - Principled Knowledge Extrapolation with GANs [92.62635018136476]
We study counterfactual synthesis from a new perspective of knowledge extrapolation.
We show that an adversarial game with a closed-form discriminator can be used to address the knowledge extrapolation problem.
Our method enjoys both elegant theoretical guarantees and superior performance in many scenarios.
arXiv Detail & Related papers (2022-05-21T08:39:42Z) - Evaluating Distributional Distortion in Neural Language Modeling [81.83408583979745]
A heavy tail of rare events accounts for a significant portion of the total probability mass of distributions in language.
Standard language modeling metrics such as perplexity quantify the performance of language models (LMs) in aggregate.
We develop a controlled evaluation scheme which uses generative models trained on natural data as artificial languages.
arXiv Detail & Related papers (2022-03-24T01:09:46Z) - Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning [76.00395335702572]
A central goal for AI and causality is the joint discovery of abstract representations and causal structure.
Existing environments for studying causal induction are poorly suited for this objective because they have complicated task-specific causal graphs.
In this work, our goal is to facilitate research in learning representations of high-level variables as well as causal structures among them.
arXiv Detail & Related papers (2021-07-02T05:44:56Z)