On the Limits of Self-Improving in LLMs and Why AGI, ASI and the Singularity Are Not Near Without Symbolic Model Synthesis
- URL: http://arxiv.org/abs/2601.05280v1
- Date: Mon, 05 Jan 2026 19:50:49 GMT
- Title: On the Limits of Self-Improving in LLMs and Why AGI, ASI and the Singularity Are Not Near Without Symbolic Model Synthesis
- Authors: Hector Zenil
- Abstract summary: We formalise self-training in Large Language Models (LLMs) and Generative AI as a discrete-time dynamical system. We derive two fundamental failure modes: (1) Entropy Decay, where finite sampling effects cause a monotonic loss of distributional diversity (mode collapse), and (2) Variance Amplification, where the loss of external grounding causes the model's representation of truth to drift as a random walk.
- Score: 0.01269104766024433
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We formalise recursive self-training in Large Language Models (LLMs) and Generative AI as a discrete-time dynamical system and prove that, as training data become increasingly self-generated ($\alpha_t \to 0$), the system inevitably undergoes degenerative dynamics. We derive two fundamental failure modes: (1) Entropy Decay, where finite sampling effects cause a monotonic loss of distributional diversity (mode collapse), and (2) Variance Amplification, where the loss of external grounding causes the model's representation of truth to drift as a random walk, bounded only by the support diameter. We show these behaviours are not contingent on architecture but are consequences of distributional learning on finite samples. We further argue that Reinforcement Learning with imperfect verifiers suffers a similar semantic collapse. To overcome these limits, we propose a path involving symbolic regression and program synthesis guided by Algorithmic Probability. The Coding Theorem Method (CTM) allows for identifying generative mechanisms rather than mere correlations, escaping the data-processing inequality that binds standard statistical learning. We conclude that while purely distributional learning leads to model collapse, hybrid neurosymbolic approaches offer a coherent framework for sustained self-improvement.
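The two failure modes claimed in the abstract can be illustrated with a toy simulation. This is a minimal sketch rather than the paper's formal model: the parameters (`K`, `N`, `T`, `sigma`) and the pure self-training assumption ($\alpha_t = 0$, i.e. no external data at any step) are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Failure mode 1: entropy decay under recursive self-training -----------
# The "model" is a categorical distribution p over K tokens. Each generation
# it emits N samples of itself and is refit by maximum likelihood on those
# samples alone (alpha_t = 0: no external data ever re-enters).
def entropy(p):
    nz = p[p > 0]
    return float(-np.sum(nz * np.log(nz)))

K, N, T = 50, 200, 300
p = np.full(K, 1.0 / K)              # start at maximal entropy
entropies = [entropy(p)]
for _ in range(T):
    counts = rng.multinomial(N, p)   # finite sample from the current model
    p = counts / N                   # refit on self-generated data only
    entropies.append(entropy(p))

# Once a token draws zero counts it can never reappear, so the support can
# only shrink: diversity is lost irreversibly (mode collapse).
print("entropy:", round(entropies[0], 3), "->", round(entropies[-1], 3))
print("surviving tokens:", int(np.count_nonzero(p)), "of", K)

# --- Failure mode 2: variance amplification without grounding --------------
# The model tracks a scalar "truth" mu. With no external anchor, each refit
# re-centres on the mean of its own noisy generations, so the estimate
# performs a random walk with step std sigma / sqrt(N).
sigma = 1.0
mu, path = 0.0, [0.0]
for _ in range(T):
    batch = rng.normal(mu, sigma, size=N)  # self-generated training data
    mu = float(batch.mean())               # refit on own outputs
    path.append(mu)

print("drift after", T, "generations:", round(path[-1], 3))
print("empirical step std:", round(float(np.std(np.diff(path))), 4),
      "vs theory:", round(sigma / np.sqrt(N), 4))
```

Both effects appear without any architectural assumption, matching the abstract's point that they are properties of distributional learning on finite samples; mixing in even a small fraction of externally grounded data each generation ($\alpha_t > 0$) damps both in this toy setting.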
Related papers
- SIGMA: Scalable Spectral Insights for LLM Collapse [51.863164847253366]
We introduce SIGMA (Spectral Inequalities for Gram Matrix Analysis), a unified framework for model collapse. By deriving deterministic bounds on the Gram matrix's spectrum, SIGMA provides a mathematically grounded metric to track the contraction of the representation space. We demonstrate that SIGMA effectively captures the transition towards collapsed states, offering theoretical insights into the mechanics of collapse.
arXiv Detail & Related papers (2026-01-06T19:47:11Z)
- Disordered Dynamics in High Dimensions: Connections to Random Matrices and Machine Learning [52.26396748560348]
We provide an overview of high dimensional dynamical systems driven by random matrices. We focus on applications to simple models of learning and generalization in machine learning theory.
arXiv Detail & Related papers (2026-01-03T00:12:32Z)
- Towards Unsupervised Causal Representation Learning via Latent Additive Noise Model Causal Autoencoders [1.9732490977700972]
Unsupervised representation learning seeks to recover latent generative factors. Disentangling causal variables from observational data is impossible without supervision. We propose the Latent Additive Noise Model Causal Autoencoder (LANCA) as a strong inductive bias for unsupervised discovery.
arXiv Detail & Related papers (2025-12-15T10:52:30Z)
- Provable Benefit of Curriculum in Transformer Tree-Reasoning Post-Training [76.12556589212666]
We show that curriculum post-training avoids the exponential complexity bottleneck. Under outcome-only reward signals, reinforcement learning finetuning achieves high accuracy with polynomial sample complexity. We establish guarantees for test-time scaling, where curriculum-aware querying reduces both reward oracle calls and sampling cost from exponential to polynomial order.
arXiv Detail & Related papers (2025-11-10T18:29:54Z)
- Drift No More? Context Equilibria in Multi-Turn LLM Interactions [58.69551510148673]
Context drift is the gradual divergence of a model's outputs from goal-consistent behavior across turns. Unlike single-turn errors, drift unfolds temporally and is poorly captured by static evaluation metrics. We show that multi-turn drift can be understood as a controllable equilibrium phenomenon rather than as inevitable decay.
arXiv Detail & Related papers (2025-10-09T04:48:49Z)
- Ascent Fails to Forget [45.75497227694833]
We show that gradient ascent-based unconstrained optimization methods frequently fail to perform machine unlearning. We attribute this phenomenon to the inherent statistical dependence between the forget and retain data sets. Our findings highlight that the presence of such statistical dependencies, even when manifest only as correlations, can be sufficient for ascent-based unlearning to fail.
arXiv Detail & Related papers (2025-09-30T15:48:49Z)
- Information-Theoretic Bounds and Task-Centric Learning Complexity for Real-World Dynamic Nonlinear Systems [0.6875312133832079]
Dynamic nonlinear systems exhibit distortions arising from coupled static and dynamic effects. This paper presents a theoretical framework grounded in structured decomposition, variance analysis, and task-centric complexity bounds.
arXiv Detail & Related papers (2025-09-08T12:08:02Z)
- Knowledge Collapse in LLMs: When Fluency Survives but Facts Fail under Recursive Synthetic Training [2.094557609248011]
Large language models increasingly rely on synthetic data due to the scarcity of human-written content. Recursive training on model-generated outputs leads to model collapse, a degenerative process threatening factual reliability.
arXiv Detail & Related papers (2025-09-05T04:29:15Z)
- The Theory of the Unique Latent Pattern: A Formal Epistemic Framework for Structural Singularity in Complex Systems [2.44755919161855]
This paper introduces the Theory of the Unique Latent Pattern (ULP), a formal framework that redefines the origin of apparent complexity in dynamic systems. Rather than attributing unpredictability to intrinsic randomness or emergent nonlinearity, ULP asserts that every analyzable system is governed by a structurally unique, deterministic generative mechanism.
arXiv Detail & Related papers (2025-05-24T19:52:28Z)
- A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops [55.07063067759609]
High-quality data is essential for training large generative models, yet the vast reservoir of real data available online has become nearly depleted. Models increasingly generate their own data for further training, forming Self-consuming Training Loops (STLs). Some models degrade or even collapse, while others successfully avoid these failures, leaving a significant gap in theoretical understanding.
arXiv Detail & Related papers (2025-02-26T06:18:13Z)
- Disentangling Observed Causal Effects from Latent Confounders using Method of Moments [67.27068846108047]
We provide guarantees on identifiability and learnability under mild assumptions.
We develop efficient algorithms based on coupled tensor decomposition with linear constraints to obtain scalable and guaranteed solutions.
arXiv Detail & Related papers (2021-01-17T07:48:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.