Related papers: Stability as a Liability: Systematic Breakdown of Linguistic Structure in LLMs

Stability as a Liability: Systematic Breakdown of Linguistic Structure in LLMs

URL: http://arxiv.org/abs/2601.18588v1
Date: Mon, 26 Jan 2026 15:34:50 GMT
Title: Stability as a Liability: Systematic Breakdown of Linguistic Structure in LLMs
Authors: Xianzhe Meng, Qiangsheng Zeng, Ling Luo, Qinghan Yang, Jiarui Hao, Wenbo Wu, Qinyu Wang, Rui Yin, Lin Qi, Renzhi Lu
Abstract summary: We show that stable parameter trajectories lead stationary solutions to minimize the forward KL divergence to the empirical distribution. We empirically validate this effect using a controlled feedback-based training framework. It indicates that optimization stability and generative expressivity are not inherently aligned, and that stability alone is an insufficient indicator of generative quality.
Score: 5.96875296117642
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Training stability is typically regarded as a prerequisite for reliable optimization in large language models. In this work, we analyze how stabilizing training dynamics affects the induced generation distribution. We show that under standard maximum likelihood training, stable parameter trajectories lead stationary solutions to approximately minimize the forward KL divergence to the empirical distribution, while implicitly reducing generative entropy. As a consequence, the learned model can concentrate probability mass on a limited subset of empirical modes, exhibiting systematic degeneration despite smooth loss convergence. We empirically validate this effect using a controlled feedback-based training framework that stabilizes internal generation statistics, observing consistent low-entropy outputs and repetitive behavior across architectures and random seeds. It indicates that optimization stability and generative expressivity are not inherently aligned, and that stability alone is an insufficient indicator of generative quality.
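The link between maximum likelihood training and the forward KL divergence mentioned in the abstract rests on a standard identity: the cross-entropy of a model q against the empirical distribution p equals KL(p || q) plus the entropy of p, which does not depend on the model. The sketch below checks that identity numerically on a toy three-outcome distribution. It is an illustration only and does not reproduce the paper's feedback-based framework or its entropy-reduction result; the names `p_emp`, `q_broad`, and `q_peaked` and their values are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy in nats, skipping zero-probability outcomes."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Expected negative log-likelihood of q under p (the MLE objective)."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def forward_kl(p, q):
    """Forward KL divergence KL(p || q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical toy distributions over three outcomes (assumed values).
p_emp = [0.5, 0.3, 0.2]       # stand-in for the empirical distribution
q_broad = [0.4, 0.35, 0.25]   # a model that spreads its probability mass
q_peaked = [0.9, 0.05, 0.05]  # a model concentrated on a single mode

for name, q in (("broad", q_broad), ("peaked", q_peaked)):
    # The gap between cross-entropy and forward KL is H(p_emp) for any q.
    gap = cross_entropy(p_emp, q) - forward_kl(p_emp, q)
    print(f"{name}: KL={forward_kl(p_emp, q):.4f} "
          f"gap={gap:.4f} model entropy={entropy(q):.4f}")
```

Because the gap is the same constant for every model, lowering the negative log-likelihood and lowering the forward KL divergence are the same optimization. The model's own entropy is reported separately, since it is the quantity the abstract says is implicitly reduced.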

Related papers

Stabilizing Policy Optimization via Logits Convexity [59.242732612484474]
We show that the convexity of the supervised fine-tuning loss with respect to model logits plays a key role in enabling stable training. Motivated by this observation, we propose Logits Convex Optimization (LCO), a simple yet effective policy optimization framework.
arXiv Detail & Related papers (2026-03-01T07:40:12Z)
Not All Preferences Are Created Equal: Stability-Aware and Gradient-Efficient Alignment for Reasoning Models [52.48582333951919]
We propose a dynamic framework designed to enhance alignment reliability by maximizing the Signal-to-Noise Ratio of policy updates. SAGE (Stability-Aware Gradient Efficiency) integrates a coarse-grained curriculum mechanism that refreshes candidate pools based on model competence. Experiments on multiple mathematical reasoning benchmarks demonstrate that SAGE significantly accelerates convergence and outperforms static baselines.
arXiv Detail & Related papers (2026-02-01T12:56:10Z)
On Forgetting and Stability of Score-based Generative models [6.259598237089842]
Understanding the stability and long-time behavior of generative models is a fundamental problem in modern machine learning. This paper provides quantitative bounds on the sampling error of score-based generative models by leveraging stability and forgetting properties of the Markov chain associated with the reverse-time dynamics.
arXiv Detail & Related papers (2026-01-29T15:37:50Z)
Why Smooth Stability Assumptions Fail for ReLU Learning [0.0]
We show that no uniform smoothness-based stability proxy can hold globally for ReLU networks. We give a concrete counterexample demonstrating the failure of classical stability bounds.
arXiv Detail & Related papers (2025-12-26T15:17:25Z)
LILAD: Learning In-context Lyapunov-stable Adaptive Dynamics Models [4.66260462241022]
LILAD is a novel framework for system identification that jointly guarantees stability and adaptability. We evaluate LILAD on benchmark autonomous systems and demonstrate that it outperforms adaptive, robust, and non-adaptive baselines in predictive accuracy.
arXiv Detail & Related papers (2025-11-26T19:20:49Z)
MaP: A Unified Framework for Reliable Evaluation of Pre-training Dynamics [72.00014675808228]
Instability in the evaluation of Large Language Models obscures true learning dynamics. We introduce MaP, a framework that integrates Merging and the Pass@k metric. Experiments show that MaP yields significantly smoother performance curves, reduces inter-run variance, and ensures more consistent rankings.
arXiv Detail & Related papers (2025-10-10T11:40:27Z)
Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning [77.92320830700797]
Reinforcement Learning has played a central role in enabling reasoning capabilities of Large Language Models. We propose a tractable computational framework that tracks and leverages curvature information during policy updates. The algorithm, Curvature-Aware Policy Optimization (CAPO), identifies samples that contribute to unstable updates and masks them out.
arXiv Detail & Related papers (2025-10-01T12:29:32Z)
Stability Evaluation via Distributional Perturbation Analysis [28.379994938809133]
We propose a stability evaluation criterion based on distributional perturbations. Our stability evaluation criterion can address both data corruptions and sub-population shifts. Empirically, we validate the practical utility of our stability evaluation criterion across a host of real-world applications.
arXiv Detail & Related papers (2024-05-06T06:47:14Z)
Minimax Optimal Estimation of Stability Under Distribution Shift [8.893526921869137]
We analyze the stability of a system under distribution shift. The stability measure is defined in terms of a more intuitive quantity: the level of acceptable performance degradation. Our characterization of the minimax convergence rate shows that evaluating stability against large performance degradation incurs a statistical cost.
arXiv Detail & Related papers (2022-12-13T02:40:30Z)
Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design. We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z)
Training Generative Adversarial Networks by Solving Ordinary Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training. From this perspective, we hypothesise that instabilities in training GANs arise from the integration error. We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
arXiv Detail & Related papers (2020-10-28T15:23:49Z)
Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates. This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting. To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions.
arXiv Detail & Related papers (2020-06-15T06:30:19Z)
Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Empirical optimization is central to modern machine learning, but its role in its success is still unclear. We show that it commonly arises in parameters of discrete multiplicative noise due to variance. A detailed analysis is conducted in which we describe key factors, including recent step size, and data, all exhibit similar results on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.