Training instability in deep learning follows low-dimensional dynamical principles
- URL: http://arxiv.org/abs/2601.13160v1
- Date: Mon, 19 Jan 2026 15:37:45 GMT
- Title: Training instability in deep learning follows low-dimensional dynamical principles
- Authors: Zhipeng Zhang, Zhenjie Yao, Kai Li, Lei Yang
- Abstract summary: Training unfolds as a high-dimensional dynamical system in which small perturbations to optimization, data, parameters, or learning signals can induce abrupt and irreversible collapse. We propose a unified dynamical perspective that characterizes training stability as an intrinsic property of learning systems.
- Score: 24.97566911521709
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning systems achieve remarkable empirical performance, yet the stability of the training process itself remains poorly understood. Training unfolds as a high-dimensional dynamical system in which small perturbations to optimization, data, parameters, or learning signals can induce abrupt and irreversible collapse, undermining reproducibility and scalability. We propose a unified dynamical perspective that characterizes training stability as an intrinsic property of learning systems, organized along four interacting dimensions: optimization, environmental/data, parametric, and learning-signal stability. We operationalize this perspective through controlled perturbation auditing of training trajectories, probing how learning dynamics respond to structured disturbances without modifying learning algorithms. Across reinforcement learning and large language model training, we identify three recurring regularities: high final performance is frequently decoupled from training stability; controlled stochasticity consistently buffers learning dynamics across paradigms; and deviations in low-dimensional latent meta-states systematically precede observable performance collapse. Together, these findings establish training stability as a measurable and comparable dynamical property of learning systems, providing a descriptive foundation for studying learning dynamics beyond final performance outcomes.
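As a rough illustration of what "controlled perturbation auditing" could look like in practice (a minimal sketch on a toy quadratic objective, not the authors' actual protocol; all names, scales, and the one-off parameter kick are illustrative assumptions), one can compare a baseline training trajectory against an identical one that receives a single structured disturbance:

```python
import numpy as np

def train_step(w, lr=0.1):
    """One gradient-descent step on a toy quadratic loss L(w) = ||w||^2 / 2."""
    return w - lr * w

def audit_perturbation(w0, steps=50, perturb_at=10, scale=0.5, seed=0):
    """Compare a baseline trajectory against one that receives a single
    structured parameter kick, without modifying the learning algorithm.
    Returns the per-step distance between the two trajectories."""
    rng = np.random.default_rng(seed)
    base = np.array(w0, dtype=float)
    pert = base.copy()
    divergence = []
    for t in range(steps):
        base = train_step(base)
        pert = train_step(pert)
        if t == perturb_at:  # inject the structured disturbance once
            pert = pert + scale * rng.standard_normal(pert.shape)
        divergence.append(float(np.linalg.norm(base - pert)))
    return divergence

div = audit_perturbation([1.0, -2.0])
# On this contractive toy problem the injected disturbance decays; in an
# unstable regime the same audit would show the gap growing instead.
```

The diagnostic quantity is the trajectory gap as a function of time: whether it shrinks, plateaus, or grows is a property of the learning dynamics, not of final performance.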
Related papers
- Degradation of Feature Space in Continual Learning [2.322400467239964]
We investigate whether promoting feature-space isotropy can enhance representation quality in continual learning. We find that isotropic regularization fails to improve, and can in fact degrade, model accuracy in continual settings.
arXiv Detail & Related papers (2026-02-06T10:26:34Z)
- Analytic and Variational Stability of Deep Learning Systems [0.0]
We show that uniform boundedness of stability signatures is equivalent to the existence of a Lyapunov-type energy that dissipates along the learning flow. In smooth regimes, the framework yields explicit stability exponents linking spectral norms, activation regularity, step sizes, and learning rates to contractivity of the learning dynamics. The theory extends to non-smooth learning systems, including ReLU networks, proximal and projected updates, and subgradient flows.
arXiv Detail & Related papers (2025-12-24T14:43:59Z)
- Rethinking the Role of Dynamic Sparse Training for Scalable Deep Reinforcement Learning [58.533203990515034]
Scaling neural networks has driven breakthrough advances in machine learning, yet this paradigm fails in deep reinforcement learning (DRL). We show that dynamic sparse training strategies provide module-specific benefits that complement the primary scalability foundation established by architectural improvements. We finally distill these insights into Module-Specific Training (MST), a practical framework that exploits the benefits of architectural improvements and demonstrates substantial scalability gains across diverse RL algorithms without algorithmic modifications.
arXiv Detail & Related papers (2025-10-14T03:03:08Z)
- Adaptive Variance-Penalized Continual Learning with Fisher Regularization [0.0]
This work presents a novel continual learning framework that integrates Fisher-weighted asymmetric regularization of parameter variances. Our method dynamically modulates regularization intensity according to parameter uncertainty, achieving enhanced stability and performance.
arXiv Detail & Related papers (2025-08-15T21:49:28Z)
- The Importance of Being Lazy: Scaling Limits of Continual Learning [60.97756735877614]
We show that increasing model width is only beneficial when it reduces the amount of feature learning, yielding more laziness. We study the intricate relationship between feature learning, task non-stationarity, and forgetting, finding that high feature learning is only beneficial with highly similar tasks.
arXiv Detail & Related papers (2025-06-20T10:12:38Z)
- Dynamic Manipulation of Deformable Objects in 3D: Simulation, Benchmark and Learning Strategy [88.8665000676562]
Prior methods often simplify the problem to low-speed or 2D settings, limiting their applicability to real-world 3D tasks. To mitigate data scarcity, we introduce a novel simulation framework and benchmark grounded in reduced-order dynamics. We propose Dynamics Informed Diffusion Policy (DIDP), a framework that integrates imitation pretraining with physics-informed test-time adaptation.
arXiv Detail & Related papers (2025-05-23T03:28:25Z)
- Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity [51.40558987254471]
Real-world applications of reinforcement learning often involve environments where agents operate on complex, high-dimensional observations.
This paper addresses the question of reinforcement learning under general latent dynamics from a statistical and algorithmic perspective.
arXiv Detail & Related papers (2024-10-23T14:22:49Z)
- Data-Driven Control with Inherent Lyapunov Stability [3.695480271934742]
We propose Control with Inherent Lyapunov Stability (CoILS) as a method for jointly learning parametric representations of a nonlinear dynamics model and a stabilizing controller from data.
In addition to the stabilizability of the learned dynamics guaranteed by our novel construction, we show that the learned controller stabilizes the true dynamics under certain assumptions on the fidelity of the learned dynamics.
arXiv Detail & Related papers (2023-03-06T14:21:42Z)
- Active Learning of Discrete-Time Dynamics for Uncertainty-Aware Model Predictive Control [46.81433026280051]
We present a self-supervised learning approach that actively models the dynamics of nonlinear robotic systems.
Our approach showcases high resilience and generalization capabilities by consistently adapting to unseen flight conditions.
arXiv Detail & Related papers (2022-10-23T00:45:05Z)
- Training Generative Adversarial Networks by Solving Ordinary Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
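The integration-error hypothesis can be illustrated on a toy two-player (Dirac-GAN-style) rotational vector field: explicit Euler integration, which corresponds to plain simultaneous gradient updates, spirals away from the equilibrium, while a higher-order Runge-Kutta step stays near the true orbit. This is a hedged sketch under illustrative assumptions (the field, step size, and iteration count are not the paper's experiments):

```python
import numpy as np

def gan_field(z):
    """Dirac-GAN-style two-player vector field: parameters rotate around
    the equilibrium at the origin (theta' = -phi, phi' = theta)."""
    theta, phi = z
    return np.array([-phi, theta])

def euler_step(z, h):
    """Explicit Euler = plain simultaneous gradient descent/ascent."""
    return z + h * gan_field(z)

def rk4_step(z, h):
    """Classical 4th-order Runge-Kutta step on the same field."""
    k1 = gan_field(z)
    k2 = gan_field(z + 0.5 * h * k1)
    k3 = gan_field(z + 0.5 * h * k2)
    k4 = gan_field(z + h * k3)
    return z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

z_euler = np.array([1.0, 0.0])
z_rk4 = np.array([1.0, 0.0])
for _ in range(200):
    z_euler = euler_step(z_euler, 0.1)
    z_rk4 = rk4_step(z_rk4, 0.1)
# Euler spirals outward (integration error accumulates multiplicatively);
# RK4 stays near the true orbit of radius 1.
```

The instability here comes purely from the discretization, not from the underlying continuous dynamics, which is exactly the paper's hypothesis.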
arXiv Detail & Related papers (2020-10-28T15:23:49Z)
- Learning Unstable Dynamical Systems with Time-Weighted Logarithmic Loss [20.167719985846002]
We look into the dynamics of the gradient descent algorithm and pinpoint what causes the difficulty of learning unstable systems.
We introduce a time-weighted logarithmic loss function to fix this imbalance and demonstrate its effectiveness in learning unstable systems.
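One plausible instantiation of such a loss (an assumption for illustration, not necessarily the paper's exact form) compresses exponentially growing prediction errors with a logarithm and weights earlier time steps more heavily, since errors of an unstable system inflate over the rollout:

```python
import numpy as np

def time_weighted_log_loss(pred, target, decay=0.5):
    """Hypothetical sketch of a time-weighted logarithmic loss:
    per-step squared errors are compressed with log1p, and step t
    receives weight decay**t so early steps dominate the objective."""
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    errs = np.sum((pred - target) ** 2, axis=-1)   # error at each time step
    weights = decay ** np.arange(len(errs))        # emphasize early steps
    return float(np.sum(weights * np.log1p(errs)) / np.sum(weights))
```

With this weighting, the same pointwise error is penalized more when it occurs early in the trajectory, counteracting the tendency of late, amplified errors to dominate an unweighted squared loss.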
arXiv Detail & Related papers (2020-07-10T06:28:05Z)
- Learning Stable Deep Dynamics Models [91.90131512825504]
We propose an approach for learning dynamical systems that are guaranteed to be stable over the entire state space.
We show that such learning systems are able to model simple dynamical systems and can be combined with additional deep generative models to learn complex dynamics.
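The core idea can be sketched with a projection step: given a nominal dynamics prediction and a Lyapunov function V, project the prediction onto the half-space where V provably decreases. This is a minimal toy version assuming a fixed quadratic V (the paper learns V jointly with the dynamics), so names and constants here are illustrative:

```python
import numpy as np

def V(x):
    """Quadratic Lyapunov candidate V(x) = ||x||^2 (the paper learns a convex V)."""
    return float(x @ x)

def grad_V(x):
    return 2.0 * x

def stabilize(f, x, alpha=0.1):
    """Project the nominal prediction f(x) so that
    dV/dt = grad_V(x) . f_stab(x) <= -alpha * V(x),
    which guarantees V decreases along trajectories."""
    fx, g = f(x), grad_V(x)
    violation = g @ fx + alpha * V(x)
    if violation > 0.0:
        fx = fx - (violation / (g @ g + 1e-12)) * g
    return fx

nominal = lambda x: x.copy()              # unstable nominal model: x' = +x
x = np.array([1.0, 1.0])
for _ in range(100):
    x = x + 0.05 * stabilize(nominal, x)  # Euler-integrate the projected field
# The norm of x shrinks even though the nominal dynamics are expanding,
# because the projection enforces the Lyapunov decrease condition by construction.
```

Stability is thus guaranteed over the entire state space regardless of how wrong the nominal model is, which is the sense in which the learned system is "guaranteed to be stable".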
arXiv Detail & Related papers (2020-01-17T00:04:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.