Automatic Stability and Recovery for Neural Network Training
- URL: http://arxiv.org/abs/2601.17483v1
- Date: Sat, 24 Jan 2026 15:14:54 GMT
- Title: Automatic Stability and Recovery for Neural Network Training
- Authors: Barak Or
- Abstract summary: Training modern neural networks is increasingly fragile, with rare but severe destabilizing updates often causing irreversible divergence or silent degradation. Existing optimization methods rely on preventive mechanisms embedded within the optimizer, offering limited ability to detect and recover from instability once it occurs. We introduce a supervisory runtime stability framework that treats optimization as a controlled stochastic process.
- Score: 1.9544213396776273
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training modern neural networks is increasingly fragile, with rare but severe destabilizing updates often causing irreversible divergence or silent performance degradation. Existing optimization methods primarily rely on preventive mechanisms embedded within the optimizer, offering limited ability to detect and recover from instability once it occurs. We introduce a supervisory runtime stability framework that treats optimization as a controlled stochastic process. By isolating an innovation signal derived from secondary measurements, such as validation probes, the framework enables automatic detection and recovery from destabilizing updates without modifying the underlying optimizer. We provide theoretical runtime safety guarantees that formalize bounded degradation and recovery. Our implementation incurs minimal overhead and is compatible with memory-constrained training settings.
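The abstract describes a supervisor that sits outside the optimizer, watches a secondary measurement, and recovers from a destabilizing update. The following is a minimal Python sketch of that general idea, not the paper's algorithm: it assumes the innovation is the gap between an observed validation-probe loss and an exponential moving average of past probe losses, and that recovery means rolling back to the last accepted parameters. The names `StabilitySupervisor`, `probe_loss`, and `run_demo`, and all thresholds, are hypothetical.

```python
import copy
import math


class StabilitySupervisor:
    """Hypothetical supervisor: rejects updates whose probe loss spikes."""

    def __init__(self, threshold=3.0, decay=0.9, warmup=5, min_scale=1e-3):
        self.threshold = threshold  # allowed innovation, in units of scale
        self.decay = decay          # EMA decay for the running statistics
        self.warmup = warmup        # accepted steps before checks begin
        self.min_scale = min_scale  # floor so a flat history is not hypersensitive
        self.mean = None            # EMA prediction of the probe loss
        self.var = 0.0              # EMA of squared innovations
        self.accepted = 0
        self.checkpoint = None      # last accepted parameters

    def step(self, params, probe_loss):
        """Return (parameters to continue from, whether a rollback happened)."""
        if self.mean is None:
            self.mean = probe_loss
            self.checkpoint = copy.deepcopy(params)
            self.accepted = 1
            return params, False

        # Innovation: observed probe loss minus the running prediction.
        innovation = probe_loss - self.mean
        scale = max(math.sqrt(self.var), self.min_scale)

        if self.accepted >= self.warmup and innovation > self.threshold * scale:
            # Reject the update; statistics are left untouched by the outlier.
            return copy.deepcopy(self.checkpoint), True

        # Accept the update and refresh statistics and checkpoint.
        self.mean = self.decay * self.mean + (1 - self.decay) * probe_loss
        self.var = self.decay * self.var + (1 - self.decay) * innovation ** 2
        self.checkpoint = copy.deepcopy(params)
        self.accepted += 1
        return params, False


def probe_loss(params):
    """Toy stand-in for a validation probe: loss of f(w) = w^2."""
    return params[0] ** 2


def run_demo(steps=40, spike_at=20):
    """Gradient descent on w^2 with one injected destabilizing update."""
    supervisor = StabilitySupervisor()
    params = [5.0]
    rollbacks = []
    for t in range(steps):
        w = params[0]
        w = w - 0.1 * 2 * w          # unmodified optimizer step
        if t == spike_at:
            w += 100.0               # simulated rare, severe bad update
        candidate = [w]
        params, rolled_back = supervisor.step(candidate, probe_loss(candidate))
        if rolled_back:
            rollbacks.append(t)
    return params[0], rollbacks


if __name__ == "__main__":
    final_w, rollbacks = run_demo()
    print("rollback steps:", rollbacks)
    print("final w:", final_w)
```

In this toy run the optimizer step itself is never changed; the supervisor only decides whether to keep the result. In a real training loop the in-memory copy would likely be replaced by whatever checkpointing the framework provides, and the threshold would need tuning against the noise level of the probe.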
Related papers
- ModalImmune: Immunity Driven Unlearning via Self Destructive Training [21.940530514137947]
ModalImmune enforces modality immunity by intentionally collapsing selected modality information during training. The framework combines a spectrum-adaptive collapse regularizer, an information-gain guided controller for targeted interventions, and curvature-aware gradient masking to stabilize destructive updates. Empirical evaluation on standard multimodal benchmarks demonstrates that ModalImmune improves resilience to modality removal and corruption while retaining convergence stability and reconstruction capacity.
arXiv Detail & Related papers (2026-02-18T05:35:32Z) - Safe Reinforcement Learning via Recovery-based Shielding with Gaussian Process Dynamics Models [57.006252510102506]
Reinforcement learning (RL) is a powerful framework for optimal decision-making and control but often lacks provable guarantees for safety-critical applications. We introduce a novel recovery-based shielding framework that enables safe RL with a provable safety lower bound for unknown and non-linear continuous dynamical systems.
arXiv Detail & Related papers (2026-02-12T22:03:35Z) - TeleBoost: A Systematic Alignment Framework for High-Fidelity, Controllable, and Robust Video Generation [45.864084191741135]
Post-training is the decisive step for converting a pretrained video generator into a production-oriented model. This report presents a systematic post-training framework that organizes supervised policy shaping, reward-driven reinforcement learning, and preference-based refinement.
arXiv Detail & Related papers (2026-02-07T15:49:25Z) - Stability as a Liability: Systematic Breakdown of Linguistic Structure in LLMs [5.96875296117642]
We show that stable parameter trajectories lead stationary solutions to minimize the forward KL divergence to the empirical distribution. We empirically validate this effect using a controlled feedback-based training framework. It indicates that optimization stability and generative expressivity are not inherently aligned, and that stability alone is an insufficient indicator of generative quality.
arXiv Detail & Related papers (2026-01-26T15:34:50Z) - GRPO-Guard: Mitigating Implicit Over-Optimization in Flow Matching via Regulated Clipping [63.33669214116784]
GRPO-Guard is a simple yet effective enhancement to existing GRPO frameworks. It restores a balanced and step-consistent importance ratio, ensuring that PPO clipping properly constrains harmful updates. It substantially mitigates implicit over-optimization without relying on heavy KL regularization.
arXiv Detail & Related papers (2025-10-25T14:51:17Z) - NIRVANA: Structured pruning reimagined for large language models compression [50.651730342011014]
We introduce NIRVANA, a novel pruning method designed to balance immediate zero-shot preservation accuracy with robust fine-tuning. To further address the unique challenges posed by structured pruning, NIRVANA incorporates an adaptive sparsity allocation mechanism across layers and modules. Experiments conducted on Llama3, Qwen, and T5 models demonstrate that NIRVANA outperforms existing structured pruning methods under equivalent sparsity constraints.
arXiv Detail & Related papers (2025-09-17T17:59:00Z) - Convergence and Generalization of Anti-Regularization for Parametric Models [0.0]
Anti-regularization introduces a reward term with a reversed sign into the loss function. We formalize spectral safety conditions and trust-region constraints. We design a lightweight safeguard that combines a projection operator with gradient clipping to guarantee stable intervention.
arXiv Detail & Related papers (2025-08-24T15:34:17Z) - Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory [0.18749305679160366]
We introduce "recovery time" as a new metric of resilience in order to quantify the time required for an LSTM to return to its normal state after anomalous inputs. Experimental validation on simple models demonstrates the effectiveness of our resilience estimation and control methods.
arXiv Detail & Related papers (2025-05-23T10:05:26Z) - Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain the stability in terms of zero-shot generalization of VLMs, dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the model in the few-shot image classification scenario.
arXiv Detail & Related papers (2024-07-11T10:35:53Z) - Exploiting Diffusion Prior for Real-World Image Super-Resolution [75.5898357277047]
We present a novel approach to leverage prior knowledge encapsulated in pre-trained text-to-image diffusion models for blind super-resolution.
By employing our time-aware encoder, we can achieve promising restoration results without altering the pre-trained synthesis model.
arXiv Detail & Related papers (2023-05-11T17:55:25Z)