Adaptive Variance-Penalized Continual Learning with Fisher Regularization
- URL: http://arxiv.org/abs/2508.16632v1
- Date: Fri, 15 Aug 2025 21:49:28 GMT
- Title: Adaptive Variance-Penalized Continual Learning with Fisher Regularization
- Authors: Krisanu Sarkar
- Abstract summary: This work presents a novel continual learning framework that integrates Fisher-weighted asymmetric regularization of parameter variances. Our method dynamically modulates regularization intensity according to parameter uncertainty, achieving enhanced stability and performance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The persistent challenge of catastrophic forgetting in neural networks has motivated extensive research in continual learning. This work presents a novel continual learning framework that integrates Fisher-weighted asymmetric regularization of parameter variances within a variational learning paradigm. Our method dynamically modulates regularization intensity according to parameter uncertainty, achieving enhanced stability and performance. Comprehensive evaluations on standard continual learning benchmarks including SplitMNIST, PermutedMNIST, and SplitFashionMNIST demonstrate substantial improvements over existing approaches such as Variational Continual Learning and Elastic Weight Consolidation. The asymmetric variance penalty mechanism proves particularly effective in maintaining knowledge across sequential tasks while improving model accuracy. Experimental results show our approach not only boosts immediate task performance but also significantly mitigates knowledge degradation over time, effectively addressing the fundamental challenge of catastrophic forgetting in neural networks.
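The abstract names the ingredients (a variational posterior, Fisher weighting, and an asymmetric penalty on parameter variances) but does not state the loss itself. The following is only a minimal sketch of how such a regularizer could be assembled, not the paper's formulation. It assumes a mean-field Gaussian posterior, a diagonal Fisher estimate, and a squared variance-change penalty that is weighted more heavily when a variance grows than when it shrinks. The function names, the `up_weight` and `down_weight` parameters, and the direction of the asymmetry are all hypothetical choices made for illustration.

```python
import math


def gaussian_kl(mu, var, mu_prev, var_prev):
    """KL(q || q_prev) between two diagonal Gaussians, summed over parameters."""
    total = 0.0
    for m, v, mp, vp in zip(mu, var, mu_prev, var_prev):
        total += 0.5 * (math.log(vp / v) + (v + (m - mp) ** 2) / vp - 1.0)
    return total


def asymmetric_variance_penalty(var, var_prev, fisher,
                                up_weight=2.0, down_weight=0.5):
    """Fisher-weighted squared change in variance.

    ASSUMPTION (not from the paper): a variance increase is weighted by
    up_weight and a decrease by down_weight, so parameters with a large
    Fisher value are discouraged from becoming more uncertain.
    """
    total = 0.0
    for v, vp, f in zip(var, var_prev, fisher):
        diff = v - vp
        weight = up_weight if diff > 0 else down_weight
        total += f * weight * diff ** 2
    return total


def regularizer(mu, var, mu_prev, var_prev, fisher, strength=1.0):
    """Hypothetical combined term: KL to the previous posterior plus the penalty."""
    kl = gaussian_kl(mu, var, mu_prev, var_prev)
    penalty = asymmetric_variance_penalty(var, var_prev, fisher)
    return kl + strength * penalty


if __name__ == "__main__":
    # Toy values for two parameters; the first has the larger Fisher value.
    mu_prev, var_prev = [0.0, 1.0], [1.0, 1.0]
    mu, var = [0.1, 1.0], [1.5, 0.5]
    fisher = [3.0, 0.1]
    print(regularizer(mu, var, mu_prev, var_prev, fisher))
```

In a training loop this term would be added to the negative log-likelihood of the current task, with `mu_prev` and `var_prev` taken from the posterior learned on earlier tasks. How the paper actually modulates the regularization strength is not specified in the abstract.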
Related papers
- Learning to be Reproducible: Custom Loss Design for Robust Neural Networks [4.3094059981414405]
We propose a Custom Loss Function (CLF) that balances predictive accuracy with training stability.
CLF significantly improves training without sacrificing predictive performance.
These results establish CLF as an effective and efficient strategy for developing more stable, reliable and trustworthy neural networks.
arXiv Detail & Related papers (2026-01-02T05:31:08Z)
- On the Stability of Neural Networks in Deep Learning [3.843574434245427]
This thesis examines how neural networks respond to perturbations at both the input and parameter levels.
We study Lipschitz networks as a principled way to constrain sensitivity to perturbations, thereby improving generalization, adversarial robustness, and training stability.
arXiv Detail & Related papers (2025-10-29T08:38:43Z)
- MaP: A Unified Framework for Reliable Evaluation of Pre-training Dynamics [72.00014675808228]
Instability in the evaluation process of Large Language Models obscures true learning dynamics.
We introduce MaP, a framework that integrates Merging and the Pass@k metric.
Experiments show that MaP yields significantly smoother performance curves, reduces inter-run variance, and ensures more consistent rankings.
arXiv Detail & Related papers (2025-10-10T11:40:27Z)
- Noradrenergic-inspired gain modulation attenuates the stability gap in joint training [44.99833362998488]
Studies in continual learning have identified a transient drop in performance on mastered tasks when assimilating new ones, known as the stability gap.
We argue that it reflects an imbalance between rapid adaptation and robust retention at task boundaries.
Inspired by locus coeruleus mediated noradrenergic bursts, we propose uncertainty-modulated gain dynamics.
arXiv Detail & Related papers (2025-07-18T16:34:06Z)
- Overcoming catastrophic forgetting in neural networks [0.0]
Catastrophic forgetting is the primary challenge that hinders continual learning.
Elastic Weight Consolidation (EWC) is a regularization-based approach inspired by synaptic consolidation in biological neural systems.
Our results confirm previous findings that EWC significantly reduces forgetting compared to naive training.
arXiv Detail & Related papers (2025-07-14T17:04:05Z)
- Continual Learning in Vision-Language Models via Aligned Model Merging [84.47520899851557]
We present a new perspective based on model merging to maintain stability while still retaining plasticity.
To maximize the effectiveness of the merging process, we propose a simple mechanism that promotes learning aligned weights with previous ones.
arXiv Detail & Related papers (2025-05-30T20:52:21Z)
- Temporal-Difference Variational Continual Learning [89.32940051152782]
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations.
Our approach effectively mitigates Catastrophic Forgetting, outperforming strong Variational CL methods.
arXiv Detail & Related papers (2024-10-10T10:58:41Z)
- Regularization for Adversarial Robust Learning [18.46110328123008]
We develop a novel approach to adversarial training that integrates $\phi$-divergence regularization into the distributionally robust risk function.
This regularization brings a notable improvement in computation compared with the original formulation.
We validate our proposed method in supervised learning, reinforcement learning, and contextual learning and showcase its state-of-the-art performance against various adversarial attacks.
arXiv Detail & Related papers (2024-08-19T03:15:41Z)
- Learning Continually by Spectral Regularization [45.55508032009977]
Continual learning algorithms seek to mitigate loss of plasticity by sustaining good performance while maintaining network trainability.
We develop a new technique for improving continual learning inspired by the observation that the singular values of the neural network parameters at initialization are an important factor for trainability during early phases of learning.
We present an experimental analysis that shows how the proposed spectral regularizer can sustain trainability and performance across a range of model architectures in continual supervised and reinforcement learning settings.
arXiv Detail & Related papers (2024-06-10T21:34:43Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce the Fisher Information Matrix (FIM) to measure the informativeness of the evidence carried by each sample, and use it to dynamically reweight the objective loss terms so that the network focuses more on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
- Training Generative Adversarial Networks by Solving Ordinary Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
arXiv Detail & Related papers (2020-10-28T15:23:49Z)
- Adaptive Gradient Method with Resilience and Momentum [120.83046824742455]
We propose an Adaptive Gradient Method with Resilience and Momentum (AdaRem).
AdaRem adjusts the parameter-wise learning rate according to whether the direction of a parameter's past changes is aligned with the direction of the current gradient.
Our method outperforms previous adaptive learning rate-based algorithms in terms of the training speed and the test error.
arXiv Detail & Related papers (2020-10-21T14:49:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.