Entropy-based Stability-Plasticity for Lifelong Learning
- URL: http://arxiv.org/abs/2204.09517v1
- Date: Mon, 18 Apr 2022 22:58:49 GMT
- Title: Entropy-based Stability-Plasticity for Lifelong Learning
- Authors: Vladimir Araujo, Julio Hurtado, Alvaro Soto, Marie-Francine Moens
- Abstract summary: We propose Entropy-based Stability-Plasticity (ESP) to address the stability-plasticity dilemma in neural networks.
Our approach can decide dynamically how much each model layer should be modified via a plasticity factor.
In some cases, it is possible to freeze layers during training, which speeds up training.
- Score: 17.40355682488805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to continuously learn remains elusive for deep learning models.
Unlike humans, models cannot accumulate knowledge in their weights when
learning new tasks, mainly due to an excess of plasticity and the low incentive
to reuse weights when training a new task. To address the stability-plasticity
dilemma in neural networks, we propose a novel method called Entropy-based
Stability-Plasticity (ESP). Our approach can decide dynamically how much each
model layer should be modified via a plasticity factor. We incorporate branch
layers and an entropy-based criterion into the model to find such a factor. Our
experiments in the domains of natural language and vision show the
effectiveness of our approach in leveraging prior knowledge by reducing
interference. Also, in some cases, it is possible to freeze layers during
training, which speeds up training.
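The abstract describes per-layer plasticity factors obtained from branch layers and an entropy criterion. The following is a minimal sketch of that idea, assuming a linear branch head, a normalized-entropy mapping to the factor, and gradient scaling via hooks; these choices are illustrative, not the authors' reference implementation.
```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchedLayer(nn.Module):
    """A backbone layer plus a small branch head used to estimate how
    confident the current weights already are on the incoming batch."""
    def __init__(self, in_dim, out_dim, num_classes):
        super().__init__()
        self.layer = nn.Linear(in_dim, out_dim)
        self.branch = nn.Linear(out_dim, num_classes)  # auxiliary classifier

    def plasticity_factor(self, h):
        # High entropy -> the layer is uncertain about the new data -> allow change.
        # Low entropy  -> the layer already encodes useful knowledge -> protect it.
        probs = F.softmax(self.branch(h), dim=-1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(-1).mean()
        return (entropy / math.log(probs.size(-1))).clamp(0.0, 1.0)  # factor in [0, 1]

    def forward(self, x):
        h = torch.relu(self.layer(x))
        return h, self.plasticity_factor(h)

# A factor near 0 effectively freezes the layer, which is where the reported
# training speed-up can come from; here the factor simply rescales gradients.
layer = BranchedLayer(128, 256, num_classes=10)
h, factor = layer(torch.randn(32, 128))
for p in layer.layer.parameters():
    p.register_hook(lambda g, f=factor.detach(): f * g)
```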
Related papers
- Neuroplastic Expansion in Deep Reinforcement Learning [9.297543779239826]
We propose a novel approach, Neuroplastic Expansion (NE), inspired by cortical expansion in cognitive science.
NE maintains learnability and adaptability throughout the entire training process by dynamically growing the network from a smaller initial size to its full dimension.
Our method is designed with three key components: (1) elastic neuron generation based on potential gradients, (2) dormant neuron pruning to optimize network expressivity, and (3) neuron consolidation via experience review.
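As a rough illustration of the grow-and-prune cycle described above, here is a hedged NumPy sketch of dormant-neuron pruning followed by expansion toward the full width; the dormancy threshold and growth rule are assumptions, and the paper's gradient-based generation and experience-review consolidation are not reproduced.
```python
import numpy as np

def dormant_mask(activations, tau=0.01):
    """A hidden unit is 'dormant' if its mean absolute activation is a tiny
    fraction of the layer average (cf. dormant-neuron pruning)."""
    score = np.abs(activations).mean(axis=0)
    return score < tau * score.mean()

def expand_layer(weights, n_new, scale=0.01):
    """Grow a layer toward its full dimension by appending freshly
    initialized output units."""
    new_rows = scale * np.random.randn(n_new, weights.shape[1])
    return np.vstack([weights, new_rows])

# Example: prune dormant units, then grow back toward the full width.
W = np.random.randn(64, 128)                     # current (small) layer: 64 units
acts = np.maximum(0, np.random.randn(256, 64))   # ReLU activations on a batch
keep = ~dormant_mask(acts)
W = W[keep]                                      # pruning step
W = expand_layer(W, n_new=128 - W.shape[0])      # expansion toward full size (128)
```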
arXiv Detail & Related papers (2024-10-10T14:51:14Z)
- Neuromimetic metaplasticity for adaptive continual learning [2.1749194587826026]
We propose a metaplasticity model inspired by human working memory to achieve catastrophic forgetting-free continual learning.
A key aspect of our approach involves implementing distinct types of synapses from stable to flexible, and randomly intermixing them to train synaptic connections with different degrees of flexibility.
The model achieved a balanced tradeoff between memory capacity and performance without requiring additional training or structural modifications.
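One simple way to picture the intermixing of stable and flexible synapses is a fixed per-weight flexibility scale applied to the update, as in the sketch below; this realization is an assumption, not the paper's metaplasticity rule.
```python
import numpy as np

rng = np.random.default_rng(0)
W = 0.01 * rng.standard_normal((256, 128))

# Randomly assign each synapse a type: stable (rarely changes) ... fully flexible.
flexibility_levels = np.array([0.0, 0.1, 0.5, 1.0])
synapse_type = rng.integers(0, len(flexibility_levels), size=W.shape)
flex = flexibility_levels[synapse_type]

def sgd_step(W, grad, lr=0.1):
    # Stable synapses barely move, flexible ones learn at the full rate.
    return W - lr * flex * grad

W = sgd_step(W, grad=rng.standard_normal(W.shape))  # placeholder gradient
```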
arXiv Detail & Related papers (2024-07-09T12:21:35Z)
- Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning [113.89327264634984]
Few-shot class-incremental learning (FSCIL) confronts the challenge of integrating new classes into a model with minimal training samples.
Traditional methods widely adopt static adaptation relying on a fixed parameter space to learn from data that arrive sequentially.
We propose a dual selective SSM projector that dynamically adjusts the projection parameters based on the intermediate features for dynamic adaptation.
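As a loose stand-in for that idea, the sketch below shows a projection whose output is modulated by an input-dependent gate; the `SelectiveProjector` name and the gating form are assumptions, and the paper's Mamba-based dual selective SSM design is considerably more involved.
```python
import torch
import torch.nn as nn

class SelectiveProjector(nn.Module):
    def __init__(self, dim, proj_dim):
        super().__init__()
        self.base = nn.Linear(dim, proj_dim)
        # A small network produces a per-sample modulation of the projection,
        # so the mapping adapts to the intermediate features it receives.
        self.modulator = nn.Linear(dim, proj_dim)

    def forward(self, feats):
        gate = torch.sigmoid(self.modulator(feats))  # input-dependent gate
        return gate * self.base(feats)

proj = SelectiveProjector(dim=512, proj_dim=128)
out = proj(torch.randn(8, 512))  # 8 intermediate feature vectors
```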
arXiv Detail & Related papers (2024-07-08T17:09:39Z)
- InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning [12.004172212239848]
Continual learning requires the model to learn multiple tasks sequentially.
In this work, we propose a new PEFT method, called interference-free low-rank adaptation (InfLoRA), for continual learning.
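Below is a minimal LoRA-style sketch of confining per-task updates to a low-rank subspace while the pre-trained weight stays frozen; InfLoRA's specific construction of an interference-free subspace is not reproduced, and the hyperparameters are placeholders.
```python
import torch
import torch.nn as nn

class LowRankAdapter(nn.Module):
    def __init__(self, linear: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = linear
        for p in self.base.parameters():
            p.requires_grad_(False)                # frozen pre-trained layer
        out_dim, in_dim = linear.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_dim, rank))  # zero init: no drift at start

    def forward(self, x):
        # Only the low-rank update B @ A is trained for the new task.
        return self.base(x) + x @ self.A.t() @ self.B.t()

adapter = LowRankAdapter(nn.Linear(512, 512), rank=8)
y = adapter(torch.randn(4, 512))
```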
arXiv Detail & Related papers (2024-03-30T03:16:37Z)
- Exploring Model Transferability through the Lens of Potential Energy [78.60851825944212]
Transfer learning has become crucial in computer vision tasks due to the vast availability of pre-trained deep learning models.
Existing methods for measuring the transferability of pre-trained models rely on statistical correlations between encoded static features and task labels.
We present an insightful physics-inspired approach named PED to address these challenges.
arXiv Detail & Related papers (2023-08-29T07:15:57Z)
- New Insights on Relieving Task-Recency Bias for Online Class Incremental Learning [37.888061221999294]
Among all settings, online class-incremental learning (OCIL) is more challenging and is encountered more frequently in the real world.
To strike a preferable trade-off between stability and plasticity, we propose an Adaptive Focus Shifting algorithm.
arXiv Detail & Related papers (2023-02-16T11:52:00Z)
- ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of DNN-based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- A Spiking Neuron Synaptic Plasticity Model Optimized for Unsupervised Learning [0.0]
Spiking neural networks (SNN) are considered a promising basis for performing all kinds of learning tasks: unsupervised, supervised, and reinforcement learning.
Learning in SNN is implemented through synaptic plasticity: rules that determine the dynamics of synaptic weights, usually depending on the activity of the pre- and post-synaptic neurons.
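For orientation, a pair-based STDP rule is the textbook example of such activity-dependent plasticity; the sketch below illustrates it with assumed constants and is not the specific model optimized in the paper.
```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Update one synaptic weight from a single pre/post spike-time pair (ms)."""
    dt = t_post - t_pre
    if dt > 0:        # pre fires before post -> strengthen (LTP)
        w += a_plus * np.exp(-dt / tau)
    else:             # post fires before pre -> weaken (LTD)
        w -= a_minus * np.exp(dt / tau)
    return float(np.clip(w, 0.0, 1.0))

w = 0.5
w = stdp_update(w, t_pre=10.0, t_post=15.0)   # causal pair: weight increases
w = stdp_update(w, t_pre=25.0, t_post=18.0)   # anti-causal pair: weight decreases
```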
arXiv Detail & Related papers (2021-11-12T15:26:52Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable, resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
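One possible way to quantify such diversity is the mean pairwise cosine similarity between neurons' activation patterns over a batch, as in the sketch below; the measure actually used in the paper may differ.
```python
import numpy as np

def neuron_diversity(activations, eps=1e-8):
    """activations: (batch, hidden) matrix; each column is one neuron's responses."""
    cols = activations.T                                    # (hidden, batch)
    norms = np.linalg.norm(cols, axis=1, keepdims=True) + eps
    cos = (cols / norms) @ (cols / norms).T                 # pairwise cosine similarity
    n = cos.shape[0]
    off_diag = (cos.sum() - np.trace(cos)) / (n * (n - 1))  # mean off-diagonal similarity
    return 1.0 - off_diag                                   # higher = more diverse

acts = np.maximum(0, np.random.randn(256, 64))              # ReLU activations on a batch
print(f"diversity: {neuron_diversity(acts):.3f}")
```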
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- Understanding the Role of Training Regimes in Continual Learning [51.32945003239048]
Catastrophic forgetting affects the training of neural networks, limiting their ability to learn multiple tasks sequentially.
We study the effect of dropout, learning rate decay, and batch size on forming training regimes that widen the tasks' local minima.
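For concreteness, a hedged PyTorch illustration of those regime knobs (dropout, learning-rate decay, batch size) follows; the specific values are placeholders, not the paper's recommended settings.
```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Dropout(p=0.25),            # dropout is one of the studied regime choices
    nn.Linear(256, 10),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Decaying the learning rate over training is one of the knobs the paper links
# to wider local minima and, in turn, less forgetting.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)
batch_size = 64                    # smaller batches add gradient noise
```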
arXiv Detail & Related papers (2020-06-12T06:00:27Z)