Entropy-based Stability-Plasticity for Lifelong Learning
- URL: http://arxiv.org/abs/2204.09517v1
- Date: Mon, 18 Apr 2022 22:58:49 GMT
- Title: Entropy-based Stability-Plasticity for Lifelong Learning
- Authors: Vladimir Araujo, Julio Hurtado, Alvaro Soto, Marie-Francine Moens
- Abstract summary: We propose Entropy-based Stability-Plasticity (ESP) to address the stability-plasticity dilemma in neural networks.
Our approach can decide dynamically how much each model layer should be modified via a plasticity factor.
In some cases, it is possible to freeze layers during training, which speeds up training.
- Score: 17.40355682488805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to continuously learn remains elusive for deep learning models.
Unlike humans, models cannot accumulate knowledge in their weights when
learning new tasks, mainly due to an excess of plasticity and the low incentive
to reuse weights when training a new task. To address the stability-plasticity
dilemma in neural networks, we propose a novel method called Entropy-based
Stability-Plasticity (ESP). Our approach can decide dynamically how much each
model layer should be modified via a plasticity factor. We incorporate branch
layers and an entropy-based criterion into the model to find such a factor. Our
experiments in the domains of natural language and vision show the
effectiveness of our approach in leveraging prior knowledge by reducing
interference. Also, in some cases, it is possible to freeze layers during
training, which speeds up training.
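The abstract describes per-layer plasticity factors obtained from branch layers and an entropy criterion. The following is a minimal sketch of that idea, assuming a linear branch head, a normalized-entropy mapping to the factor, and gradient scaling via hooks; these choices are illustrative, not the authors' reference implementation.
```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchedLayer(nn.Module):
    """A backbone layer plus a small branch head used to estimate how
    confident the current weights already are on the incoming batch."""
    def __init__(self, in_dim, out_dim, num_classes):
        super().__init__()
        self.layer = nn.Linear(in_dim, out_dim)
        self.branch = nn.Linear(out_dim, num_classes)  # auxiliary classifier

    def plasticity_factor(self, h):
        # High entropy -> the layer is uncertain about the new data -> allow change.
        # Low entropy  -> the layer already encodes useful knowledge -> protect it.
        probs = F.softmax(self.branch(h), dim=-1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(-1).mean()
        return (entropy / math.log(probs.size(-1))).clamp(0.0, 1.0)  # factor in [0, 1]

    def forward(self, x):
        h = torch.relu(self.layer(x))
        return h, self.plasticity_factor(h)

# A factor near 0 effectively freezes the layer, which is where the reported
# training speed-up can come from; here the factor simply rescales gradients.
layer = BranchedLayer(128, 256, num_classes=10)
h, factor = layer(torch.randn(32, 128))
for p in layer.layer.parameters():
    p.register_hook(lambda g, f=factor.detach(): f * g)
```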
Related papers
- Neuroplastic Expansion in Deep Reinforcement Learning [9.297543779239826]
We propose a novel approach, Neuroplastic Expansion (NE), inspired by cortical expansion in cognitive science.
NE maintains learnability and adaptability throughout the entire training process by dynamically growing the network from a smaller initial size to its full dimension.
Our method is designed with three key components: (1) elastic neuron generation based on potential gradients, (2) dormant neuron pruning to optimize network expressivity, and (3) neuron consolidation via experience review.
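As a rough illustration of the grow-and-prune cycle described above, here is a hedged NumPy sketch of dormant-neuron pruning followed by expansion toward the full width; the dormancy threshold and growth rule are assumptions, and the paper's gradient-based generation and experience-review consolidation are not reproduced.
```python
import numpy as np

def dormant_mask(activations, tau=0.01):
    """A hidden unit is 'dormant' if its mean absolute activation is a tiny
    fraction of the layer average (cf. dormant-neuron pruning)."""
    score = np.abs(activations).mean(axis=0)
    return score < tau * score.mean()

def expand_layer(weights, n_new, scale=0.01):
    """Grow a layer toward its full dimension by appending freshly
    initialized output units."""
    new_rows = scale * np.random.randn(n_new, weights.shape[1])
    return np.vstack([weights, new_rows])

# Example: prune dormant units, then grow back toward the full width.
W = np.random.randn(64, 128)                     # current (small) layer: 64 units
acts = np.maximum(0, np.random.randn(256, 64))   # ReLU activations on a batch
keep = ~dormant_mask(acts)
W = W[keep]                                      # pruning step
W = expand_layer(W, n_new=128 - W.shape[0])      # expansion toward full size (128)
```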
arXiv Detail & Related papers (2024-10-10T14:51:14Z)
- Neuromimetic metaplasticity for adaptive continual learning [2.1749194587826026]
We propose a metaplasticity model inspired by human working memory to achieve catastrophic forgetting-free continual learning.
A key aspect of our approach involves implementing distinct types of synapses from stable to flexible, and randomly intermixing them to train synaptic connections with different degrees of flexibility.
The model achieved a balanced tradeoff between memory capacity and performance without requiring additional training or structural modifications.
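One simple way to picture the intermixing of stable and flexible synapses is a fixed per-weight flexibility scale applied to the update, as in the sketch below; this realization is an assumption, not the paper's metaplasticity rule.
```python
import numpy as np

rng = np.random.default_rng(0)
W = 0.01 * rng.standard_normal((256, 128))

# Randomly assign each synapse a type: stable (rarely changes) ... fully flexible.
flexibility_levels = np.array([0.0, 0.1, 0.5, 1.0])
synapse_type = rng.integers(0, len(flexibility_levels), size=W.shape)
flex = flexibility_levels[synapse_type]

def sgd_step(W, grad, lr=0.1):
    # Stable synapses barely move, flexible ones learn at the full rate.
    return W - lr * flex * grad

W = sgd_step(W, grad=rng.standard_normal(W.shape))  # placeholder gradient
```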
arXiv Detail & Related papers (2024-07-09T12:21:35Z)
- Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning [113.89327264634984]
Few-shot class-incremental learning (FSCIL) confronts the challenge of integrating new classes into a model with minimal training samples.
Traditional methods widely adopt static adaptation relying on a fixed parameter space to learn from data that arrive sequentially.
We propose a dual selective SSM projector that dynamically adjusts the projection parameters based on the intermediate features for dynamic adaptation.
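As a loose stand-in for that idea, the sketch below shows a projection whose output is modulated by an input-dependent gate; the `SelectiveProjector` name and the gating form are assumptions, and the paper's Mamba-based dual selective SSM design is considerably more involved.
```python
import torch
import torch.nn as nn

class SelectiveProjector(nn.Module):
    def __init__(self, dim, proj_dim):
        super().__init__()
        self.base = nn.Linear(dim, proj_dim)
        # A small network produces a per-sample modulation of the projection,
        # so the mapping adapts to the intermediate features it receives.
        self.modulator = nn.Linear(dim, proj_dim)

    def forward(self, feats):
        gate = torch.sigmoid(self.modulator(feats))  # input-dependent gate
        return gate * self.base(feats)

proj = SelectiveProjector(dim=512, proj_dim=128)
out = proj(torch.randn(8, 512))  # 8 intermediate feature vectors
```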
arXiv Detail & Related papers (2024-07-08T17:09:39Z)
- InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning [12.004172212239848]
Continual learning requires the model to learn multiple tasks sequentially.
In this work, we propose a new PEFT method, called interference-free low-rank adaptation (InfLoRA), for continual learning.
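Below is a minimal LoRA-style sketch of confining per-task updates to a low-rank subspace while the pre-trained weight stays frozen; InfLoRA's specific construction of an interference-free subspace is not reproduced, and the hyperparameters are placeholders.
```python
import torch
import torch.nn as nn

class LowRankAdapter(nn.Module):
    def __init__(self, linear: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = linear
        for p in self.base.parameters():
            p.requires_grad_(False)                # frozen pre-trained layer
        out_dim, in_dim = linear.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_dim, rank))  # zero init: no drift at start

    def forward(self, x):
        # Only the low-rank update B @ A is trained for the new task.
        return self.base(x) + x @ self.A.t() @ self.B.t()

adapter = LowRankAdapter(nn.Linear(512, 512), rank=8)
y = adapter(torch.randn(4, 512))
```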
arXiv Detail & Related papers (2024-03-30T03:16:37Z)
- Exploring Model Transferability through the Lens of Potential Energy [78.60851825944212]
Transfer learning has become crucial in computer vision tasks due to the vast availability of pre-trained deep learning models.
Existing methods for measuring the transferability of pre-trained models rely on statistical correlations between encoded static features and task labels.
We present an insightful physics-inspired approach named PED to address these challenges.
arXiv Detail & Related papers (2023-08-29T07:15:57Z)
- New Insights on Relieving Task-Recency Bias for Online Class Incremental Learning [37.888061221999294]
Among all settings, online class-incremental learning (OCIL) is more challenging and is encountered more frequently in the real world.
To strike a preferable trade-off between stability and plasticity, we propose an Adaptive Focus Shifting algorithm.
arXiv Detail & Related papers (2023-02-16T11:52:00Z)
- ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of DNN-based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- A Spiking Neuron Synaptic Plasticity Model Optimized for Unsupervised Learning [0.0]
Spiking neural networks (SNN) are considered a promising basis for performing all kinds of learning tasks: unsupervised, supervised, and reinforcement learning.
Learning in SNN is implemented through synaptic plasticity: rules that determine the dynamics of synaptic weights, usually depending on the activity of the pre- and post-synaptic neurons.
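For orientation, a pair-based STDP rule is the textbook example of such activity-dependent plasticity; the sketch below illustrates it with assumed constants and is not the specific model optimized in the paper.
```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Update one synaptic weight from a single pre/post spike-time pair (ms)."""
    dt = t_post - t_pre
    if dt > 0:        # pre fires before post -> strengthen (LTP)
        w += a_plus * np.exp(-dt / tau)
    else:             # post fires before pre -> weaken (LTD)
        w -= a_minus * np.exp(dt / tau)
    return float(np.clip(w, 0.0, 1.0))

w = 0.5
w = stdp_update(w, t_pre=10.0, t_post=15.0)   # causal pair: weight increases
w = stdp_update(w, t_pre=25.0, t_post=18.0)   # anti-causal pair: weight decreases
```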
arXiv Detail & Related papers (2021-11-12T15:26:52Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable, resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
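One possible way to quantify such diversity is the mean pairwise cosine similarity between neurons' activation patterns over a batch, as in the sketch below; the measure actually used in the paper may differ.
```python
import numpy as np

def neuron_diversity(activations, eps=1e-8):
    """activations: (batch, hidden) matrix; each column is one neuron's responses."""
    cols = activations.T                                    # (hidden, batch)
    norms = np.linalg.norm(cols, axis=1, keepdims=True) + eps
    cos = (cols / norms) @ (cols / norms).T                 # pairwise cosine similarity
    n = cos.shape[0]
    off_diag = (cos.sum() - np.trace(cos)) / (n * (n - 1))  # mean off-diagonal similarity
    return 1.0 - off_diag                                   # higher = more diverse

acts = np.maximum(0, np.random.randn(256, 64))              # ReLU activations on a batch
print(f"diversity: {neuron_diversity(acts):.3f}")
```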
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- Understanding the Role of Training Regimes in Continual Learning [51.32945003239048]
Catastrophic forgetting affects the training of neural networks, limiting their ability to learn multiple tasks sequentially.
We study the effect of dropout, learning rate decay, and batch size on forming training regimes that widen the tasks' local minima.
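For concreteness, a hedged PyTorch illustration of those regime knobs (dropout, learning-rate decay, batch size) follows; the specific values are placeholders, not the paper's recommended settings.
```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Dropout(p=0.25),            # dropout is one of the studied regime choices
    nn.Linear(256, 10),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Decaying the learning rate over training is one of the knobs the paper links
# to wider local minima and, in turn, less forgetting.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)
batch_size = 64                    # smaller batches add gradient noise
```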
arXiv Detail & Related papers (2020-06-12T06:00:27Z)