Online Continual Learning via the Meta-learning Update with Multi-scale
Knowledge Distillation and Data Augmentation
- URL: http://arxiv.org/abs/2209.06107v1
- Date: Mon, 12 Sep 2022 10:03:53 GMT
- Title: Online Continual Learning via the Meta-learning Update with Multi-scale
Knowledge Distillation and Data Augmentation
- Authors: Ya-nan Han, Jian-wei Liu
- Abstract summary: Continual learning aims to rapidly and continually learn the current task from a sequence of tasks.
One common limitation of experience-replay methods is the data imbalance between previous and current tasks.
We propose a novel framework called Meta-learning update via Multi-scale Knowledge Distillation and Data Augmentation.
- Score: 4.109784267309124
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual learning aims to rapidly and continually learn the current task
from a sequence of tasks.
Compared with other families of methods, those based on experience replay
have shown clear advantages in overcoming catastrophic forgetting. One common
limitation of these methods is the data imbalance between previous and
current tasks, which further aggravates forgetting. Moreover, effectively
addressing the stability-plasticity dilemma in this setting remains an
urgent open problem. In this paper, we overcome these challenges by
proposing a novel framework called Meta-learning update via Multi-scale
Knowledge Distillation and Data Augmentation (MMKDDA). Specifically, we apply
multi-scale knowledge distillation to capture the evolution of long-range and
short-range spatial relationships at different feature levels, alleviating the
problem of data imbalance. In addition, our method mixes samples from the
episodic memory with samples from the current task during online continual
training, reducing the side effects caused by the shift in probability
distribution. Moreover, we optimize the model via a meta-learning update that
depends on the number of tasks seen so far, which helps maintain a better
balance between stability and plasticity. Finally, an experimental evaluation
on four benchmark datasets shows the effectiveness of the proposed MMKDDA
framework against popular baselines, and ablation studies further analyze the
role of each component in our framework.
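To make the described procedure concrete, below is a minimal PyTorch-style sketch of one online replay step in the spirit of the abstract: the training batch mixes current-task samples with samples drawn from an episodic memory, and a multi-scale distillation loss aligns intermediate feature maps of the current model with those of a frozen snapshot of the previous model. The backbone, feature levels, loss weights, and helper names are illustrative assumptions, not the authors' released implementation, and the task-count-dependent meta-learning update is omitted.

```python
# Minimal sketch (assumed PyTorch setup), not the authors' implementation:
# one online step that (1) mixes episodic-memory samples into the current batch
# and (2) adds a multi-scale feature-distillation term against a frozen copy
# of the previous model.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiLevelNet(nn.Module):
    """Toy backbone exposing features at two scales plus logits (illustrative)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        f1 = self.block1(x)                      # fine-grained (short-range) features
        f2 = self.block2(f1)                     # coarser (long-range) features
        logits = self.head(f2.mean(dim=(2, 3)))  # global average pool -> classifier
        return logits, [f1, f2]


def replay_step(model, prev_model, optimizer, batch, memory, kd_weight=1.0, mem_ratio=0.5):
    """One online step: mix memory samples into the batch and add multi-scale KD."""
    x_cur, y_cur = batch
    if memory:  # draw stored (x, y) pairs and concatenate them with the current batch
        k = max(1, int(mem_ratio * x_cur.size(0)))
        xs, ys = zip(*random.sample(memory, min(k, len(memory))))
        x = torch.cat([x_cur, torch.stack(xs)])
        y = torch.cat([y_cur, torch.stack(ys)])
    else:
        x, y = x_cur, y_cur

    logits, feats = model(x)
    loss = F.cross_entropy(logits, y)

    if prev_model is not None:  # distill intermediate features from the frozen snapshot
        with torch.no_grad():
            _, prev_feats = prev_model(x)
        kd = sum(F.mse_loss(f, pf) for f, pf in zip(feats, prev_feats))
        loss = loss + kd_weight * kd

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the full method, this inner update would itself be wrapped in a meta-learning loop whose configuration depends on the number of tasks seen so far, which the sketch above leaves out.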
Related papers
- Temporal-Difference Variational Continual Learning [89.32940051152782]
A crucial capability of Machine Learning models in real-world applications is the ability to continuously learn new tasks.
In Continual Learning settings, models often struggle to balance learning new tasks with retaining previous knowledge.
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations.
arXiv Detail & Related papers (2024-10-10T10:58:41Z) - Multi-Epoch learning with Data Augmentation for Deep Click-Through Rate Prediction [53.88231294380083]
We introduce a novel Multi-Epoch learning with Data Augmentation (MEDA) framework, suitable for both non-continual and continual learning scenarios.
MEDA minimizes overfitting by reducing the dependency of the embedding layer on subsequent training data.
Our findings confirm that pre-trained layers can adapt to new embedding spaces, enhancing performance without overfitting.
arXiv Detail & Related papers (2024-06-27T04:00:15Z) - Federated Continual Learning Goes Online: Uncertainty-Aware Memory Management for Vision Tasks and Beyond [13.867793835583463]
We propose an uncertainty-aware memory-based approach to solve catastrophic forgetting.
We retrieve samples with specific characteristics, and - by retraining the model on such samples - we demonstrate the potential of this approach.
arXiv Detail & Related papers (2024-05-29T09:29:39Z) - Elastic Multi-Gradient Descent for Parallel Continual Learning [28.749215705746135]
We study the novel paradigm of Parallel Continual Learning (PCL) in dynamic multi-task scenarios.
PCL presents challenges due to the training of an unspecified number of tasks with varying learning progress.
We propose a memory editing mechanism guided by the gradient computed using EMGD to balance the training between old and new tasks.
arXiv Detail & Related papers (2024-01-02T06:26:25Z) - Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z) - Dissecting Continual Learning a Structural and Data Analysis [0.0]
Continual Learning is a field dedicated to devising algorithms able to achieve lifelong learning.
Deep learning methods can attain impressive results when the data modeled does not undergo a considerable distributional shift in subsequent learning sessions.
When such systems are exposed to this incremental setting, performance drops very quickly.
arXiv Detail & Related papers (2023-01-03T10:37:11Z) - On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z) - On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z) - Relational Experience Replay: Continual Learning by Adaptively Tuning
Task-wise Relationship [54.73817402934303]
We propose Experience Continual Replay (ERR), a bi-level learning framework that adaptively tunes task-wise relationships to achieve a better stability-plasticity trade-off.
ERR can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z) - Multiband VAE: Latent Space Partitioning for Knowledge Consolidation in
Continual Learning [14.226973149346883]
Acquiring knowledge about new data samples without forgetting previous ones is a critical problem of continual learning.
We propose a new method for unsupervised continual knowledge consolidation in generative models that relies on the partitioning of Variational Autoencoder's latent space.
On top of the standard continual learning evaluation benchmarks, we evaluate our method on a new knowledge consolidation scenario and show that the proposed approach outperforms state-of-the-art by up to twofold.
arXiv Detail & Related papers (2021-06-23T06:58:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.