Online Continual Learning via the Meta-learning Update with Multi-scale
Knowledge Distillation and Data Augmentation
- URL: http://arxiv.org/abs/2209.06107v1
- Date: Mon, 12 Sep 2022 10:03:53 GMT
- Title: Online Continual Learning via the Meta-learning Update with Multi-scale
Knowledge Distillation and Data Augmentation
- Authors: Ya-nan Han, Jian-wei Liu
- Abstract summary: Continual learning aims to rapidly and continually learn the current task from a sequence of tasks.
A common limitation of replay-based methods is the data imbalance between previous and current tasks.
We propose a novel framework called Meta-learning update via Multi-scale Knowledge Distillation and Data Augmentation (MMKDDA).
- Score: 4.109784267309124
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual learning aims to rapidly and continually learn the current task
from a sequence of tasks.
Compared with other families of methods, those based on experience replay
have shown clear advantages in overcoming catastrophic forgetting. One common
limitation of these methods is the data imbalance between the previous and
current tasks, which further aggravates forgetting. Moreover, how to
effectively resolve the stability-plasticity dilemma in this setting remains an
open problem. In this paper, we address these challenges by proposing a novel
framework called Meta-learning update via Multi-scale Knowledge Distillation and
Data Augmentation (MMKDDA). Specifically, we apply multi-scale knowledge
distillation to capture the evolution of long-range and short-range spatial
relationships at different feature levels, alleviating the problem of data
imbalance. In addition, our method mixes samples from the episodic memory with
those of the current task during online continual training, mitigating the side
effects of the shift in probability distribution. Furthermore, we optimize our
model via a meta-learning update conditioned on the number of tasks seen so far,
which helps maintain a better balance between stability and plasticity. Finally,
our experimental evaluation on four benchmark datasets shows the effectiveness
of the proposed MMKDDA framework against other popular baselines, and ablation
studies further analyze the role of each component in our framework.
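The replay-with-mixing idea in the abstract — drawing samples from an episodic memory and interpolating them with the current batch — can be sketched as a reservoir-sampled buffer plus mixup-style interpolation. This is a generic illustration under common continual-learning conventions, not the paper's implementation: the reservoir buffer policy, the Beta-distributed mixing coefficient, and the `multiscale_distill_loss` stand-in for the multi-scale distillation objective are all assumptions.

```python
import random
import numpy as np


class EpisodicMemory:
    """Fixed-size replay buffer filled by reservoir sampling (a common
    choice in replay-based continual learning; the paper's exact
    buffer-management policy may differ)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.data = []       # stored (x, y) pairs
        self.n_seen = 0      # total samples observed so far
        self.rng = random.Random(seed)

    def add(self, x, y):
        # Reservoir sampling: each seen sample is kept with prob capacity/n_seen.
        self.n_seen += 1
        if len(self.data) < self.capacity:
            self.data.append((x, y))
        else:
            j = self.rng.randrange(self.n_seen)
            if j < self.capacity:
                self.data[j] = (x, y)

    def sample(self, k):
        # Draw up to k stored pairs uniformly without replacement.
        return self.rng.sample(self.data, min(k, len(self.data)))


def mix_batches(current_x, memory_x, alpha=0.2, rng=None):
    """Mixup-style interpolation between current-task inputs and replayed
    inputs, smoothing the distribution shift between tasks."""
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)                      # mixing coefficient in (0, 1)
    n = min(len(current_x), len(memory_x))
    return lam * current_x[:n] + (1.0 - lam) * memory_x[:n], lam


def multiscale_distill_loss(student_feats, teacher_feats):
    """Mean-squared distance summed over several feature levels — a generic
    stand-in for a multi-scale distillation objective."""
    return sum(float(np.mean((s - t) ** 2))
               for s, t in zip(student_feats, teacher_feats))
```

In a training loop, one would typically `add` each incoming sample to the memory, `sample` a replay batch, and apply the mixed batch together with the distillation term as an extra loss; the relative weighting of the two terms is a tunable hyperparameter.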
Related papers
- Multi-Epoch learning with Data Augmentation for Deep Click-Through Rate Prediction [53.88231294380083]
We introduce a novel Multi-Epoch learning with Data Augmentation (MEDA) framework, suitable for both non-continual and continual learning scenarios.
MEDA minimizes overfitting by reducing the dependency of the embedding layer on subsequent training data.
Our findings confirm that pre-trained layers can adapt to new embedding spaces, enhancing performance without overfitting.
arXiv Detail & Related papers (2024-06-27T04:00:15Z) - Improving Data-aware and Parameter-aware Robustness for Continual Learning [3.480626767752489]
This paper finds that the insufficient robustness of existing methods arises from their ineffective handling of outliers.
We propose a Robust Continual Learning (RCL) method to address this issue.
The proposed method effectively maintains robustness and achieves new state-of-the-art (SOTA) results.
arXiv Detail & Related papers (2024-05-27T11:21:26Z) - Elastic Multi-Gradient Descent for Parallel Continual Learning [28.749215705746135]
We study the novel paradigm of Parallel Continual Learning (PCL) in dynamic multi-task scenarios.
PCL presents challenges due to the training of an unspecified number of tasks with varying learning progress.
We propose a memory editing mechanism, guided by gradients computed with Elastic Multi-Gradient Descent (EMGD), to balance training between old and new tasks.
arXiv Detail & Related papers (2024-01-02T06:26:25Z) - Order Matters in the Presence of Dataset Imbalance for Multilingual
Learning [53.74649778447903]
We present a simple yet effective method of pre-training on high-resource tasks, followed by fine-tuning on a mixture of high/low-resource tasks.
We show its improvements in neural machine translation (NMT) and multi-lingual language modeling.
arXiv Detail & Related papers (2023-12-11T05:46:57Z) - Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z) - New metrics for analyzing continual learners [27.868967961503962]
Continual Learning (CL) poses challenges to standard learning algorithms.
This stability-plasticity dilemma remains central to CL and multiple metrics have been proposed to adequately measure stability and plasticity separately.
We propose new metrics that account for the task's increasing difficulty.
arXiv Detail & Related papers (2023-09-01T13:53:33Z) - Dissecting Continual Learning a Structural and Data Analysis [0.0]
Continual Learning is a field dedicated to devising algorithms that achieve lifelong learning.
Deep learning methods can attain impressive results when the modeled data do not undergo a considerable distributional shift across learning sessions.
When such systems are exposed to this incremental setting, performance drops very quickly.
arXiv Detail & Related papers (2023-01-03T10:37:11Z) - On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z) - On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z) - Relational Experience Replay: Continual Learning by Adaptively Tuning
Task-wise Relationship [54.73817402934303]
We propose Experience Continual Replay (ERR), a bi-level learning framework that adaptively tunes task-wise relationships to achieve a better stability-plasticity tradeoff.
ERR can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z) - Multiband VAE: Latent Space Partitioning for Knowledge Consolidation in
Continual Learning [14.226973149346883]
Acquiring knowledge about new data samples without forgetting previous ones is a critical problem of continual learning.
We propose a new method for unsupervised continual knowledge consolidation in generative models that relies on the partitioning of Variational Autoencoder's latent space.
On top of the standard continual learning evaluation benchmarks, we evaluate our method on a new knowledge consolidation scenario and show that the proposed approach outperforms state-of-the-art by up to twofold.
arXiv Detail & Related papers (2021-06-23T06:58:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.