Kaizen: Practical Self-supervised Continual Learning with Continual Fine-tuning
- URL: http://arxiv.org/abs/2303.17235v2
- Date: Wed, 7 Feb 2024 15:45:43 GMT
- Title: Kaizen: Practical Self-supervised Continual Learning with Continual Fine-tuning
- Authors: Chi Ian Tang, Lorena Qendro, Dimitris Spathis, Fahim Kawsar, Cecilia Mascolo, Akhil Mathur
- Abstract summary: Retraining a model from scratch to adapt to newly generated data is time-consuming and inefficient.
We introduce a training architecture that is able to mitigate catastrophic forgetting.
Kaizen significantly outperforms previous SSL models in competitive vision benchmarks.
- Score: 21.36130180647864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning (SSL) has shown remarkable performance in computer
vision tasks when trained offline. However, in a Continual Learning (CL)
scenario where new data is introduced progressively, models still suffer from
catastrophic forgetting. Retraining a model from scratch to adapt to newly
generated data is time-consuming and inefficient. Previous approaches suggested
re-purposing self-supervised objectives with knowledge distillation to mitigate
forgetting across tasks, assuming that labels from all tasks are available
during fine-tuning. In this paper, we generalize self-supervised continual
learning in a practical setting where available labels can be leveraged in any
step of the SSL process. With an increasing number of continual tasks, this
offers more flexibility in the pre-training and fine-tuning phases. With
Kaizen, we introduce a training architecture that is able to mitigate
catastrophic forgetting for both the feature extractor and classifier with a
carefully designed loss function. By using a set of comprehensive evaluation metrics reflecting different aspects of continual learning, we demonstrate that Kaizen significantly outperforms previous SSL models in competitive vision benchmarks, with up to 16.5% accuracy improvement on split CIFAR-100. Kaizen is
able to balance the trade-off between knowledge retention and learning from new
data with an end-to-end model, paving the way for practical deployment of
continual learning systems.
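The abstract describes a single objective that mitigates forgetting for both the feature extractor and the classifier while leveraging whatever labels are available. As a rough illustration only (not the paper's exact formulation), the PyTorch-style sketch below combines an SSL term on the incoming data, feature-level distillation from a frozen copy of the previous encoder, cross-entropy on the labelled subset, and classifier-level distillation; the negative-cosine SSL term, the MSE and KL distillation losses, and the weights `w_ssl`, `w_feat_kd`, `w_ce`, `w_cls_kd` are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def kaizen_style_loss(x_a, x_b, y, labeled_mask,
                      encoder, projector, classifier,
                      prev_encoder, prev_classifier,
                      w_ssl=1.0, w_feat_kd=1.0, w_ce=1.0, w_cls_kd=1.0):
    """Illustrative combined objective: SSL on current data, feature distillation
    from a frozen previous encoder, supervised CE on labelled samples, and
    classifier distillation. Terms and weights are assumptions, not the paper's."""
    feats_a, feats_b = encoder(x_a), encoder(x_b)          # two augmented views
    z_a, z_b = projector(feats_a), projector(feats_b)

    # SSL term: negative cosine similarity between views (BYOL/SimSiam-style stand-in).
    loss_ssl = -F.cosine_similarity(z_a, z_b.detach(), dim=-1).mean()

    # Knowledge retention for the feature extractor: match the frozen previous encoder.
    with torch.no_grad():
        feats_old = prev_encoder(x_a)
        logits_old = prev_classifier(feats_old)
    loss_feat_kd = F.mse_loss(feats_a, feats_old)

    # Supervised term on the labelled subset of the batch (labels may be sparse).
    logits = classifier(feats_a)
    if labeled_mask.any():
        loss_ce = F.cross_entropy(logits[labeled_mask], y[labeled_mask])
    else:
        loss_ce = logits.new_zeros(())

    # Knowledge retention for the classifier: distil the old decision boundaries
    # (old and new heads assumed to share the same output width here).
    loss_cls_kd = F.kl_div(F.log_softmax(logits[:, :logits_old.size(1)], dim=-1),
                           F.softmax(logits_old, dim=-1), reduction="batchmean")

    return (w_ssl * loss_ssl + w_feat_kd * loss_feat_kd
            + w_ce * loss_ce + w_cls_kd * loss_cls_kd)
```

In such a setup, each incoming batch would update the encoder, projector, and classifier jointly, while `prev_encoder` and `prev_classifier` remain frozen snapshots from the previous task.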
Related papers
- Temporal-Difference Variational Continual Learning [89.32940051152782]
A crucial capability of Machine Learning models in real-world applications is the ability to continuously learn new tasks.
In Continual Learning settings, models often struggle to balance learning new tasks with retaining previous knowledge.
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations.
arXiv Detail & Related papers (2024-10-10T10:58:41Z)
- Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle this by tuning VLMs with knowledge distillation on extra datasets, which incurs heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining the pre-trained knowledge of VLMs.
arXiv Detail & Related papers (2024-07-07T12:19:37Z)
- DELTA: Decoupling Long-Tailed Online Continual Learning [7.507868991415516]
Long-Tailed Online Continual Learning (LTOCL) aims to learn new tasks from sequentially arriving class-imbalanced data streams.
We present DELTA, a decoupled learning approach designed to enhance representation learning.
We demonstrate that DELTA improves the capacity for incremental learning, surpassing existing OCL methods.
arXiv Detail & Related papers (2024-04-06T02:33:04Z)
- Dynamic Sub-graph Distillation for Robust Semi-supervised Continual Learning [52.046037471678005]
We focus on semi-supervised continual learning (SSCL), where the model progressively learns from partially labeled data with unknown categories.
We propose a novel approach called Dynamic Sub-Graph Distillation (DSGD) for semi-supervised continual learning.
arXiv Detail & Related papers (2023-12-27T04:40:12Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce Bayesian Adaptive Moment Regularization (BAdam), a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance among prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Continual Learning with Pretrained Backbones by Tuning in the Input Space [44.97953547553997]
The intrinsic difficulty in adapting deep learning models to non-stationary environments limits the applicability of neural networks to real-world tasks.
We propose a novel strategy to make fine-tuning more effective by keeping the pre-trained part of the network frozen and learning not only the usual classification head but also a set of newly introduced learnable parameters.
arXiv Detail & Related papers (2023-06-05T15:11:59Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and a gated sub-network from scratch in an SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- Mitigating Forgetting in Online Continual Learning via Contrasting Semantically Distinct Augmentations [22.289830907729705]
Online continual learning (OCL) aims to enable model learning from a non-stationary data stream to continuously acquire new knowledge as well as retain the learnt one.
The main challenge comes from the "catastrophic forgetting" issue: the inability to retain previously learnt knowledge while learning new knowledge.
arXiv Detail & Related papers (2022-11-10T05:29:43Z)
- Continual Learning From Unlabeled Data Via Deep Clustering [7.704949298975352]
Continual learning aims to learn new tasks incrementally using fewer computation and memory resources, instead of retraining the model from scratch whenever a new task arrives.
We introduce a new framework that makes continual learning feasible in unsupervised mode by using pseudo-labels obtained from cluster assignments to update the model (a minimal sketch of this idea follows after this list).
arXiv Detail & Related papers (2021-04-14T23:46:17Z)
- Meta-Learned Attribute Self-Gating for Continual Generalized Zero-Shot Learning [82.07273754143547]
We propose a meta-continual zero-shot learning (MCZSL) approach to generalizing a model to categories unseen during training.
By pairing self-gating of attributes and scaled class normalization with meta-learning based training, we are able to outperform state-of-the-art results.
arXiv Detail & Related papers (2021-02-23T18:36:14Z)
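For the deep-clustering entry above, here is a minimal sketch of the general pseudo-labeling idea it describes: features of the unlabeled stream are clustered, the cluster assignments are used as pseudo-labels, and the model is updated with an ordinary cross-entropy loss. The function names, the choice of k-means, and the single-step update are illustrative assumptions, not the specific framework from that paper.

```python
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

@torch.no_grad()
def cluster_pseudo_labels(encoder, unlabeled_loader, n_clusters, device="cpu"):
    """Extract features for the unlabeled data and use k-means cluster ids as pseudo-labels."""
    feats = []
    for x in unlabeled_loader:                     # loader yields unlabeled batches
        feats.append(encoder(x.to(device)).cpu())
    feats = torch.cat(feats).numpy()
    assignments = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats)
    return torch.as_tensor(assignments, dtype=torch.long)

def pseudo_label_update(encoder, classifier, x, pseudo_y, optimizer):
    """One gradient step treating cluster assignments as classification targets."""
    logits = classifier(encoder(x))
    loss = F.cross_entropy(logits, pseudo_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this kind of pipeline, pseudo-labels would typically be recomputed as new unlabeled data arrives, so the cluster assignments track the evolving feature space.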