Few-Shot Lifelong Learning
- URL: http://arxiv.org/abs/2103.00991v1
- Date: Mon, 1 Mar 2021 13:26:57 GMT
- Title: Few-Shot Lifelong Learning
- Authors: Pratik Mazumder, Pravendra Singh, Piyush Rai
- Abstract summary: Few-Shot Lifelong Learning enables deep learning models to perform lifelong/continual learning on few-shot data.
Our method selects very few parameters from the model for training every new set of classes instead of training the full model.
We experimentally show that our method significantly outperforms existing methods on the miniImageNet, CIFAR-100, and CUB-200 datasets.
- Score: 35.05196800623617
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many real-world classification problems often have classes with very few
labeled training samples. Moreover, all possible classes may not be initially
available for training, and may be given incrementally. Deep learning models
need to deal with this two-fold problem in order to perform well in real-life
situations. In this paper, we propose a novel Few-Shot Lifelong Learning (FSLL)
method that enables deep learning models to perform lifelong/continual learning
on few-shot data. Our method selects very few parameters from the model for
training every new set of classes instead of training the full model. This
helps in preventing overfitting. We choose the few parameters from the model in
such a way that only the currently unimportant parameters get selected. By
keeping the important parameters in the model intact, our approach minimizes
catastrophic forgetting. Furthermore, we minimize the cosine similarity between
the new and the old class prototypes in order to maximize their separation,
thereby improving the classification performance. We also show that integrating
our method with self-supervision improves the model performance significantly.
We experimentally show that our method significantly outperforms existing
methods on the miniImageNet, CIFAR-100, and CUB-200 datasets. Specifically, we
outperform the state-of-the-art method by an absolute margin of 19.27% for the
CUB dataset.
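As a rough illustration of the two ingredients described in the abstract, the sketch below (not the authors' code; all names are hypothetical) trains only a small fraction of weights per incremental session and penalizes the cosine similarity between new and old class prototypes. The smallest-magnitude criterion used here is an assumption standing in for the paper's notion of "currently unimportant" parameters, which may be defined differently.

```python
# Minimal sketch of the two ideas above (assumptions noted in comments).
import torch
import torch.nn.functional as F


def unimportant_param_mask(param: torch.Tensor, fraction: float = 0.1) -> torch.Tensor:
    """Boolean mask marking the `fraction` of entries with the smallest magnitude
    (an assumed proxy for 'currently unimportant' parameters)."""
    k = max(1, int(fraction * param.numel()))
    threshold = param.abs().flatten().kthvalue(k).values
    return param.abs() <= threshold


def mask_gradients(model: torch.nn.Module, masks: dict) -> None:
    """Zero the gradients of parameters deemed important, so only the selected
    few unimportant parameters are updated for the new session."""
    for name, p in model.named_parameters():
        if p.grad is not None and name in masks:
            p.grad.mul_(masks[name].to(p.grad.dtype))


def prototype_separation_loss(new_protos: torch.Tensor, old_protos: torch.Tensor) -> torch.Tensor:
    """Mean cosine similarity between every new and every old class prototype;
    minimizing it pushes the new prototypes away from the old ones."""
    new_n = F.normalize(new_protos, dim=1)
    old_n = F.normalize(old_protos, dim=1)
    return (new_n @ old_n.t()).mean()
```

Under these assumptions, one would compute `masks = {n: unimportant_param_mask(p) for n, p in model.named_parameters()}` once per session, call `mask_gradients(model, masks)` after `loss.backward()` and before the optimizer step, and add a weighted `prototype_separation_loss` term to the few-shot classification loss.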
Related papers
- Truncated Consistency Models [57.50243901368328]
Training consistency models requires learning to map all intermediate points along probability flow (PF) ODE trajectories to their corresponding endpoints.
We empirically find that this training paradigm limits the one-step generation performance of consistency models.
We propose a new parameterization of the consistency function and a two-stage training procedure that prevents the truncated-time training from collapsing to a trivial solution.
arXiv Detail & Related papers (2024-10-18T22:38:08Z)
- Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion [10.322832012497722]
Class-incremental learning is a challenging problem, where the goal is to train a model that can classify data from an increasing number of classes over time.
Vision-language pre-trained models such as CLIP demonstrate good generalization ability.
However, further adaptation to downstream tasks by simply fine-tuning the model leads to severe forgetting.
Most existing works with pre-trained models assume that the forgetting of old classes is uniform when the model acquires new knowledge.
arXiv Detail & Related papers (2024-07-19T09:20:33Z)
- Calibrating Higher-Order Statistics for Few-Shot Class-Incremental Learning with Pre-trained Vision Transformers [12.590571371294729]
Few-shot class-incremental learning (FSCIL) aims to adapt the model to new classes from very few data (e.g., 5 samples per class) without forgetting the previously learned classes.
Recent works in many-shot CIL (MSCIL) have exploited pre-trained models to reduce forgetting and achieve better plasticity.
We use ViT models pre-trained on large-scale datasets in few-shot settings, which face the critical issue of low plasticity.
arXiv Detail & Related papers (2024-04-09T21:12:31Z)
- FD-Align: Feature Discrimination Alignment for Fine-tuning Pre-Trained Models in Few-Shot Learning [21.693779973263172]
In this paper, we introduce a fine-tuning approach termed Feature Discrimination Alignment (FD-Align).
Our method aims to bolster the model's generalizability by preserving the consistency of spurious features.
Once fine-tuned, the model can seamlessly integrate with existing methods, leading to performance improvements.
arXiv Detail & Related papers (2023-10-23T17:12:01Z)
- RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
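The summary above gives only a one-line description, so the following is a generic, hedged sketch of the ingredients named in the title: frozen pre-trained features passed through a fixed random projection and classified with class-mean prototypes. The class name and all details are hypothetical and not necessarily the paper's exact procedure.

```python
# Generic sketch (not necessarily the paper's method): frozen pre-trained features,
# a fixed random projection with a nonlinearity, and class-mean prototypes that can
# be accumulated task by task without revisiting old data.
import torch
import torch.nn.functional as F


class RandomProjectionPrototypes:
    def __init__(self, feat_dim: int, proj_dim: int, num_classes: int, seed: int = 0):
        g = torch.Generator().manual_seed(seed)
        self.W = torch.randn(feat_dim, proj_dim, generator=g)  # frozen random projection
        self.sums = torch.zeros(num_classes, proj_dim)
        self.counts = torch.zeros(num_classes)

    def _embed(self, feats: torch.Tensor) -> torch.Tensor:
        return torch.relu(feats @ self.W)

    def update(self, feats: torch.Tensor, labels: torch.Tensor) -> None:
        """Accumulate per-class sums/counts from frozen backbone features."""
        z = self._embed(feats)
        self.sums.index_add_(0, labels, z)
        self.counts.index_add_(0, labels, torch.ones_like(labels, dtype=torch.float))

    def predict(self, feats: torch.Tensor) -> torch.Tensor:
        """Nearest class-mean prediction in the projected space."""
        protos = self.sums / self.counts.clamp(min=1).unsqueeze(1)
        sims = F.normalize(self._embed(feats), dim=1) @ F.normalize(protos, dim=1).t()
        return sims.argmax(dim=1)
```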
arXiv Detail & Related papers (2023-07-05T12:49:02Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
However, the training data behind individual fine-tuned models is often unavailable, which creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
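As a minimal illustration of merging models in parameter space, the sketch below averages the weights of fine-tuned checkpoints that share one architecture. The plain (optionally weighted) average is an assumption for illustration; the paper's actual merging rule is more sophisticated.

```python
# Minimal illustration of parameter-space model merging (assumption: a plain
# weighted average of checkpoints with identical architectures).
from typing import Dict, List, Optional
import torch


def merge_state_dicts(state_dicts: List[Dict[str, torch.Tensor]],
                      weights: Optional[List[float]] = None) -> Dict[str, torch.Tensor]:
    """Return a weighted average of state dicts that share identical keys and shapes."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Usage (hypothetical checkpoints): load the result into a fresh copy of the shared
# architecture, e.g. model.load_state_dict(merge_state_dicts([sd_a, sd_b])).
```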
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- Class-Incremental Learning with Strong Pre-trained Models [97.84755144148535]
Class-incremental learning (CIL) has been widely studied under the setting of starting from a small number of classes (base classes).
We explore an understudied real-world setting of CIL that starts with a strong model pre-trained on a large number of base classes.
Our proposed method is robust and generalizes to all analyzed CIL settings.
arXiv Detail & Related papers (2022-04-07T17:58:07Z)
- Overcoming Catastrophic Forgetting in Incremental Few-Shot Learning by Finding Flat Minima [23.97486216731355]
This paper considers incremental few-shot learning, which requires a model to continually recognize new categories with only a few examples.
Our study shows that existing methods severely suffer from catastrophic forgetting, a well-known problem in incremental learning.
We propose to search for flat local minima of the base training objective function and then fine-tune the model parameters within the flat region on new tasks.
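A hedged sketch of the two steps this summary describes, under the assumptions that flatness is encouraged by averaging the loss over small random weight perturbations and that new-task fine-tuning is kept within a fixed radius of the base solution; the paper's exact flatness objective and region may differ, and `loss_fn` is a hypothetical callable.

```python
# Sketch of (1) flatness-encouraging base training and (2) constrained fine-tuning.
import torch


def flat_base_grad_step(model, loss_fn, batch, noise_std=0.01, n_samples=2):
    """Accumulate gradients of the loss averaged over a few random weight
    perturbations -- a simple surrogate for preferring flat minima -- then
    restore the unperturbed weights so the optimizer updates those."""
    original = [p.detach().clone() for p in model.parameters()]
    for _ in range(n_samples):
        with torch.no_grad():
            for p, p0 in zip(model.parameters(), original):
                p.copy_(p0 + noise_std * torch.randn_like(p0))
        (loss_fn(model, batch) / n_samples).backward()  # grads accumulate in p.grad
    with torch.no_grad():
        for p, p0 in zip(model.parameters(), original):
            p.copy_(p0)


def clamp_to_flat_region(model, base_params, radius=0.05):
    """After each new-task update, keep every weight within `radius` of the
    base solution, i.e. fine-tune only inside the (assumed) flat region."""
    with torch.no_grad():
        for p, p0 in zip(model.parameters(), base_params):
            p.copy_(torch.clamp(p, p0 - radius, p0 + radius))
```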
arXiv Detail & Related papers (2021-10-30T14:00:40Z)
- Machine Unlearning of Features and Labels [72.81914952849334]
We propose a first approach for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
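To illustrate the idea of closed-form parameter updates for unlearning, the sketch below removes a single training point from a ridge-regression model exactly via a Sherman-Morrison downdate. The simple model is an assumption chosen for clarity; the paper's influence-function framework targets features and labels in more general models.

```python
# Closed-form unlearning illustrated on ridge regression (assumption: linear model).
import numpy as np


def ridge_fit(X, y, lam=1.0):
    """Fit ridge regression; also return the sufficient statistics
    (inverse regularized Gram matrix and X^T y) needed for later unlearning."""
    d = X.shape[1]
    A_inv = np.linalg.inv(X.T @ X + lam * np.eye(d))
    b = X.T @ y
    return A_inv @ b, A_inv, b


def unlearn_point(A_inv, b, x, y_val):
    """Exactly remove one training point (x, y_val) via a Sherman-Morrison downdate,
    without retraining on the remaining data."""
    x = x.reshape(-1, 1)
    A_inv = A_inv + (A_inv @ x @ x.T @ A_inv) / (1.0 - float(x.T @ A_inv @ x))
    b = b - x.flatten() * y_val
    return A_inv @ b, A_inv, b

# Usage: theta, A_inv, b = ridge_fit(X, y)
#        theta, A_inv, b = unlearn_point(A_inv, b, X[i], y[i])
# yields exactly the model that retraining without sample i would produce.
```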
arXiv Detail & Related papers (2021-08-26T04:42:24Z)
- Novelty-Prepared Few-Shot Classification [24.42397780877619]
We propose to use a novelty-prepared loss function, called self-compacting softmax loss (SSL), for few-shot classification.
In experiments on CUB-200-2011 and mini-ImageNet datasets, we show that SSL leads to significant improvement of the state-of-the-art performance.
arXiv Detail & Related papers (2020-03-01T14:44:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.