Mnemonics Training: Multi-Class Incremental Learning without Forgetting
- URL: http://arxiv.org/abs/2002.10211v6
- Date: Sun, 4 Apr 2021 12:24:40 GMT
- Title: Mnemonics Training: Multi-Class Incremental Learning without Forgetting
- Authors: Yaoyao Liu, Yuting Su, An-An Liu, Bernt Schiele, Qianru Sun
- Abstract summary: Multi-Class Incremental Learning (MCIL) aims to learn new concepts by incrementally updating a model trained on previous concepts.
This paper proposes a novel and automatic framework we call mnemonics, where we parameterize exemplars and make them optimizable in an end-to-end manner.
We conduct extensive experiments on three MCIL benchmarks, CIFAR-100, ImageNet-Subset and ImageNet, and show that using mnemonics exemplars can surpass the state-of-the-art by a large margin.
- Score: 131.1065577648532
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multi-Class Incremental Learning (MCIL) aims to learn new concepts by
incrementally updating a model trained on previous concepts. However, there is
an inherent trade-off to effectively learning new concepts without catastrophic
forgetting of previous ones. To alleviate this issue, it has been proposed to
keep around a few examples of the previous concepts but the effectiveness of
this approach heavily depends on the representativeness of these examples. This
paper proposes a novel and automatic framework we call mnemonics, where we
parameterize exemplars and make them optimizable in an end-to-end manner. We
train the framework through bilevel optimizations, i.e., model-level and
exemplar-level. We conduct extensive experiments on three MCIL benchmarks,
CIFAR-100, ImageNet-Subset and ImageNet, and show that using mnemonics
exemplars can surpass the state-of-the-art by a large margin. Interestingly and
quite intriguingly, the mnemonics exemplars tend to be on the boundaries
between different classes.
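As a concrete illustration of the model-level/exemplar-level bilevel scheme, the sketch below (PyTorch assumed; a plain linear classifier and random tensors stand in for the actual backbone and data, and the model-level training phase is omitted) holds the exemplars as learnable tensors, takes one differentiable inner update of the model on them, and then updates the exemplars so that the adapted model does well on held-out current-task data.

```python
# Minimal sketch of exemplar-level bilevel optimization (not the authors' code).
import torch
import torch.nn.functional as F

feat_dim, n_classes, n_exemplars = 64, 10, 20

# Exemplars are parameterized, hence optimizable end-to-end.
exemplar_x = torch.randn(n_exemplars, feat_dim, requires_grad=True)
exemplar_y = torch.randint(0, n_classes, (n_exemplars,))

# A linear classifier stands in for the incremental model.
W = torch.randn(n_classes, feat_dim, requires_grad=True)

# Held-out data from the current task drives the exemplar-level objective.
val_x, val_y = torch.randn(128, feat_dim), torch.randint(0, n_classes, (128,))

opt_exemplar = torch.optim.Adam([exemplar_x], lr=1e-2)
inner_lr = 0.1

for step in range(100):
    # Inner step: adapt the classifier on the exemplars, keeping the graph
    # so gradients can flow back into the exemplars themselves.
    inner_loss = F.cross_entropy(exemplar_x @ W.t(), exemplar_y)
    (grad_W,) = torch.autograd.grad(inner_loss, W, create_graph=True)
    W_adapted = W - inner_lr * grad_W

    # Outer (exemplar-level) step: good exemplars are those for which the
    # adapted model performs well on held-out data.
    outer_loss = F.cross_entropy(val_x @ W_adapted.t(), val_y)
    opt_exemplar.zero_grad()
    outer_loss.backward()
    opt_exemplar.step()
    W.grad = None  # the base model itself is not updated in this simplified sketch
```
In the full framework described in the abstract, this exemplar-level phase would alternate with a model-level training phase on new-class data plus the stored exemplars.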
Related papers
- FILM: How can Few-Shot Image Classification Benefit from Pre-Trained Language Models? [14.582209994281374]
Few-shot learning aims to train models that can generalize to novel classes with only a few samples.
We propose a novel few-shot learning framework that uses pre-trained language models based on contrastive learning.
arXiv Detail & Related papers (2023-07-09T08:07:43Z)
- Retrieval-Enhanced Contrastive Vision-Text Models [61.783728119255365]
We propose to equip vision-text models with the ability to refine their embedding with cross-modal retrieved information from a memory at inference time.
Remarkably, we show that this can be done with a light-weight, single-layer, fusion transformer on top of a frozen CLIP.
Our experiments validate that our retrieval-enhanced contrastive (RECO) training improves CLIP performance substantially on several challenging fine-grained tasks.
arXiv Detail & Related papers (2023-06-12T15:52:02Z)
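As a rough illustration of the retrieval-then-fuse idea summarized above (a sketch under assumed shapes and a generic attention layer, not the RECO architecture), a query embedding from a frozen encoder can be refined by cross-attending over its top-k neighbors retrieved from an external memory:

```python
# Illustrative retrieval-enhanced embedding refinement (not the RECO code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RetrievalFusion(nn.Module):
    def __init__(self, dim=512, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_emb, memory_emb, k=16):
        # query_emb: (B, D) embeddings from a frozen encoder (e.g. CLIP image side)
        # memory_emb: (N, D) cross-modal embeddings held in the external memory
        sims = F.normalize(query_emb, dim=-1) @ F.normalize(memory_emb, dim=-1).t()
        topk = sims.topk(k, dim=-1).indices            # (B, k) nearest memory entries
        neighbors = memory_emb[topk]                   # (B, k, D)
        q = query_emb.unsqueeze(1)                     # (B, 1, D)
        fused, _ = self.attn(q, neighbors, neighbors)  # single-layer cross-attention
        return self.norm(query_emb + fused.squeeze(1)) # residual refinement

fusion = RetrievalFusion()
refined = fusion(torch.randn(4, 512), torch.randn(10_000, 512))
```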
- A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
arXiv Detail & Related papers (2022-10-10T08:27:28Z)
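The exemplar memory this summary refers to can be sketched separately from the transformer component (a simplified, generic replay buffer, not the paper's method): keep a small class-balanced buffer of past samples and mix replayed exemplars into each new-task batch.

```python
# Generic class-balanced exemplar memory with replay (illustrative only).
import random
from collections import defaultdict

class ExemplarMemory:
    def __init__(self, per_class=20):
        self.per_class = per_class
        self.buffer = defaultdict(list)          # class id -> stored samples

    def add(self, samples, labels):
        for x, y in zip(samples, labels):
            slot = self.buffer[y]
            if len(slot) < self.per_class:
                slot.append(x)
            else:                                # simple random replacement policy
                slot[random.randrange(self.per_class)] = x

    def sample(self, n):
        pool = [(x, y) for y, xs in self.buffer.items() for x in xs]
        return random.sample(pool, min(n, len(pool)))

# During task t, each training batch mixes new data with replayed exemplars, e.g.:
#   batch = new_batch + memory.sample(len(new_batch) // 2)
```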
- Memorizing Complementation Network for Few-Shot Class-Incremental Learning [109.4206979528375]
We propose a Memorizing Complementation Network (MCNet) that ensembles multiple models so that their differently memorized knowledge complements one another on novel tasks.
We develop a Prototype Smoothing Hard-mining Triplet (PSHT) loss that pushes novel samples away not only from each other within the current task but also from the old distribution.
arXiv Detail & Related papers (2022-08-11T02:32:41Z)
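A hedged sketch of a triplet-style objective in this spirit follows; the margin, mining rule, and prototype handling are illustrative assumptions rather than the PSHT loss itself. Each novel sample is pulled toward its class prototype and pushed away from its hardest negative, whether that is another novel sample or a stored old-class prototype.

```python
# Illustrative triplet loss pushing novel samples apart and away from old prototypes.
import torch
import torch.nn.functional as F

def triplet_push_loss(embeddings, labels, old_prototypes, margin=0.5):
    # embeddings: (B, D) novel-task features; old_prototypes: (C_old, D)
    emb = F.normalize(embeddings, dim=-1)
    protos = F.normalize(old_prototypes, dim=-1)
    loss = emb.new_zeros(())

    for c in labels.unique():
        mask = labels == c
        proto_c = emb[mask].mean(0, keepdim=True)              # current-class prototype
        d_pos = 1 - (emb[mask] @ proto_c.t()).squeeze(1)       # cosine distance to it

        # Hard negatives: the closest other novel sample or old-class prototype.
        negatives = torch.cat([emb[~mask], protos], dim=0)
        d_neg = (1 - emb[mask] @ negatives.t()).min(dim=1).values
        loss = loss + F.relu(d_pos - d_neg + margin).mean()

    return loss / len(labels.unique())

loss = triplet_push_loss(torch.randn(32, 128), torch.randint(0, 4, (32,)), torch.randn(10, 128))
```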
- Contrastive Learning for Prompt-Based Few-Shot Language Learners [14.244787327283335]
We present a contrastive learning framework that clusters inputs from the same class under different augmented "views".
We create different "views" of an example by appending it with different language prompts and contextual demonstrations.
Our method can improve over the state-of-the-art methods in a diverse set of 15 language tasks.
arXiv Detail & Related papers (2022-05-03T04:56:45Z)
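The different "views" mentioned above can be illustrated with a toy sketch; the templates and the downstream objective are placeholders, not the authors' prompts. Each input is rendered under several prompt formats, and a supervised contrastive loss over the encoder's mask-token representations then pulls same-class views together.

```python
# Toy construction of prompt-based "views" of one example (illustrative templates).
templates = [
    "Review: {text} Overall it was [MASK].",
    "{text} In summary, the sentiment is [MASK].",
]

def make_views(text: str) -> list[str]:
    # One input yields several prompted variants ("views").
    return [t.format(text=text) for t in templates]

views = make_views("The movie was surprisingly good.")
# A contrastive loss (e.g. SupCon) over the encoder's [MASK] representations of
# these views, across examples of the same class, provides the training signal.
print(views)
```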
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- Train a One-Million-Way Instance Classifier for Unsupervised Visual Representation Learning [45.510042484456854]
This paper presents a simple unsupervised visual representation learning method with a pretext task of discriminating all images in a dataset using a parametric, instance-level computation.
The overall framework is a replica of a supervised classification model, where semantic classes (e.g., dog, bird, and ship) are replaced by instance IDs.
Scaling up the classification task from thousands of semantic labels to millions of instance labels brings specific challenges, including: 1) the large-scale softmax classifier; 2) slow convergence due to the infrequent visiting of instance samples; and 3) the massive number of negative classes, which can be noisy.
arXiv Detail & Related papers (2021-02-09T14:44:18Z)
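A toy version of the instance-discrimination setup described above is sketched below; dimensions are shrunk for illustration, and the scaling tricks the paper is actually about (for the million-way softmax, slow convergence, and noisy negatives) are omitted. Every dataset index simply becomes its own class for a standard cross-entropy classifier.

```python
# Instance discrimination as ordinary classification (toy scale, illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

n_instances, feat_dim = 100_000, 128      # the paper scales this to ~1M instances
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, feat_dim), nn.ReLU())
instance_head = nn.Linear(feat_dim, n_instances, bias=False)  # one weight vector per image

images = torch.randn(256, 3, 32, 32)                  # stand-in mini-batch
instance_ids = torch.randint(0, n_instances, (256,))  # dataset indices used as labels

loss = F.cross_entropy(instance_head(encoder(images)), instance_ids)
loss.backward()
```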
- Example-Driven Intent Prediction with Observers [15.615065041164629]
We focus on the intent classification problem which aims to identify user intents given utterances addressed to the dialog system.
We propose two approaches for improving the generalizability of utterance classification models: (1) observers and (2) example-driven training.
arXiv Detail & Related papers (2020-10-17T01:03:06Z)
- A Primal-Dual Subgradient Approach for Fair Meta Learning [23.65344558042896]
Few-shot meta-learning is well known for its fast adaptation and accurate generalization to unseen tasks.
We propose a Primal-Dual Fair Meta-learning framework, namely PDFM, which learns to train fair machine learning models using only a few examples.
arXiv Detail & Related papers (2020-09-26T19:47:38Z)
- Revisiting Meta-Learning as Supervised Learning [69.2067288158133]
We aim to provide a principled, unifying framework by revisiting and strengthening the connection between meta-learning and traditional supervised learning.
By treating pairs of task-specific data sets and target models as (feature, label) samples, we can reduce many meta-learning algorithms to instances of supervised learning.
This view not only unifies meta-learning into an intuitive and practical framework but also allows us to transfer insights from supervised learning directly to improve meta-learning.
arXiv Detail & Related papers (2020-02-03T06:13:01Z)
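The reduction described above can be made concrete with a small sketch; the set encoder and the vectorized "target model" used as the label are assumptions for illustration. Each task's training set plays the role of the feature, the task's target model plays the role of the label, and the meta-learner is fit by ordinary supervised regression over such pairs.

```python
# Meta-learning viewed as supervised learning over (task set, target model) pairs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaAsSupervised(nn.Module):
    def __init__(self, feat_dim=32, hidden=64, target_dim=16):
        super().__init__()
        self.set_encoder = nn.Linear(feat_dim, hidden)  # embeds one task's examples
        self.head = nn.Linear(hidden, target_dim)       # predicts the target model's parameters

    def forward(self, task_set):
        # task_set: (n_examples, feat_dim); mean pooling keeps it permutation-invariant
        return self.head(torch.relu(self.set_encoder(task_set)).mean(dim=0))

meta_learner = MetaAsSupervised()
task_set = torch.randn(10, 32)     # "feature": one task's few-shot training set
target_params = torch.randn(16)    # "label": that task's target model, flattened
loss = F.mse_loss(meta_learner(task_set), target_params)
loss.backward()
```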
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.