A Memory Transformer Network for Incremental Learning
- URL: http://arxiv.org/abs/2210.04485v1
- Date: Mon, 10 Oct 2022 08:27:28 GMT
- Title: A Memory Transformer Network for Incremental Learning
- Authors: Ahmet Iscen, Thomas Bird, Mathilde Caron, Alireza Fathi, Cordelia Schmid
- Abstract summary: We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
- Score: 64.0410375349852
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study class-incremental learning, a training setup in which new classes of
data are observed over time for the model to learn from. Despite the
straightforward problem formulation, the naive application of classification
models to class-incremental learning results in the "catastrophic forgetting"
of previously seen classes. One of the most successful existing methods has
been the use of a memory of exemplars, which overcomes the issue of
catastrophic forgetting by saving a subset of past data into a memory bank and
utilizing it to prevent forgetting when training future tasks. In our paper, we
propose to enhance the utilization of this memory bank: we not only use it as a
source of additional training data like existing works but also integrate it in
the prediction process explicitly. Our method, the Memory Transformer Network
(MTN), learns how to combine and aggregate the information from the nearest
neighbors in the memory with a transformer to make more accurate predictions.
We conduct extensive experiments and ablations to evaluate our approach. We
show that MTN achieves state-of-the-art performance on the challenging
ImageNet-1k and Google-Landmarks-1k incremental learning benchmarks.
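The idea of using the memory bank at prediction time can be illustrated with a small sketch. This is not the authors' MTN: the abstract describes a learned transformer, whereas the code below uses a single fixed, unlearned attention step over the retrieved neighbors. The function names, the cosine similarity measure, the `temperature` parameter, and the toy two-dimensional features are all assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(query, memory, k):
    """Return the k (feature, label) exemplars most similar to the query."""
    ranked = sorted(memory, key=lambda item: cosine(query, item[0]), reverse=True)
    return ranked[:k]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_and_vote(query, neighbors, num_classes, temperature=0.1):
    """One fixed attention step (assumption: stands in for a learned transformer).

    The query attends over the neighbor features, and the attention weights
    are accumulated per neighbor label to form class probabilities.
    """
    scores = [cosine(query, feat) / temperature for feat, _ in neighbors]
    weights = softmax(scores)
    class_probs = [0.0] * num_classes
    for weight, (_, label) in zip(weights, neighbors):
        class_probs[label] += weight
    return class_probs


# Toy memory bank of exemplars: (feature vector, class label). Hypothetical data.
memory = [
    ([1.0, 0.0], 0),
    ([0.9, 0.1], 0),
    ([0.0, 1.0], 1),
    ([0.1, 0.9], 1),
]

query = [0.8, 0.2]
neighbors = nearest_neighbors(query, memory, k=3)
probs = attend_and_vote(query, neighbors, num_classes=2)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

In this sketch the memory serves only as a retrieval source at prediction time. In an exemplar-based incremental learner, the same stored samples would also be replayed as training data when new classes arrive.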
Related papers
- Reducing catastrophic forgetting of incremental learning in the absence of rehearsal memory with task-specific token [0.6144680854063939]
Deep learning models display catastrophic forgetting when learning new data continuously.
We present a novel method that preserves previous knowledge without storing previous data.
This method is inspired by the architecture of a vision transformer and employs a unique token capable of encapsulating the compressed knowledge of each task.
arXiv Detail & Related papers (2024-11-06T16:13:50Z) - Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach [87.8330887605381]
We show how to adapt a pre-trained Vision Transformer to downstream recognition tasks with only a few learnable parameters.
We synthesize a task-specific query with a learnable and lightweight module, which is independent of the pre-trained model.
Our method achieves state-of-the-art performance under memory constraints, showcasing its applicability in real-world situations.
arXiv Detail & Related papers (2024-07-09T15:45:04Z) - Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and Imagenet-R datasets.
arXiv Detail & Related papers (2024-05-23T08:43:09Z) - Adaptive Memory Replay for Continual Learning [29.333341368722653]
Updating Foundation Models as new data becomes available can lead to catastrophic forgetting.
We introduce a framework of adaptive memory replay for continual learning, where sampling of past data is phrased as a multi-armed bandit problem.
We demonstrate the effectiveness of our approach, which maintains high performance while reducing forgetting by up to 10% at no training efficiency cost.
arXiv Detail & Related papers (2024-04-18T22:01:56Z) - Retrieval-Enhanced Contrastive Vision-Text Models [61.783728119255365]
We propose to equip vision-text models with the ability to refine their embedding with cross-modal retrieved information from a memory at inference time.
Remarkably, we show that this can be done with a light-weight, single-layer, fusion transformer on top of a frozen CLIP.
Our experiments validate that our retrieval-enhanced contrastive (RECO) training improves CLIP performance substantially on several challenging fine-grained tasks.
arXiv Detail & Related papers (2023-06-12T15:52:02Z) - Continual Learning with Strong Experience Replay [32.154995019080594]
We propose a CL method with Strong Experience Replay (SER).
SER utilizes future experiences mimicked on the current training data, besides distilling past experience from the memory buffer.
Experimental results on multiple image classification datasets show that our SER method surpasses the state-of-the-art methods by a noticeable margin.
arXiv Detail & Related papers (2023-05-23T02:42:54Z) - Adaptive Cross Batch Normalization for Metric Learning [75.91093210956116]
Metric learning is a fundamental problem in computer vision.
We show that it is equally important to ensure that the accumulated embeddings are up to date.
In particular, it is necessary to circumvent the representational drift between the accumulated embeddings and the feature embeddings at the current training iteration.
arXiv Detail & Related papers (2023-03-30T03:22:52Z) - BERT WEAVER: Using WEight AVERaging to enable lifelong learning for transformer-based models in biomedical semantic search engines [49.75878234192369]
We present WEAVER, a simple, yet efficient post-processing method that infuses old knowledge into the new model.
We show that applying WEAVER in a sequential manner results in similar word embedding distributions as doing a combined training on all data at once.
arXiv Detail & Related papers (2022-02-21T10:34:41Z) - Representation Memorization for Fast Learning New Knowledge without Forgetting [36.55736909586313]
The ability to quickly learn new knowledge is a big step towards human-level intelligence.
We consider scenarios that require learning new classes or data distributions quickly and incrementally over time.
We propose "Memory-based Hebbian Adaptation" to tackle the two major challenges.
arXiv Detail & Related papers (2021-08-28T07:54:53Z) - ZS-IL: Looking Back on Learned Experiences For Zero-Shot Incremental Learning [9.530976792843495]
We propose an on-call transfer set to provide past experiences whenever a new class arises in the data stream.
ZS-IL demonstrates significantly better performance on the well-known datasets (CIFAR-10, Tiny-ImageNet) in both Task-IL and Class-IL settings.
arXiv Detail & Related papers (2021-03-22T22:43:20Z)