Memory Population in Continual Learning via Outlier Elimination
- URL: http://arxiv.org/abs/2207.01145v3
- Date: Tue, 3 Oct 2023 11:10:08 GMT
- Title: Memory Population in Continual Learning via Outlier Elimination
- Authors: Julio Hurtado, Alain Raymond-Saez, Vladimir Araujo, Vincenzo Lomonaco,
Alvaro Soto, Davide Bacciu
- Abstract summary: Catastrophic forgetting, the phenomenon of forgetting previously learned tasks when learning a new one, is a major hurdle in developing continual learning algorithms.
A popular method to alleviate forgetting is to use a memory buffer, which stores a subset of previously learned task examples for use during training on new tasks.
This paper introduces Memory Outlier Elimination (MOE), a method for identifying and eliminating outliers in the memory buffer by choosing samples from label-homogeneous subpopulations.
- Score: 25.511380924335207
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Catastrophic forgetting, the phenomenon of forgetting previously learned
tasks when learning a new one, is a major hurdle in developing continual
learning algorithms. A popular method to alleviate forgetting is to use a
memory buffer, which stores a subset of previously learned task examples for
use during training on new tasks. The de facto method of filling memory is by
randomly selecting previous examples. However, this process could introduce
outliers or noisy samples that could hurt the generalization of the model. This
paper introduces Memory Outlier Elimination (MOE), a method for identifying and
eliminating outliers in the memory buffer by choosing samples from
label-homogeneous subpopulations. We show that high label homogeneity
corresponds to a feature space that is more representative of the class
distribution. In practice, MOE removes a sample if it is surrounded by samples
from different labels. We demonstrate the effectiveness of MOE on CIFAR-10,
CIFAR-100, and CORe50, outperforming previous well-known memory population
methods.
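The abstract states the criterion ("remove a sample if it is surrounded by samples from different labels") but not the implementation, so below is a minimal sketch of one way to read it: score each candidate by the label agreement of its k nearest neighbours in feature space, drop low-homogeneity samples, and fill the buffer from what remains. The distance metric, the value of k, the 0.6 threshold, and the function names are illustrative assumptions, not MOE's published procedure.
```python
import numpy as np

def label_homogeneity(features, labels, k=5):
    """For each sample, the fraction of its k nearest neighbours
    (Euclidean distance in feature space) that share its label."""
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)               # a sample is not its own neighbour
    nn_idx = np.argsort(d2, axis=1)[:, :k]     # k nearest neighbours per sample
    return (labels[nn_idx] == labels[:, None]).mean(axis=1)

def populate_memory(features, labels, buffer_size, k=5, min_homogeneity=0.6, seed=None):
    """Drop likely outliers (low label homogeneity), then fill the buffer
    by random sampling from the remaining candidates."""
    rng = np.random.default_rng(seed)
    keep = np.flatnonzero(label_homogeneity(features, labels, k) >= min_homogeneity)
    return rng.choice(keep, size=min(buffer_size, keep.size), replace=False)

# Toy usage: two well-separated classes plus one mislabeled point.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0.0, 1.0, (20, 8)), rng.normal(5.0, 1.0, (20, 8))])
labs = np.array([0] * 20 + [1] * 20)
labs[3] = 1                                    # simulated noisy label
print(populate_memory(feats, labs, buffer_size=10, seed=0))  # index 3 is filtered out
```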
Related papers
- STAMP: Outlier-Aware Test-Time Adaptation with Stable Memory Replay [76.06127233986663]
Test-time adaptation (TTA) aims to address the distribution shift between the training and test data with only unlabeled data at test time.
This paper focuses on performing both sample recognition and outlier rejection during inference when outliers are present.
We propose a new approach called STAble Memory rePlay (STAMP), which performs optimization over a stable memory bank instead of the risky mini-batch.
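A rough sketch of the "optimize over a stable memory bank instead of the risky mini-batch" idea, assuming a class-balanced bank of low-entropy test samples and entropy minimisation over that bank; the selection rule, thresholds, and names are illustrative, not STAMP's exact recipe.
```python
import torch

class StableMemory:
    """Class-balanced bank of low-entropy (confident) test samples."""
    def __init__(self, num_classes, per_class=16, max_entropy=0.4):
        self.bank = {c: [] for c in range(num_classes)}
        self.per_class = per_class
        self.max_entropy = max_entropy

    def add(self, x, logits):
        probs = logits.softmax(dim=-1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1)
        preds = probs.argmax(dim=-1)
        for xi, e, c in zip(x, entropy, preds):
            if e.item() < self.max_entropy:        # reject likely outliers
                slot = self.bank[int(c)]
                slot.append(xi.detach())
                if len(slot) > self.per_class:     # FIFO within each class
                    slot.pop(0)

    def batch(self):
        items = [xi for slot in self.bank.values() for xi in slot]
        return torch.stack(items) if items else None

def tta_step(model, optimizer, memory, x_test):
    """Adapt on the memory bank rather than on the raw test mini-batch."""
    with torch.no_grad():
        memory.add(x_test, model(x_test))
    mem = memory.batch()
    if mem is None:
        return None
    probs = model(mem).softmax(dim=-1)
    loss = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()  # entropy minimisation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```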
arXiv Detail & Related papers (2024-07-22T16:25:41Z)
- Holistic Memory Diversification for Incremental Learning in Growing Graphs [16.483780704430405]
The goal is to continually train a graph model to handle new tasks while retaining its inference ability on previous tasks.
Existing methods usually neglect the importance of memory diversity, limiting their ability to select high-quality memory from previous tasks.
We introduce a novel holistic Diversified Memory Selection and Generation framework for incremental learning in graphs.
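As a concrete, if generic, illustration of diversity-aware memory selection, the sketch below uses greedy farthest-point selection over node embeddings; it is a common heuristic, not the paper's full Diversified Memory Selection and Generation framework.
```python
import numpy as np

def diverse_memory_selection(embeddings, budget, seed=None):
    """Greedy farthest-point selection: repeatedly add the node whose embedding
    is farthest from everything already kept, so the memory covers the space."""
    rng = np.random.default_rng(seed)
    n = len(embeddings)
    chosen = [int(rng.integers(n))]                                  # random seed point
    dist = np.linalg.norm(embeddings - embeddings[chosen[0]], axis=1)
    for _ in range(min(budget, n) - 1):
        nxt = int(dist.argmax())                                     # most distant candidate
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return chosen

# e.g. keep 32 diverse nodes out of 1,000 8-d embeddings
print(diverse_memory_selection(np.random.randn(1000, 8), budget=32, seed=0)[:5])
```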
arXiv Detail & Related papers (2024-06-11T16:18:15Z)
- Lifelong Event Detection with Embedding Space Separation and Compaction [30.05158209938146]
Existing lifelong event detection methods typically maintain a memory module and replay the stored memory data during the learning of a new task.
The simple combination of memory data and new-task samples can still result in substantial forgetting of previously acquired knowledge.
We propose a novel method based on embedding space separation and compaction.
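A hedged sketch of what "separation and compaction" could look like as a loss: compact old-class memory embeddings around their class prototypes and keep new-task embeddings at least a margin away from those prototypes. The exact objective in the paper may differ; the margin and weighting here are assumptions.
```python
import torch
import torch.nn.functional as F

def separation_compaction_loss(new_emb, mem_emb, mem_labels, margin=1.0):
    """Compact old-class memory embeddings around their class prototypes and
    keep new-task embeddings at least `margin` away from those prototypes."""
    prototypes, compaction = [], 0.0
    for c in mem_labels.unique():
        cls = mem_emb[mem_labels == c]
        proto = cls.mean(dim=0)
        prototypes.append(proto)
        compaction = compaction + ((cls - proto) ** 2).sum(dim=1).mean()
    prototypes = torch.stack(prototypes)               # (num_old_classes, dim)
    dists = torch.cdist(new_emb, prototypes)            # new samples vs old prototypes
    separation = F.relu(margin - dists).mean()           # hinge: stay outside the margin
    return compaction / len(prototypes) + separation

# toy shapes: 8 new-task embeddings, 32 memory embeddings from 4 old classes
loss = separation_compaction_loss(torch.randn(8, 16), torch.randn(32, 16),
                                  torch.randint(0, 4, (32,)))
print(loss.item())
```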
arXiv Detail & Related papers (2024-04-03T06:51:49Z)
- Slightly Shift New Classes to Remember Old Classes for Video Class-Incremental Learning [14.199974986278438]
We propose SNRO, which slightly shifts the features of new classes to remember old classes.
ES builds memory sets by decimating frames at a lower sample rate and aligns those sparse frames during later training.
EB terminates the training at a small epoch, preventing the model from overstretching into the high-semantic space of the current task.
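A minimal sketch of the frame-decimation and early-termination ideas mentioned above; the sampling rate, clip shape, and stopping epoch are illustrative, not SNRO's settings.
```python
import numpy as np

def sparse_memory_clip(frames, keep_every=4):
    """Store only every `keep_every`-th frame of a clip in the memory set."""
    return frames[::keep_every]

def train_with_early_stop(step_fn, max_epochs=50, stop_epoch=5):
    """Terminate training at a small epoch instead of running to `max_epochs`."""
    for epoch in range(min(stop_epoch, max_epochs)):
        step_fn(epoch)

# a 32-frame clip shrinks to 8 stored frames
clip = np.zeros((32, 3, 112, 112), dtype=np.uint8)
print(sparse_memory_clip(clip).shape)   # (8, 3, 112, 112)
```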
arXiv Detail & Related papers (2024-04-01T03:58:51Z)
- What do larger image classifiers memorise? [64.01325988398838]
We show that training examples exhibit an unexpectedly diverse set of memorisation trajectories across model sizes.
We find that knowledge distillation, an effective and popular model compression technique, tends to inhibit memorisation, while also improving generalisation.
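For reference, the knowledge distillation objective mentioned above is typically the temperature-scaled KL divergence between teacher and student outputs; the sketch below shows only that standard form and is not tied to the paper's experiments.
```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Temperature-scaled KL divergence between softened teacher and student outputs."""
    t = temperature
    log_student = F.log_softmax(student_logits / t, dim=-1)
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * (t * t)

print(distillation_loss(torch.randn(4, 10), torch.randn(4, 10)).item())
```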
arXiv Detail & Related papers (2023-10-09T01:52:07Z)
- Learning Large Scale Sparse Models [6.428186644949941]
We consider learning sparse models in large scale settings, where the number of samples and the feature dimension can grow as large as millions or billions.
We propose to learn sparse models such as Lasso in an online manner where, in each iteration, only one randomly chosen sample is revealed to update a sparse gradient.
Thereby, the memory cost is independent of the sample size and gradient evaluation for one sample is efficient.
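One standard way to realise this one-sample-per-iteration scheme is proximal SGD with soft-thresholding, sketched below; it illustrates the constant-memory property but is not necessarily the algorithm proposed in the paper.
```python
import numpy as np

def soft_threshold(w, tau):
    """Proximal operator of the L1 norm."""
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

def online_lasso(sample_stream, dim, lam=0.1, lr=0.01):
    """Proximal SGD for the Lasso: each update touches a single (x, y) pair,
    so memory use is independent of the number of samples."""
    w = np.zeros(dim)
    for x, y in sample_stream:
        grad = (x @ w - y) * x                 # gradient of 0.5 * (x.w - y)^2
        w = soft_threshold(w - lr * grad, lr * lam)
    return w

# synthetic stream with a 5-sparse ground truth
rng = np.random.default_rng(0)
w_true = np.zeros(100)
w_true[:5] = 1.0
stream = ((x, x @ w_true + 0.01 * rng.standard_normal())
          for x in rng.standard_normal((5000, 100)))
w_hat = online_lasso(stream, dim=100)
print(np.count_nonzero(np.abs(w_hat) > 0.1))   # 5: only the informative coordinates survive
```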
arXiv Detail & Related papers (2023-01-26T06:29:49Z)
- A Memory Transformer Network for Incremental Learning [64.0410375349852]
We study class-incremental learning, a training setup in which new classes of data are observed over time for the model to learn from.
Despite the straightforward problem formulation, the naive application of classification models to class-incremental learning results in the "catastrophic forgetting" of previously seen classes.
One of the most successful existing methods has been the use of a memory of exemplars, which overcomes the issue of catastrophic forgetting by saving a subset of past data into a memory bank and utilizing it to prevent forgetting when training future tasks.
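The basic exemplar-replay mechanism this entry refers to can be summarised in a single training step that mixes the current batch with samples drawn from the memory bank, as in the sketch below; the loss weighting is an illustrative choice, and the paper's transformer-based component is not reproduced.
```python
import torch
from torch import nn

def replay_step(model, criterion, optimizer, new_batch, memory_batch, replay_weight=1.0):
    """One training step that mixes the current task's batch with exemplars
    drawn from the memory bank."""
    x_new, y_new = new_batch
    x_mem, y_mem = memory_batch
    loss = criterion(model(x_new), y_new) + replay_weight * criterion(model(x_mem), y_mem)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# toy usage with a linear classifier over 16-d features and 5 classes
model = nn.Linear(16, 5)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
new_b = (torch.randn(8, 16), torch.randint(0, 5, (8,)))
mem_b = (torch.randn(8, 16), torch.randint(0, 5, (8,)))
print(replay_step(model, nn.CrossEntropyLoss(), opt, new_b, mem_b))
```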
arXiv Detail & Related papers (2022-10-10T08:27:28Z)
- A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning [56.450090618578]
Class-Incremental Learning (CIL) aims to train a model with limited memory size that can adapt to new classes without forgetting old ones.
We show that when counting the model size into the total budget and comparing methods with aligned memory size, saving models does not consistently work.
We propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel.
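The "aligned memory size" comparison boils down to simple budget arithmetic: once model parameters are charged against the same budget as exemplars, fewer exemplars fit. The sketch below assumes float32 weights and uint8 CIFAR-sized images purely for illustration.
```python
def aligned_exemplar_budget(total_mb, model_params, bytes_per_param=4,
                            exemplar_shape=(3, 32, 32), bytes_per_value=1):
    """How many exemplars fit once the model's own parameters are charged
    against the same budget as the memory buffer."""
    model_bytes = model_params * bytes_per_param
    exemplar_bytes = bytes_per_value
    for d in exemplar_shape:
        exemplar_bytes *= d
    remaining = total_mb * 1024 * 1024 - model_bytes
    return max(remaining // exemplar_bytes, 0)

# an ~11M-parameter ResNet under a 50 MB budget leaves room for ~2,700 CIFAR-sized exemplars
print(aligned_exemplar_budget(total_mb=50, model_params=11_000_000))
```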
arXiv Detail & Related papers (2022-05-26T08:24:01Z)
- Breadcrumbs: Adversarial Class-Balanced Sampling for Long-tailed Recognition [95.93760490301395]
The problem of long-tailed recognition, where the number of examples per class is highly unbalanced, is considered.
It is hypothesized that the resulting overfitting is due to the repeated sampling of examples and can be addressed by feature space augmentation.
A new feature augmentation strategy, EMANATE, based on back-tracking of features across epochs during training, is proposed.
A new sampling procedure, Breadcrumb, is then introduced to implement adversarial class-balanced sampling without extra computation.
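A loose sketch of the two ingredients named above, back-tracking features across epochs and class-balanced sampling from them; the history length and sampling rule are assumptions, and the EMANATE/Breadcrumb details are not reproduced.
```python
import numpy as np
from collections import defaultdict

class FeatureBacktrack:
    """Keep the features seen for each class over the last few epochs and draw
    class-balanced feature batches from that history."""
    def __init__(self, history_epochs=3):
        self.history = defaultdict(list)          # class -> list of per-epoch feature arrays
        self.history_epochs = history_epochs

    def store_epoch(self, features, labels):
        for c in np.unique(labels):
            chunks = self.history[int(c)]
            chunks.append(features[labels == c])
            self.history[int(c)] = chunks[-self.history_epochs:]

    def balanced_batch(self, per_class, seed=None):
        rng = np.random.default_rng(seed)
        xs, ys = [], []
        for c, chunks in self.history.items():
            pool = np.concatenate(chunks, axis=0)
            idx = rng.choice(len(pool), size=per_class, replace=len(pool) < per_class)
            xs.append(pool[idx])
            ys.append(np.full(per_class, c))
        return np.concatenate(xs), np.concatenate(ys)

# long-tailed toy data: class 0 has 100 features, class 1 only 5
bank = FeatureBacktrack()
bank.store_epoch(np.random.randn(105, 64), np.array([0] * 100 + [1] * 5))
x, y = bank.balanced_batch(per_class=20, seed=0)
print(x.shape, np.bincount(y))   # (40, 64) [20 20]
```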
arXiv Detail & Related papers (2021-05-01T00:21:26Z)
- Improving memory banks for unsupervised learning with large mini-batch, consistency and hard negative mining [61.223064077782645]
We introduce 3 improvements to the vanilla memory bank-based formulation which bring massive accuracy gains.
We enforce the logits obtained by different augmentations of the same sample to be close without trying to enforce discrimination with respect to negative samples.
Since instance discrimination is not meaningful for samples that are too visually similar, we devise a novel nearest neighbour approach for improving the memory bank.
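The sketch below illustrates two of the listed ingredients, a momentum-updated per-sample memory bank and a consistency loss that pulls two augmentations of the same image together without explicit negatives; hyperparameters are illustrative and the nearest-neighbour refinement is omitted.
```python
import torch
import torch.nn.functional as F

class MemoryBank:
    """Per-sample embedding bank with momentum updates."""
    def __init__(self, num_samples, dim, momentum=0.5):
        self.bank = F.normalize(torch.randn(num_samples, dim), dim=1)
        self.momentum = momentum

    @torch.no_grad()
    def update(self, indices, embeddings):
        mixed = self.momentum * self.bank[indices] + (1 - self.momentum) * embeddings
        self.bank[indices] = F.normalize(mixed, dim=1)

def consistency_loss(z_view1, z_view2):
    """Pull two augmented views of the same images together (cosine similarity -> 1),
    with no negative term."""
    z1 = F.normalize(z_view1, dim=1)
    z2 = F.normalize(z_view2, dim=1)
    return (1.0 - (z1 * z2).sum(dim=1)).mean()

# toy usage: 8 samples with 32-d embeddings
bank = MemoryBank(num_samples=100, dim=32)
z_a, z_b = torch.randn(8, 32), torch.randn(8, 32)
bank.update(torch.arange(8), z_a)
print(consistency_loss(z_a, z_b).item())
```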
arXiv Detail & Related papers (2021-02-08T18:56:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.