Studying Generalization on Memory-Based Methods in Continual Learning
- URL: http://arxiv.org/abs/2306.09890v2
- Date: Tue, 20 Jun 2023 13:47:17 GMT
- Title: Studying Generalization on Memory-Based Methods in Continual Learning
- Authors: Felipe del Rio, Julio Hurtado, Cristian Buc, Alvaro Soto and Vincenzo Lomonaco
- Abstract summary: Memory-based methods store a percentage of previous data distributions to be used during training.
We show that these methods can help in traditional in-distribution generalization, but can strongly impair out-of-distribution generalization by learning spurious features and correlations.
- Score: 9.896917981912106
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the objectives of Continual Learning is to learn new concepts
continually over a stream of experiences and at the same time avoid
catastrophic forgetting. To mitigate complete knowledge overwriting,
memory-based methods store a percentage of previous data distributions to be
used during training. Although these methods produce good results, few studies
have tested their out-of-distribution generalization properties, as well as
whether these methods overfit the replay memory. In this work, we show that
although these methods can help in traditional in-distribution generalization,
they can strongly impair out-of-distribution generalization by learning
spurious features and correlations. Using a controlled environment, the Synbol
benchmark generator (Lacoste et al., 2020), we demonstrate that this lack of
out-of-distribution generalization mainly occurs in the linear classifier.
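
To make the setup concrete, the sketch below shows a minimal replay buffer of the kind these memory-based methods rely on: a reservoir-sampled memory of past examples that is mixed into each batch of the current task. The buffer capacity, the mixing ratio, and the `model.update` call are illustrative assumptions, not the paper's actual experimental configuration.

```python
import random

class ReplayBuffer:
    """Reservoir-sampled memory of past (x, y) pairs (illustrative sketch)."""
    def __init__(self, capacity=200):
        self.capacity = capacity
        self.data = []
        self.seen = 0

    def add(self, x, y):
        # Reservoir sampling keeps a uniform subsample of everything seen so far.
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append((x, y))
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = (x, y)

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))

def train_task(model, task_loader, buffer, replay_ratio=0.5):
    """Train on a new task while replaying stored examples (hypothetical API)."""
    for batch in task_loader:                      # batch: list of (x, y) pairs
        replay = buffer.sample(int(len(batch) * replay_ratio))
        model.update(batch + replay)               # assumed single-step update method
        for x, y in batch:
            buffer.add(x, y)
```

With a small, fixed-capacity memory like this, the model repeatedly revisits the same stored examples, which is precisely the regime in which the paper asks whether replay methods overfit the memory and lose out-of-distribution generalization.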
Related papers
- Bigger Isn't Always Memorizing: Early Stopping Overparameterized Diffusion Models [51.03144354630136]
Generalization in natural data domains is progressively achieved during training before the onset of memorization. Generalization vs. memorization is then best understood as a competition between time scales. We show that this phenomenology is recovered in diffusion models learning a simple probabilistic context-free grammar with random rules.
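
The practical implication, stopping training before memorization sets in, can be illustrated with a generic validation-based early-stopping loop; the `train_step` and `validate` callables, the evaluation interval, and the patience value are assumptions for illustration rather than the paper's actual criterion.

```python
def train_with_early_stopping(train_step, validate, max_steps=100_000,
                              eval_every=1_000, patience=5):
    """Stop once held-out loss stops improving, i.e. before memorization dominates."""
    best, bad_evals = float("inf"), 0
    for step in range(1, max_steps + 1):
        train_step()                      # one optimization step (assumed callable)
        if step % eval_every == 0:
            val_loss = validate()         # held-out loss (assumed callable)
            if val_loss < best:
                best, bad_evals = val_loss, 0
            else:
                bad_evals += 1
                if bad_evals >= patience:
                    return step           # early-stopping point
    return max_steps
```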
arXiv Detail & Related papers (2025-05-22T17:40:08Z)
- A Mathematics Framework of Artificial Shifted Population Risk and Its Further Understanding Related to Consistency Regularization [7.944280447232545]
This paper introduces a more comprehensive mathematical framework for data augmentation.
We establish that the expected risk of the shifted population is the sum of the original population risk and a gap term.
The paper also provides a theoretical understanding of this gap, highlighting its negative effects on the early stages of training.
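
Written out, the claimed decomposition takes roughly the following form; the notation below is my own shorthand for illustration, not necessarily the paper's.

```latex
% P: original data distribution; P_aug: artificially shifted (augmented) distribution.
% \Delta(f): the gap term associated with the shift (notation assumed here).
\mathbb{E}_{(x,y)\sim P_{\mathrm{aug}}}\big[\ell(f(x),y)\big]
  = \underbrace{\mathbb{E}_{(x,y)\sim P}\big[\ell(f(x),y)\big]}_{\text{original population risk}}
  + \underbrace{\Delta(f)}_{\text{gap term}}
```

The summary's claim about early training then amounts to this gap term being non-negligible for under-trained models encountered at the start of optimization.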
arXiv Detail & Related papers (2025-02-15T08:26:49Z)
- Contrastive Continual Learning with Importance Sampling and Prototype-Instance Relation Distillation [14.25441464051506]
We propose Contrastive Continual Learning via Importance Sampling (CCLIS) to preserve knowledge by recovering previous data distributions.
We also present the Prototype-instance Relation Distillation (PRD) loss, a technique designed to maintain the relationship between prototypes and sample representations.
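
A distillation loss of this prototype-instance flavour can be sketched as follows; the temperature, the cosine-similarity form, and the KL objective are plausible choices for illustration, not necessarily the exact CCLIS/PRD formulation.

```python
import torch
import torch.nn.functional as F

def prd_loss(old_feats, new_feats, old_protos, new_protos, temperature=0.1):
    """Distill prototype-instance relations: each sample's similarity distribution
    over prototypes under the current model is pushed toward the distribution
    recorded with the previous model (simplified sketch)."""
    old_sim = F.normalize(old_feats, dim=1) @ F.normalize(old_protos, dim=1).T
    new_sim = F.normalize(new_feats, dim=1) @ F.normalize(new_protos, dim=1).T
    target = F.softmax(old_sim / temperature, dim=1)        # frozen "teacher" relations
    log_pred = F.log_softmax(new_sim / temperature, dim=1)  # current "student" relations
    return F.kl_div(log_pred, target, reduction="batchmean")
```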
arXiv Detail & Related papers (2024-03-07T15:47:52Z)
- Few-Shot Class-Incremental Learning with Prior Knowledge [94.95569068211195]
We propose Learning with Prior Knowledge (LwPK) to enhance the generalization ability of the pre-trained model.
Experimental results indicate that LwPK effectively enhances the model resilience against catastrophic forgetting.
arXiv Detail & Related papers (2024-02-02T08:05:35Z)
- Memory Consistency Guided Divide-and-Conquer Learning for Generalized Category Discovery [56.172872410834664]
Generalized category discovery (GCD) aims at addressing a more realistic and challenging setting of semi-supervised learning.
We propose a Memory Consistency guided Divide-and-conquer Learning framework (MCDL)
Our method outperforms state-of-the-art models by a large margin on both seen and unseen classes in generic image recognition.
arXiv Detail & Related papers (2024-01-24T09:39:45Z)
- Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods proposes to replay data from previously experienced tasks when learning new ones.
However, storing such data is often impractical because of memory constraints or data privacy concerns.
As a replacement, data-free data replay methods have been proposed that synthesize samples by inverting the classification model.
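
A minimal version of such inversion-based, data-free replay can be sketched as below: random inputs are optimized until a frozen copy of the classifier assigns them the desired past-class labels. The optimizer settings and the simple input prior are illustrative assumptions; the cited method adds further consistency and bias-mitigation terms.

```python
import torch
import torch.nn.functional as F

def invert_samples(frozen_model, target_labels, shape=(3, 32, 32),
                   steps=200, lr=0.1):
    """Synthesize pseudo-exemplars for past classes by optimizing inputs so that
    the frozen classifier assigns them the desired labels (generic sketch)."""
    frozen_model.eval()
    for p in frozen_model.parameters():
        p.requires_grad_(False)                   # only the inputs are optimized
    x = torch.randn(len(target_labels), *shape, requires_grad=True)
    y = torch.as_tensor(target_labels)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(frozen_model(x), y) + 1e-4 * x.pow(2).mean()
        loss.backward()
        opt.step()
    return x.detach(), y                          # replay these alongside new-task data
```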
arXiv Detail & Related papers (2024-01-12T12:51:12Z)
- Learned reconstruction methods for inverse problems: sample error estimates [0.8702432681310401]
This dissertation addresses the generalization properties of learned reconstruction methods, specifically by performing a sample error analysis.
A rather general strategy is proposed, whose assumptions are met for a large class of inverse problems and learned methods.
arXiv Detail & Related papers (2023-12-21T17:56:19Z)
- Gradient-Matching Coresets for Rehearsal-Based Continual Learning [6.243028964381449]
The goal of continual learning (CL) is to efficiently update a machine learning model with new data without forgetting previously-learned knowledge.
Most widely-used CL methods rely on a rehearsal memory of data points to be reused while training on new data.
We devise a coreset selection method for rehearsal-based continual learning.
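
The core idea, choosing a memory whose gradient resembles that of the full data, can be sketched with a greedy selection loop; the per-sample-gradient input format and the plain Euclidean matching criterion are simplifying assumptions, not the paper's exact algorithm.

```python
import numpy as np

def gradient_matching_coreset(per_sample_grads, k):
    """Greedily pick k samples whose averaged gradient best matches the
    full-data mean gradient (simplified sketch).

    per_sample_grads: array of shape (n, d), one flattened gradient per sample.
    Returns the indices of the selected coreset.
    """
    n, _ = per_sample_grads.shape
    target = per_sample_grads.mean(axis=0)          # full-data gradient direction
    selected, current_sum = [], np.zeros_like(target)
    for _ in range(k):
        best_i, best_err = None, np.inf
        for i in range(n):
            if i in selected:
                continue
            cand = (current_sum + per_sample_grads[i]) / (len(selected) + 1)
            err = np.linalg.norm(cand - target)
            if err < best_err:
                best_i, best_err = i, err
        selected.append(best_i)
        current_sum += per_sample_grads[best_i]
    return selected
```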
arXiv Detail & Related papers (2022-03-28T07:37:17Z)
- Continually Learning Self-Supervised Representations with Projected Functional Regularization [39.92600544186844]
Recent self-supervised learning methods are able to learn high-quality image representations and are closing the gap with supervised methods.
However, these methods are unable to acquire new knowledge incrementally; in fact, they are mostly used only as a pre-training phase with IID data.
To prevent forgetting of previous knowledge, we propose the use of functional regularization.
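
A minimal sketch of such projected functional regularization is given below: features of the current encoder are passed through a small learned predictor and pulled toward the features of a frozen copy of the previous encoder. The cosine-based penalty, the `predictor` module, and the weighting `lam` are assumptions for illustration.

```python
import copy
import torch
import torch.nn.functional as F

def pfr_step(encoder, predictor, frozen_encoder, ssl_loss_fn, batch, lam=1.0):
    """One training step: self-supervised loss plus a function-space penalty that
    keeps projected new features close to the frozen previous encoder's features."""
    loss_ssl = ssl_loss_fn(encoder, batch)                # current SSL objective (assumed)
    with torch.no_grad():
        old_feats = frozen_encoder(batch)                 # targets from the past model
    new_feats = predictor(encoder(batch))                 # learned projection of new features
    loss_reg = 1 - F.cosine_similarity(new_feats, old_feats, dim=1).mean()
    return loss_ssl + lam * loss_reg

# Before starting a new task, snapshot the encoder to serve as the frozen target:
# frozen_encoder = copy.deepcopy(encoder).eval()
```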
arXiv Detail & Related papers (2021-12-30T11:59:23Z)
- Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning [80.20302993614594]
We provide a statistical analysis to overcome drawbacks of Laplacian regularization.
We unveil a large body of spectral filtering methods that exhibit desirable behaviors.
We provide realistic computational guidelines in order to make our method usable with large amounts of data.
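
For reference, the classical Laplacian-regularized estimator that this line of work starts from can be written in a few lines; the unnormalized Laplacian and the plain squared-loss fit below are the textbook baseline, not the paper's spectral-filtering refinements.

```python
import numpy as np

def laplacian_regularized_labels(W, y_labeled, labeled_idx, lam=1.0):
    """Semi-supervised estimation via Laplacian regularization:
    minimize sum_{i in labeled} (f_i - y_i)^2 + lam * f^T L f.

    W: (n, n) symmetric similarity matrix.
    y_labeled: real-valued targets (e.g. +1/-1 labels) for the labeled nodes.
    labeled_idx: list of indices of the labeled nodes.
    """
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                 # unnormalized graph Laplacian
    M = np.zeros((n, n))
    M[labeled_idx, labeled_idx] = 1.0              # selects the labeled coordinates
    b = np.zeros(n)
    b[labeled_idx] = y_labeled
    return np.linalg.solve(M + lam * L, b)         # first-order optimality condition
```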
arXiv Detail & Related papers (2020-09-09T14:28:54Z)
- Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
- Continual Deep Learning by Functional Regularisation of Memorable Past [95.97578574330934]
Continually learning new skills is important for intelligent systems, yet standard deep learning methods suffer from catastrophic forgetting of the past.
We propose a new functional-regularisation approach that utilises a few memorable past examples that are crucial for avoiding forgetting.
Our method achieves state-of-the-art performance on standard benchmarks and opens a new direction for life-long learning where regularisation and memory-based methods are naturally combined.
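
A much-simplified, distillation-style sketch of function-space regularization on memorable points is shown below; the KL objective and temperature are stand-ins for illustration, whereas the cited method derives both the regularizer and the selection of memorable examples from a Gaussian-process view of the network.

```python
import torch
import torch.nn.functional as F

def functional_reg_loss(model, old_model, memorable_x, tau=2.0):
    """On a handful of memorable past inputs, keep the current predictive
    distribution close to the previous model's (simplified sketch)."""
    with torch.no_grad():
        old_probs = F.softmax(old_model(memorable_x) / tau, dim=1)
    new_log_probs = F.log_softmax(model(memorable_x) / tau, dim=1)
    return F.kl_div(new_log_probs, old_probs, reduction="batchmean")

# total_loss = task_loss + lam * functional_reg_loss(model, old_model, memorable_x)
```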
arXiv Detail & Related papers (2020-04-29T10:47:54Z)
- AL2: Progressive Activation Loss for Learning General Representations in Classification Neural Networks [12.14537824884951]
We propose a novel regularization method that progressively penalizes the magnitude of activations during training.
Our method's effect on generalization is analyzed with label randomization tests and cumulative ablations.
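
The mechanism can be sketched as a penalty on activation magnitudes whose weight ramps up over training; the linear schedule and the maximum weight below are illustrative assumptions rather than the paper's exact settings.

```python
import torch

def al2_penalty(activations, step, total_steps, max_weight=1e-3):
    """Progressively weighted activation penalty: the squared magnitude of hidden
    activations is penalized with a coefficient that grows over training."""
    weight = max_weight * min(1.0, step / total_steps)   # linear ramp-up schedule
    return weight * sum(a.pow(2).mean() for a in activations)

# total_loss = cross_entropy(logits, y) + al2_penalty(hidden_activations, step, total_steps)
```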
arXiv Detail & Related papers (2020-03-07T18:38:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.