Adversarially Diversified Rehearsal Memory (ADRM): Mitigating Memory Overfitting Challenge in Continual Learning
- URL: http://arxiv.org/abs/2405.11829v1
- Date: Mon, 20 May 2024 06:56:43 GMT
- Title: Adversarially Diversified Rehearsal Memory (ADRM): Mitigating Memory Overfitting Challenge in Continual Learning
- Authors: Hikmat Khan, Ghulam Rasool, Nidhal Carla Bouaynaya
- Abstract summary: Continual learning focuses on learning from non-stationary data distributions without forgetting previous knowledge.
Rehearsal-based approaches are commonly used to combat catastrophic forgetting.
We introduce the Adversarially Diversified Rehearsal Memory (ADRM) to address the memory overfitting challenge.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual learning (CL) focuses on learning from non-stationary data distributions without forgetting previous knowledge. Rehearsal-based approaches are commonly used to combat catastrophic forgetting. However, these approaches suffer from a problem called "rehearsal memory overfitting," where the model becomes too specialized on the limited memory samples and loses its ability to generalize effectively. As a result, the effectiveness of the rehearsal memory progressively decays, ultimately resulting in catastrophic forgetting of the learned tasks. We introduce the Adversarially Diversified Rehearsal Memory (ADRM) to address the memory overfitting challenge. This novel method is designed to enrich memory sample diversity and bolster resistance against natural and adversarial noise disruptions. ADRM employs Fast Gradient Sign Method (FGSM) attacks to introduce adversarially modified memory samples, achieving two primary objectives: enhancing memory diversity and fostering a robust response to continual feature drifts in memory samples. Our contributions are as follows: Firstly, ADRM addresses overfitting in rehearsal memory by employing FGSM to diversify and increase the complexity of the memory buffer. Secondly, we demonstrate that ADRM mitigates memory overfitting and significantly improves the robustness of CL models, which is crucial for safety-critical applications. Finally, our detailed feature analysis and visualization demonstrate that ADRM mitigates feature drifts in CL memory samples, significantly reducing catastrophic forgetting and resulting in a more resilient CL model. Additionally, our in-depth t-SNE visualizations of feature distributions and the quantification of feature similarity further enrich our understanding of feature representation in existing CL approaches. Our code is publicly available at https://github.com/hikmatkhan/ADRM.
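The mechanism described in the abstract lends itself to a short sketch. Below is a minimal PyTorch illustration of the core idea, FGSM-perturbed rehearsal samples mixed into the training loss; the function names, epsilon value, and loss mixing are illustrative assumptions, not code from the ADRM repository.

```python
import torch
import torch.nn.functional as F

def fgsm_diversify(model, x_mem, y_mem, epsilon=8 / 255):
    # One FGSM step: the sign of the input gradient of the loss gives
    # the direction that locally maximizes the loss; a small step along
    # it produces a harder, more diverse variant of each stored sample.
    x_adv = x_mem.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y_mem)
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv + epsilon * x_adv.grad.sign()
        x_adv.clamp_(0.0, 1.0)          # stay in the valid pixel range
    model.zero_grad(set_to_none=True)   # discard gradients from this pass
    return x_adv.detach()

def rehearsal_loss(model, x_new, y_new, x_mem, y_mem):
    # Train on current-task data plus adversarially diversified memory.
    x_mem_adv = fgsm_diversify(model, x_mem, y_mem)
    return (F.cross_entropy(model(x_new), y_new)
            + F.cross_entropy(model(x_mem_adv), y_mem))
```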
Related papers
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in reinforcement learning environments that are partially observable and require long-term memory.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
- MsMemoryGAN: A Multi-scale Memory GAN for Palm-vein Adversarial Purification [40.80205521005344]
We propose a novel defense model named MsMemoryGAN to filter the perturbations from adversarial samples before recognition.
MsMemoryGAN learns to reconstruct the input using only a few prototypical elements of the normal patterns recorded in its memory.
Our approach removes a wide variety of adversarial perturbations, allowing vein classifiers to achieve the highest recognition accuracy.
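That reconstruction-from-prototypes idea can be sketched as a memory readout layer: the input's encoding addresses a bank of learned slots, the addressing weights are sparsified, and the output is rebuilt from the few surviving "normal" prototypes. This is a hedged illustration of the stated mechanism; the slot count, threshold, and attention form are assumptions, not MsMemoryGAN's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryReadout(nn.Module):
    # Reconstruct an encoding from a bank of learned prototype slots.
    # Sparsifying the addressing weights forces the decoder to see only
    # a few prototypical "normal" patterns, which tends to wash out
    # adversarial perturbations in the input.
    def __init__(self, num_slots=128, dim=256, threshold=1.0 / 128):
        super().__init__()
        self.slots = nn.Parameter(torch.randn(num_slots, dim))
        self.threshold = threshold

    def forward(self, z):                                # z: (batch, dim)
        attn = F.softmax(z @ self.slots.t(), dim=-1)     # (batch, slots)
        attn = torch.where(attn > self.threshold, attn,
                           torch.zeros_like(attn))       # keep few slots
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-8)
        return attn @ self.slots          # rebuilt purely from prototypes
```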
arXiv Detail & Related papers (2024-08-20T09:46:30Z)
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative memory-efficient transfer learning (METL) strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
- What do larger image classifiers memorise? [64.01325988398838]
We show that training examples exhibit an unexpectedly diverse set of memorisation trajectories across model sizes.
We find that knowledge distillation, an effective and popular model compression technique, tends to inhibit memorisation, while also improving generalisation.
arXiv Detail & Related papers (2023-10-09T01:52:07Z)
- Saliency-Guided Hidden Associative Replay for Continual Learning [13.551181595881326]
Continual Learning is a burgeoning domain in next-generation AI, focusing on training neural networks over a sequence of tasks akin to human learning.
This paper presents Saliency-Guided Hidden Associative Replay for Continual Learning (SHARC).
This novel framework synergizes associative memory with replay-based strategies; SHARC primarily archives salient data segments via sparse memory encoding.
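A minimal sketch of what "archiving salient data segments via sparse memory encoding" could look like, assuming a gradient-magnitude saliency proxy and per-sample top-k masking; neither detail is taken from the SHARC paper.

```python
import torch
import torch.nn.functional as F

def sparse_salient_encoding(model, x, y, keep_ratio=0.1):
    # Approximate per-feature saliency by the magnitude of the loss
    # gradient w.r.t. the input, then zero everything outside the
    # top-`keep_ratio` fraction so the buffer stores sparse fragments
    # rather than full samples.
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    saliency = x.grad.abs().flatten(1)               # (batch, features)
    k = max(1, int(keep_ratio * saliency.shape[1]))
    thresh = saliency.topk(k, dim=1).values[:, -1:]  # per-sample cutoff
    mask = (saliency >= thresh).float().view_as(x)
    return (x * mask).detach()                       # sparse fragment
```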
arXiv Detail & Related papers (2023-10-06T15:54:12Z)
- Analysis of the Memorization and Generalization Capabilities of AI Agents: Are Continual Learners Robust? [91.682459306359]
In continual learning (CL), an AI agent learns from non-stationary data streams under dynamic environments.
In this paper, a novel CL framework is proposed to achieve robust generalization to dynamic environments while retaining past knowledge.
The generalization and memorization performance of the proposed framework are theoretically analyzed.
arXiv Detail & Related papers (2023-09-18T21:00:01Z)
- Improving Task-free Continual Learning by Distributionally Robust Memory Evolution [9.345559196495746]
Task-free continual learning aims to learn from a non-stationary data stream, without explicit task definitions, while not forgetting previous knowledge.
Existing methods overlook the high uncertainty in the memory data distribution.
We propose a principled memory evolution framework to dynamically evolve the memory data distribution.
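A crude sketch of one way to "dynamically evolve the memory data distribution": a few noisy gradient-ascent steps on the stored samples, so the buffer covers a neighbourhood of the stored points rather than the points themselves. The step sizes, iteration count, and Langevin-style noise are assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def evolve_memory(model, x_mem, y_mem, step=0.01, noise=0.001, iters=3):
    # Evolve buffer samples toward higher loss (a distributionally
    # pessimistic direction), adding Gaussian noise at each step to
    # spread the evolved samples around the stored distribution.
    x = x_mem.clone().detach()
    for _ in range(iters):
        x.requires_grad_(True)
        loss = F.cross_entropy(model(x), y_mem)
        grad = torch.autograd.grad(loss, x)[0]
        with torch.no_grad():
            x = x + step * grad + noise * torch.randn_like(x)
            x.clamp_(0.0, 1.0)   # keep evolved samples in valid range
    return x.detach()
```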
arXiv Detail & Related papers (2022-07-15T02:16:09Z)
- A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning [56.450090618578]
Class-Incremental Learning (CIL) aims to train a model that continually absorbs new classes under a limited memory budget.
We show that when counting the model size into the total budget and comparing methods with aligned memory sizes, saving models does not consistently work.
We propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel.
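The budget-alignment idea reduces to simple arithmetic: a saved model's parameter bytes can be converted into an equivalent number of stored exemplars. The sketch below assumes float32 parameters and uint8 CIFAR-style images, and the parameter count is back-calculated purely to echo the 603-exemplar figure in the title, not taken from the paper.

```python
import math

def exemplar_equivalent(num_params, image_shape=(3, 32, 32),
                        bytes_per_param=4, bytes_per_pixel=1):
    # Convert a model's storage cost into the number of raw exemplars
    # that would occupy the same memory, so exemplar-based and
    # model-saving CIL methods can be compared under one budget.
    c, h, w = image_shape
    model_bytes = num_params * bytes_per_param
    exemplar_bytes = c * h * w * bytes_per_pixel
    return math.floor(model_bytes / exemplar_bytes)

# A hypothetical backbone with ~463k float32 parameters costs the same
# memory as 603 stored 32x32 RGB uint8 images:
print(exemplar_equivalent(463_104))  # -> 603
```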
arXiv Detail & Related papers (2022-05-26T08:24:01Z)
- Memory-Guided Semantic Learning Network for Temporal Sentence Grounding [55.31041933103645]
We propose a memory-augmented network, MGSL-Net, that learns and memorizes the rarely appearing content in temporal sentence grounding (TSG) tasks.
MGSL-Net consists of three main parts: a cross-modal interaction module, a memory augmentation module, and a heterogeneous attention module.
arXiv Detail & Related papers (2022-01-03T02:32:06Z)
- Sequential memory improves sample and memory efficiency in Episodic Control [0.0]
State-of-the-art deep reinforcement learning algorithms are sample inefficient due to the large number of episodes they require to reach asymptotic performance.
Episodic reinforcement learning (ERL) algorithms, inspired by the mammalian hippocampus, typically use extended memory systems to bootstrap learning from past events and overcome this sample-inefficiency problem.
Here, we demonstrate that including a bias in the acquired memory content derived from the order of episodic sampling improves both the sample and memory efficiency of an episodic control algorithm.
arXiv Detail & Related papers (2021-12-29T18:42:15Z)
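One way to read "a bias in the acquired memory content derived from the order of episodic sampling" is a buffer that stores experiences in sequence and lets the successor of the best-matching memory influence the retrieved value. The sketch below is an assumption-laden illustration of that idea, not the authors' algorithm.

```python
import numpy as np

class SequentialEpisodicMemory:
    def __init__(self, successor_weight=0.3):
        # Keys and values are appended in the order they were
        # experienced, so index i+1 is the successor of index i.
        self.keys, self.values = [], []
        self.w = successor_weight

    def write(self, state, value):
        self.keys.append(np.asarray(state, dtype=np.float32))
        self.values.append(float(value))

    def read(self, state):
        # Nearest-neighbour lookup, biased toward the stored
        # successor of the best match (the sequential bias).
        if not self.keys:
            return 0.0
        state = np.asarray(state, dtype=np.float32)
        i = int(np.argmin([np.linalg.norm(k - state) for k in self.keys]))
        value = self.values[i]
        if i + 1 < len(self.values):
            value = (1 - self.w) * value + self.w * self.values[i + 1]
        return value
```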