The Effectiveness of Memory Replay in Large Scale Continual Learning
- URL: http://arxiv.org/abs/2010.02418v1
- Date: Tue, 6 Oct 2020 01:23:12 GMT
- Title: The Effectiveness of Memory Replay in Large Scale Continual Learning
- Authors: Yogesh Balaji, Mehrdad Farajtabar, Dong Yin, Alex Mott, Ang Li
- Abstract summary: We study continual learning in the large scale setting where tasks in the input sequence are not limited to classification, and the outputs can be of high dimension.
Existing methods usually replay only the input-output pairs.
We propose to replay the activation of the intermediate layers in addition to the input-output pairs.
- Score: 42.67483945072039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study continual learning in the large scale setting where tasks in the
input sequence are not limited to classification, and the outputs can be of
high dimension. Among multiple state-of-the-art methods, we found vanilla
experience replay (ER) to still be very competitive in terms of both performance
and scalability, despite its simplicity. However, degraded performance is
observed for ER with a small replay memory. A further visualization of the feature space
reveals that the intermediate representation undergoes a distributional drift.
While existing methods usually replay only the input-output pairs, we
hypothesize that their regularization effect is inadequate for complex deep
models and diverse tasks with small replay buffer size. Following this
observation, we propose to replay the activation of the intermediate layers in
addition to the input-output pairs. Considering that saving raw activation maps
can dramatically increase memory and compute cost, we propose the Compressed
Activation Replay technique, where compressed representations of layer
activation are saved to the replay buffer. We show that this approach achieves
a superior regularization effect while adding negligible memory overhead to the
replay method. Experiments on both the large-scale Taskonomy benchmark with
a diverse set of tasks and standard common datasets (Split-CIFAR and
Split-miniImageNet) demonstrate the effectiveness of the proposed method.
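The abstract describes Compressed Activation Replay only at a high level, so the following is a minimal PyTorch-style sketch of one plausible reading: store a compressed copy of an intermediate activation with each replayed example, and penalize the current model's activation on that example for drifting away from it. The compressor (adaptive average pooling), the single replayed layer, the classification-style losses, the hypothetical `buffer.sample()` helper, and the weight `lambda_car` are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a Compressed-Activation-Replay-style training step.
# Assumptions: average pooling as the compressor, one replayed layer,
# classification losses, and a replay buffer exposing sample() -> (x, y, z).
import torch
import torch.nn.functional as F


def compress(activation, pooled_size=4):
    """Hypothetical compressor: spatially average-pool the activation map."""
    return F.adaptive_avg_pool2d(activation, pooled_size)


def replay_step(model, layer, buffer, batch, optimizer, lambda_car=1.0):
    """One step that replays input-output pairs plus compressed activations."""
    x, y = batch                           # current-task minibatch
    x_old, y_old, z_old = buffer.sample()  # z_old: compressed activation stored
                                           # when x_old was first seen

    # Hook captures the intermediate activation of the most recent forward pass.
    captured = {}
    handle = layer.register_forward_hook(lambda m, inp, out: captured.update(act=out))

    loss = F.cross_entropy(model(x), y)                 # current-task loss
    loss = loss + F.cross_entropy(model(x_old), y_old)  # vanilla ER term
    # Activation-replay term: keep the (compressed) intermediate representation
    # of the replayed inputs close to what was stored in the buffer.
    loss = loss + lambda_car * F.mse_loss(compress(captured["act"]), z_old)

    handle.remove()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this reading, new examples would be written to the buffer together with `compress(...)` of their activation at insertion time, which is where the memory saving over storing raw activation maps comes from.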
Related papers
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing a minimal number of late pre-trained layers alleviates the peak memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
- TEAL: New Selection Strategy for Small Buffers in Experience Replay Class Incremental Learning [7.627299398469962]
We introduce TEAL, a novel approach to populate the memory with exemplars.
We show that TEAL improves the average accuracy of the SOTA method XDER as well as ER and ER-ACE on several image recognition benchmarks.
arXiv Detail & Related papers (2024-06-30T12:09:08Z)
- Augmented Box Replay: Overcoming Foreground Shift for Incremental Object Detection [26.948748060138264]
In incremental learning, replaying stored samples from previous tasks together with current task samples is one of the most efficient approaches to address catastrophic forgetting.
Unlike incremental classification, image replay has not been successfully applied to incremental object detection (IOD).
Foreground shift only occurs when replaying images of previous tasks and refers to the fact that their background might contain foreground objects of the current task.
arXiv Detail & Related papers (2023-07-23T20:47:03Z)
- Enhancing Representation Learning on High-Dimensional, Small-Size Tabular Data: A Divide and Conquer Method with Ensembled VAEs [7.923088041693465]
We present an ensemble of lightweight VAEs to learn posteriors over subsets of the feature-space, which get aggregated into a joint posterior in a novel divide-and-conquer approach.
We show that our approach is robust to partial features at inference, exhibiting little performance degradation even with most features missing.
arXiv Detail & Related papers (2023-06-27T17:55:31Z)
- Map-based Experience Replay: A Memory-Efficient Solution to Catastrophic Forgetting in Reinforcement Learning [15.771773131031054]
Deep Reinforcement Learning agents often suffer from catastrophic forgetting, forgetting previously found solutions in parts of the input space when training on new data.
We introduce a novel cognitive-inspired replay memory approach based on the Grow-When-Required (GWR) self-organizing network.
Our approach organizes stored transitions into a concise environment-model-like network of state-nodes and transition-edges, merging similar samples to reduce the memory size and increase pair-wise distance among samples.
arXiv Detail & Related papers (2023-05-03T11:39:31Z)
- Adaptive Cross Batch Normalization for Metric Learning [75.91093210956116]
Metric learning is a fundamental problem in computer vision.
We show that it is equally important to ensure that the accumulated embeddings are up to date.
In particular, it is necessary to circumvent the representational drift between the accumulated embeddings and the feature embeddings at the current training iteration.
arXiv Detail & Related papers (2023-03-30T03:22:52Z)
- ACAE-REMIND for Online Continual Learning with Compressed Feature Replay [47.73014647702813]
We propose an auxiliary classifier auto-encoder (ACAE) module for feature replay at intermediate layers with high compression rates.
The reduced memory footprint per image allows us to save more exemplars for replay.
In our experiments, we conduct task-agnostic evaluation under online continual learning setting.
arXiv Detail & Related papers (2021-05-18T15:27:51Z)
- Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER).
SEER is a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
arXiv Detail & Related papers (2021-03-04T08:14:10Z)
- Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition [86.31412529187243]
Few-shot video recognition aims at learning new actions with only very few labeled samples.
We propose a depth-guided Adaptive Meta-Fusion Network for few-shot video recognition, termed AMeFu-Net.
arXiv Detail & Related papers (2020-10-20T03:06:20Z)