Virtual Replay Cache
- URL: http://arxiv.org/abs/2112.03421v1
- Date: Mon, 6 Dec 2021 23:40:27 GMT
- Title: Virtual Replay Cache
- Authors: Brett Daley and Christopher Amato
- Abstract summary: We propose a new data structure, the Virtual Replay Cache (VRC), to address the large memory usage and repetitive data copies of return caching.
The VRC nearly eliminates DQN(λ)'s cache memory footprint and slightly reduces total training time on our hardware.
- Score: 20.531576904743282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Return caching is a recent strategy that enables efficient minibatch training
with multistep estimators (e.g. the λ-return) for deep reinforcement
learning. By precomputing return estimates in sequential batches and then
storing the results in an auxiliary data structure for later sampling, the
average computation spent per estimate can be greatly reduced. Still, the
efficiency of return caching could be improved, particularly with regard to its
large memory usage and repetitive data copies. We propose a new data structure,
the Virtual Replay Cache (VRC), to address these shortcomings. When learning to
play Atari 2600 games, the VRC nearly eliminates DQN(λ)'s cache memory
footprint and slightly reduces the total training time on our hardware.
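The two ideas in the abstract are easy to make concrete. Below is a minimal NumPy sketch: first, precomputing λ-returns backward over a sequential batch; then a cache in the spirit of the VRC that stores only scalar returns plus indices into the existing replay buffer rather than copies of the observations. The class interface and the `replay_buffer.get` accessor are illustrative assumptions, not the paper's actual API.

```python
import numpy as np

def lambda_returns(rewards, bootstrap_values, dones, gamma=0.99, lam=0.95):
    """Precompute lambda-returns backward over a sequential batch.

    bootstrap_values[t] is V(s_{t+1}); dones[t] marks an episode boundary.
    Recursion: G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1}).
    """
    T = len(rewards)
    returns = np.empty(T, dtype=np.float32)
    g = bootstrap_values[-1]  # truncate the recursion with V(s_T)
    for t in reversed(range(T)):
        if dones[t]:
            g = rewards[t]  # do not bootstrap across episode boundaries
        else:
            g = rewards[t] + gamma * ((1.0 - lam) * bootstrap_values[t] + lam * g)
        returns[t] = g
    return returns

class VirtualReplayCache:
    """Hypothetical 'virtual' cache: keep only scalar returns keyed by
    replay-buffer indices, so no observation data is ever duplicated."""

    def __init__(self, capacity):
        self.indices = np.zeros(capacity, dtype=np.int64)
        self.returns = np.zeros(capacity, dtype=np.float32)
        self.size = 0

    def refresh(self, buffer_indices, cached_returns):
        # Overwrite the cache with freshly computed returns for a block of
        # sequential transitions already stored in the replay buffer.
        n = len(buffer_indices)
        self.indices[:n] = buffer_indices
        self.returns[:n] = cached_returns
        self.size = n

    def sample(self, replay_buffer, batch_size, rng):
        idx = rng.integers(0, self.size, size=batch_size)
        # States/actions are fetched lazily from the replay buffer;
        # `get` is an assumed accessor on the user's buffer class.
        states, actions = replay_buffer.get(self.indices[idx])
        return states, actions, self.returns[idx]
```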
Related papers
- Retro-li: Small-Scale Retrieval Augmented Generation Supporting Noisy Similarity Searches and Domain Shift Generalization [36.251000184801576]
Retro has been shown to improve language modeling capabilities and reduce toxicity and hallucinations by retrieving from a database of non-parametric memory containing trillions of entries.
We introduce Retro-li, which shows that retrieval can also help with a small-scale database, but it demands more accurate and better neighbors when searching a smaller, and hence sparser, non-parametric memory.
We show that Retro-li's non-parametric memory can potentially be implemented on analog in-memory computing hardware, exhibiting O(1) search time while introducing noise into neighbor retrieval, at a minimal (about 1%) performance loss.
arXiv Detail & Related papers (2024-09-12T23:29:33Z)
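To make the Retro-li entry above concrete: its retrieval step is a nearest-neighbor search over a small non-parametric memory, which analog in-memory hardware would perturb with noise. A generic sketch (not the paper's implementation), with Gaussian noise standing in for the hardware:

```python
import numpy as np

def retrieve_neighbors(query, memory_keys, k=4, noise_std=0.0, rng=None):
    """Return indices of the k nearest memory entries to `query`.

    noise_std > 0 perturbs the similarity scores, loosely mimicking the
    imprecision an analog in-memory search would introduce.
    """
    # Cosine similarity between the query and every memory key.
    q = query / np.linalg.norm(query)
    keys = memory_keys / np.linalg.norm(memory_keys, axis=1, keepdims=True)
    scores = keys @ q
    if noise_std > 0.0:
        rng = rng or np.random.default_rng()
        scores = scores + rng.normal(0.0, noise_std, size=scores.shape)
    return np.argsort(scores)[-k:][::-1]  # top-k indices, best first
```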
- PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference [57.53291046180288]
Large Language Models (LLMs) have shown remarkable comprehension abilities but face challenges in GPU memory usage during inference.
We propose PyramidInfer, a method that compresses the KV cache by layer-wise retaining crucial context.
PyramidInfer improves throughput by 2.2x over Accelerate while reducing KV cache GPU memory by more than 54%.
arXiv Detail & Related papers (2024-05-21T06:46:37Z)
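The PyramidInfer summary above describes layer-wise retention of "crucial context". A hedged sketch of that general idea, pruning cached keys/values down to the positions carrying the most recent attention mass; the selection rule and the per-layer keep ratio are illustrative assumptions, not PyramidInfer's actual algorithm:

```python
import numpy as np

def prune_kv_cache(keys, values, attn_weights, keep_ratio):
    """Keep only the cache positions that receive the most attention.

    keys, values: (seq_len, head_dim); attn_weights: (seq_len,) attention
    mass per cached position. Shrinking keep_ratio with depth (e.g.
    keep_ratio = 1.0 - 0.05 * layer_index) gives the pyramid shape.
    """
    keep = max(1, int(len(keys) * keep_ratio))
    top = np.sort(np.argsort(attn_weights)[-keep:])  # preserve positional order
    return keys[top], values[top]
```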
- Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model [89.8764435351222]
We propose a new family of unbiased estimators, called WTA-CRS, for matrix products with reduced variance.
Our work provides both theoretical and experimental evidence that, in the context of tuning transformers, our proposed estimators exhibit lower variance compared to existing ones.
arXiv Detail & Related papers (2023-05-24T15:52:08Z)
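The WTA-CRS entry builds on classic column-row sampling (CRS) for approximating a matrix product. The sketch below shows the plain unbiased CRS estimator; the paper's winner-take-all refinement, which treats the highest-probability indices deterministically, is only noted in a comment:

```python
import numpy as np

def crs_matmul(A, B, c, rng=None):
    """Unbiased column-row sampling estimate of A @ B using c index pairs.

    Sampling probabilities proportional to |A[:, j]| * |B[j, :]| minimize
    the estimator's variance; WTA-CRS reduces variance further by handling
    the highest-probability indices deterministically (not shown here).
    """
    rng = rng or np.random.default_rng()
    norms = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = norms / norms.sum()
    idx = rng.choice(len(p), size=c, p=p)
    # Scale each sampled outer product by 1 / (c * p_j) for unbiasedness.
    return sum(np.outer(A[:, j], B[j, :]) / (c * p[j]) for j in idx)
```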
- Improving information retention in large scale online continual learning [99.73847522194549]
Online continual learning (OCL) aims to adapt efficiently to new data while retaining existing knowledge.
Recent work suggests that information retention remains a problem in large scale OCL even when the replay buffer is unlimited.
We propose using a moving average family of methods to improve optimization for non-stationary objectives.
arXiv Detail & Related papers (2022-10-12T16:59:43Z)
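For the continual-learning entry above, a plain exponential moving average over parameters illustrates the "moving average family" of methods for non-stationary objectives; the specific variants the paper studies may differ:

```python
import numpy as np

def ema_update(avg_params, params, decay=0.999):
    """In-place exponential moving average over NumPy parameter arrays."""
    for avg, p in zip(avg_params, params):
        # avg <- decay * avg + (1 - decay) * p
        avg *= decay
        avg += (1.0 - decay) * p

# Usage: keep a slow-moving copy of the learner's weights; the averaged
# copy drifts smoothly even as the online copy chases a shifting objective.
online = [np.zeros(10), np.zeros((10, 4))]
averaged = [w.copy() for w in online]
ema_update(averaged, online)
```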
- Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation [14.36005088171571]
We propose memory-efficient reinforcement learning algorithms based on deep Q-networks (DQN).
Our algorithms reduce forgetting and maintain high sample efficiency by consolidating knowledge from the target Q-network to the current Q-network.
arXiv Detail & Related papers (2022-05-22T17:02:51Z)
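One plausible form of "consolidating knowledge from the target Q-network to the current Q-network" in the entry above is an auxiliary penalty added to the TD loss. The formulation below is an assumption for illustration, not necessarily the paper's exact objective:

```python
import numpy as np

def consolidated_dqn_loss(q_online, q_target, actions, targets, beta=0.1):
    """TD loss plus a consolidation penalty tying the online network's
    Q-values to the target network's (assumed formulation).

    q_online, q_target: (batch, num_actions) Q-values from each network.
    targets: (batch,) bootstrapped TD targets for the taken actions.
    """
    batch = np.arange(len(actions))
    td_error = q_online[batch, actions] - targets
    td_loss = np.mean(td_error ** 2)
    # Consolidation: keep all action values close to the target network's,
    # reducing forgetting without storing extra transitions.
    consolidation = np.mean((q_online - q_target) ** 2)
    return td_loss + beta * consolidation
```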
- Memory Replay with Data Compression for Continual Learning [80.95444077825852]
We propose memory replay with data compression to reduce the storage cost of old training samples.
We extensively validate this across several benchmarks of class-incremental learning and in a realistic scenario of object detection for autonomous driving.
arXiv Detail & Related papers (2022-02-14T10:26:23Z)
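The storage-side idea of the memory-replay entry above can be sketched with any off-the-shelf codec; here zlib stands in for whatever compressor the paper actually evaluates:

```python
import zlib
import numpy as np

class CompressedReplayBuffer:
    """Store compressed byte strings instead of raw arrays (illustrative)."""

    def __init__(self):
        self._items = []

    def add(self, sample: np.ndarray):
        # Compress the raw bytes; image data might instead use JPEG.
        self._items.append((zlib.compress(sample.tobytes()),
                            sample.dtype, sample.shape))

    def sample(self, rng):
        # Decompress on demand; storage cost is paid only in compressed form.
        blob, dtype, shape = self._items[rng.integers(len(self._items))]
        return np.frombuffer(zlib.decompress(blob), dtype=dtype).reshape(shape)
```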
- Mesa: A Memory-saving Training Framework for Transformers [58.78933015299703]
We present Mesa, a memory-saving training framework for Transformers.
Mesa uses exact activations during the forward pass while storing a low-precision version of the activations to reduce memory consumption during training.
Experiments on ImageNet, CIFAR-100 and ADE20K demonstrate that Mesa can roughly halve the memory footprint during training.
arXiv Detail & Related papers (2021-11-22T11:23:01Z)
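The Mesa entry above hinges on a quantize-on-save, dequantize-on-backward round trip for activations. A generic per-tensor 8-bit sketch; Mesa's actual quantization granularity and framework integration are more involved:

```python
import numpy as np

def quantize_for_backward(x: np.ndarray):
    """Save activations as uint8 plus scale/offset (asymmetric, per-tensor).

    The forward pass still computes with the exact tensor; only this
    compact copy is kept around for the backward pass.
    """
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 or 1.0  # guard against constant tensors
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover an approximate activation tensor for gradient computation."""
    return q.astype(np.float32) * scale + lo
```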
- Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER), a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
arXiv Detail & Related papers (2021-03-04T08:14:10Z)
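SEER's saving comes from freezing the encoder early in training and thereafter replaying its compact embeddings instead of raw frames. A minimal sketch of the storage side, with an assumed frozen `encoder` callable:

```python
import numpy as np

class EmbeddingReplayBuffer:
    """After the encoder is frozen, store compact latents instead of frames."""

    def __init__(self, encoder, capacity, latent_dim):
        self.encoder = encoder  # frozen feature extractor
        self.latents = np.zeros((capacity, latent_dim), dtype=np.float32)
        self.ptr, self.full, self.capacity = 0, False, capacity

    def add(self, observation):
        # Encode once at insertion time; later gradient steps reuse the
        # stored latent, skipping the expensive encoder forward pass.
        self.latents[self.ptr] = self.encoder(observation)
        self.ptr = (self.ptr + 1) % self.capacity
        self.full = self.full or self.ptr == 0
```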
- Improving compute efficacy frontiers with SliceOut [31.864949424541344]
We introduce SliceOut -- a dropout-inspired scheme to train deep learning models faster without impacting final test accuracy.
At test time, turning off SliceOut performs an implicit ensembling across a linear number of architectures that preserves test accuracy.
This leads to faster processing of large computational workloads overall, and significantly reduces the resulting energy consumption and CO2 emissions.
arXiv Detail & Related papers (2020-07-21T15:59:09Z)
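SliceOut's speedup comes from dropping a contiguous slice of units, so the surviving weights stay dense in memory and the matmul genuinely shrinks. A toy forward step with inverted-dropout-style rescaling (details simplified relative to the paper):

```python
import numpy as np

def sliceout_linear(x, W, b, keep, rng):
    """Apply a linear layer to a random contiguous slice of hidden units.

    Keeping a contiguous block (rather than a scattered mask) preserves
    dense memory layout, so the smaller matmul runs faster on hardware.
    """
    n = W.shape[1]
    start = rng.integers(0, n - keep + 1)
    cols = slice(start, start + keep)
    # Rescale like inverted dropout so expected activations match full width.
    return (x @ W[:, cols] + b[cols]) * (n / keep)
```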