Online Reinforcement Learning with Passive Memory
- URL: http://arxiv.org/abs/2410.14665v1
- Date: Fri, 18 Oct 2024 17:55:15 GMT
- Title: Online Reinforcement Learning with Passive Memory
- Authors: Anay Pattanaik, Lav R. Varshney
- Abstract summary: We show that using passive memory improves performance and provides theoretical guarantees for regret that turn out to be near-minimax optimal.
Results show that the quality of the passive memory determines the sub-optimality of the incurred regret.
- Score: 17.293733942245154
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper considers an online reinforcement learning algorithm that leverages pre-collected data (passive memory) from the environment for online interaction. We show that using passive memory improves performance and further provide theoretical guarantees for regret that turn out to be near-minimax optimal. Results show that the quality of the passive memory determines the sub-optimality of the incurred regret. The proposed approach and results hold in both continuous and discrete state-action spaces.
Related papers
- AMemGym: Interactive Memory Benchmarking for Assistants in Long-Horizon Conversations [61.6579785305668]
AMemGym is an interactive environment enabling on-policy evaluation and optimization for memory-driven personalization.
Our framework provides a scalable, diagnostically rich environment for advancing memory capabilities in conversational agents.
arXiv Detail & Related papers (2026-03-02T15:15:11Z)
- Memory Constrained Dynamic Subnetwork Update for Transfer Learning [20.05842386680307]
MeDyate is a theoretically grounded framework for memory-constrained dynamic subnetwork adaptation.
MeDyate achieves state-of-the-art performance under extreme memory constraints.
arXiv Detail & Related papers (2025-10-23T20:16:43Z)
- Distributed Associative Memory via Online Convex Optimization [42.94410959330529]
Associative memory (AM) enables cue-response recall, and associative memorization has recently been noted to underlie the operation of modern neural architectures such as Transformers.
This work addresses a distributed setting where agents maintain a local AM to recall their own associations as well as selective information from others.
arXiv Detail & Related papers (2025-09-26T13:20:15Z)
- Escaping Stability-Plasticity Dilemma in Online Continual Learning for Motion Forecasting via Synergetic Memory Rehearsal [19.181540661354312]
We propose synergetic memory rehearsal (SyReM) for DNN-based motion forecasting.
SyReM maintains a compact memory buffer to represent learned knowledge.
It employs an inequality constraint that bounds the average loss over the memory buffer.
SyReM significantly mitigates catastrophic forgetting in past scenarios while improving forecasting accuracy in new ones.
arXiv Detail & Related papers (2025-08-27T05:04:33Z)
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in partially observable, long-horizon reinforcement learning environments.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing a minimal number of late pre-trained layers alleviates the peak memory demand.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
- Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models [88.47454470043552]
We consider the problem of online fine-tuning of the parameters of a language model at test time, also known as dynamic evaluation.
Online adaptation turns parameters into temporally changing states and provides a form of context-length extension with memory in weights.
arXiv Detail & Related papers (2024-03-03T14:03:48Z)
- AdaLomo: Low-memory Optimization with Adaptive Learning Rate [59.64965955386855]
We introduce low-memory optimization with adaptive learning rate (AdaLomo) for large language models.
AdaLomo achieves results on par with AdamW, while significantly reducing memory requirements, thereby lowering the hardware barrier to training large language models.
arXiv Detail & Related papers (2023-10-16T09:04:28Z)
- EMO: Episodic Memory Optimization for Few-Shot Meta-Learning [69.50380510879697]
Episodic memory optimization for meta-learning, which we call EMO, is inspired by the human ability to recall past learning experiences from the brain's memory.
EMO nudges parameter updates in the right direction, even when the gradients provided by a limited number of examples are uninformative.
EMO scales well with most few-shot classification benchmarks and improves the performance of optimization-based meta-learning methods.
arXiv Detail & Related papers (2023-06-08T13:39:08Z)
- Pin the Memory: Learning to Generalize Semantic Segmentation [68.367763672095]
We present a novel memory-guided domain generalization method for semantic segmentation based on a meta-learning framework.
Our method abstracts the conceptual knowledge of semantic classes into a categorical memory that is constant across domains.
arXiv Detail & Related papers (2022-04-07T17:34:01Z)
- Offline Reinforcement Learning with Value-based Episodic Memory [19.12430651038357]
Offline reinforcement learning (RL) shows promise for applying RL to real-world problems.
We propose Expectile V-Learning (EVL), which smoothly interpolates between the optimal value learning and behavior cloning.
We present a new offline method called Value-based Episodic Memory (VEM).
arXiv Detail & Related papers (2021-10-19T08:20:11Z)
- Schematic Memory Persistence and Transience for Efficient and Robust Continual Learning [8.030924531643532]
Continual learning is considered a promising step towards next-generation Artificial Intelligence (AI).
It is still quite primitive, with existing works focusing primarily on avoiding (catastrophic) forgetting.
We propose a novel framework for continual learning with external memory that builds on recent advances in neuroscience.
arXiv Detail & Related papers (2021-05-05T14:32:47Z)
- Re Learning Memory Guided Normality for Anomaly Detection [0.0]
We validate the authors' claim that this helps improve performance by helping the network learn patterns.
We test the efficacy with the help of t-SNE plots of the prototypical memory items.
arXiv Detail & Related papers (2021-01-29T03:28:57Z)
- Online Class-Incremental Continual Learning with Adversarial Shapley Value [28.921534209869105]
In this paper, we focus on the online class-incremental setting where a model needs to learn new classes continually from an online data stream.
To this end, we contribute a novel Adversarial Shapley value scoring method that scores memory data samples according to their ability to preserve latent decision boundaries.
Overall, we observe that our proposed ASER method provides competitive or improved performance compared to state-of-the-art replay-based continual learning methods on a variety of datasets.
arXiv Detail & Related papers (2020-08-31T20:52:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.