Generalizable Episodic Memory for Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2103.06469v1
- Date: Thu, 11 Mar 2021 05:31:21 GMT
- Title: Generalizable Episodic Memory for Deep Reinforcement Learning
- Authors: Hao Hu, Jianing Ye, Zhizhou Ren, Guangxiang Zhu, Chongjie Zhang
- Abstract summary: We propose Generalizable Episodic Memory (GEM), which effectively organizes the state-action values of episodic memory in a generalizable manner.
GEM supports implicit planning on memorized trajectories.
Empirical evaluation shows that our method significantly outperforms existing trajectory-based methods on various MuJoCo continuous control tasks.
- Score: 22.375796383623566
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Episodic memory-based methods can rapidly latch onto past successful
strategies by a non-parametric memory and improve sample efficiency of
traditional reinforcement learning. However, little effort is put into the
continuous domain, where a state is never visited twice and previous episodic
methods fail to efficiently aggregate experience across trajectories. To
address this problem, we propose Generalizable Episodic Memory (GEM), which
effectively organizes the state-action values of episodic memory in a
generalizable manner and supports implicit planning on memorized trajectories.
GEM utilizes a double estimator to reduce the overestimation bias induced by
value propagation in the planning process. Empirical evaluation shows that our
method significantly outperforms existing trajectory-based methods on various
MuJoCo continuous control tasks. To further show the general applicability, we
evaluate our method on Atari games with discrete action space, which also shows
significant improvement over baseline algorithms.
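The abstract's two mechanisms, value propagation along memorized trajectories and a double estimator against overestimation, can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the paper's exact update rule: the function name `propagate_targets`, the inputs `q_select` and `q_eval`, and the rule "choose with one estimator, record the value from the other" are all assumptions made for illustration.

```python
from typing import List, Sequence


def propagate_targets(
    rewards: Sequence[float],
    q_select: Sequence[float],
    q_eval: Sequence[float],
    gamma: float = 0.99,
) -> List[float]:
    """Backward value propagation along one memorized trajectory.

    Hypothetical helper, not the paper's implementation.
    rewards[t] is the reward at step t (t = 0 .. T-1).
    q_select[t] and q_eval[t] are two independent value estimates at
    step t (t = 0 .. T); index T is the value after the last step,
    e.g. 0.0 for a terminal state.
    """
    horizon = len(rewards)
    if len(q_select) != horizon + 1 or len(q_eval) != horizon + 1:
        raise ValueError("each Q sequence needs len(rewards) + 1 entries")

    targets = [0.0] * horizon
    # Value of the best continuation from the next step, as seen by
    # the selecting estimator and by the evaluating estimator.
    best_select = q_select[horizon]
    best_eval = q_eval[horizon]

    for t in reversed(range(horizon)):
        # Follow the stored trajectory for one more step.
        cont_select = rewards[t] + gamma * best_select
        cont_eval = rewards[t] + gamma * best_eval
        # The memory target is always read from the evaluating estimator.
        targets[t] = cont_eval
        # Implicit planning: at step t, either stop and bootstrap from Q
        # or keep following the trajectory. The choice uses q_select only;
        # the value carried backward comes from q_eval (assumed double rule).
        if q_select[t] > cont_select:
            best_select, best_eval = q_select[t], q_eval[t]
        else:
            best_select, best_eval = cont_select, cont_eval
    return targets
```

With both estimators at zero, the targets reduce to plain discounted returns. When `q_select` is inflated at some step, the inflated number influences only the choice, while the value written into the targets still comes from `q_eval`, which is the intuition behind using two estimators.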
Related papers
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in reinforcement learning environments that are partially observable and long-term.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
- Fine-Grained Gradient Restriction: A Simple Approach for Mitigating Catastrophic Forgetting [41.891312602770746]
Gradient Episodic Memory (GEM) achieves balance by utilizing a subset of past training samples to restrict the update direction of the model parameters.
We show that memory strength is effective mainly because it improves GEM's generalization ability and therefore leads to a more favorable trade-off.
arXiv Detail & Related papers (2024-10-01T17:03:56Z)
- PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly Detection [65.24854366973794]
Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes from graph-structured data in domains such as medicine, social networks, and e-commerce.
We introduce a simple method termed PREprocessing and Matching (PREM for short) to improve the efficiency of GAD.
Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities.
arXiv Detail & Related papers (2023-10-18T02:59:57Z)
- Constant Memory Attention Block [74.38724530521277]
Constant Memory Attention Block (CMAB) is a novel general-purpose attention block that computes its output in constant memory and performs updates in constant computation.
We show our proposed methods achieve results competitive with state-of-the-art while being significantly more memory efficient.
arXiv Detail & Related papers (2023-06-21T22:41:58Z)
- Continuous Episodic Control [7.021281655855703]
This paper introduces Continuous Episodic Control (CEC), a novel non-parametric episodic memory algorithm for sequential decision making in problems with a continuous action space.
Results on several sparse-reward continuous control environments show that our proposed method learns faster than state-of-the-art model-free RL and memory-augmented RL algorithms, while maintaining good long-run performance as well.
arXiv Detail & Related papers (2022-11-28T09:48:42Z)
- Pin the Memory: Learning to Generalize Semantic Segmentation [68.367763672095]
We present a novel memory-guided domain generalization method for semantic segmentation based on a meta-learning framework.
Our method abstracts the conceptual knowledge of semantic classes into categorical memory which is constant beyond the domains.
arXiv Detail & Related papers (2022-04-07T17:34:01Z)
- Sequential memory improves sample and memory efficiency in Episodic Control [0.0]
State-of-the-art deep reinforcement learning algorithms are sample inefficient due to the large number of episodes they require to achieve performance.
ERL algorithms, inspired by the mammalian hippocampus, typically use extended memory systems to bootstrap learning from past events to overcome this sample-inefficiency problem.
Here, we demonstrate that including a bias in the acquired memory content derived from the order of episodic sampling improves both the sample and memory efficiency of an episodic control algorithm.
arXiv Detail & Related papers (2021-12-29T18:42:15Z)
- Solving Continuous Control with Episodic Memory [1.9493449206135294]
Episodic memory lets reinforcement learning algorithms remember and exploit promising experience from the past to improve agent performance.
Our study aims to answer the question: can episodic memory be used to improve an agent's performance in continuous control?
arXiv Detail & Related papers (2021-06-16T14:51:39Z)
- A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces [53.47210316424326]
KeRNS is an algorithm for episodic reinforcement learning in non-stationary Markov Decision Processes.
We prove a regret bound that scales with the covering dimension of the state-action space and the total variation of the MDP with time.
arXiv Detail & Related papers (2020-07-09T21:37:13Z)
- Multi-step Estimation for Gradient-based Meta-learning [3.4376560669160385]
We propose a simple yet straightforward method to reduce the cost by reusing the same gradient in a window of inner steps.
We show that our method significantly reduces training time and memory usage, maintaining competitive accuracies, or even outperforming in some cases.
arXiv Detail & Related papers (2020-06-08T00:37:01Z)
- Continual Deep Learning by Functional Regularisation of Memorable Past [95.97578574330934]
Continually learning new skills is important for intelligent systems, yet standard deep learning methods suffer from catastrophic forgetting of the past.
We propose a new functional-regularisation approach that utilises a few memorable past examples crucial to avoid forgetting.
Our method achieves state-of-the-art performance on standard benchmarks and opens a new direction for life-long learning where regularisation and memory-based methods are naturally combined.
arXiv Detail & Related papers (2020-04-29T10:47:54Z)