Continuous Episodic Control
- URL: http://arxiv.org/abs/2211.15183v3
- Date: Sun, 23 Apr 2023 09:21:14 GMT
- Title: Continuous Episodic Control
- Authors: Zhao Yang, Thomas M. Moerland, Mike Preuss, Aske Plaat
- Abstract summary: This paper introduces Continuous Episodic Control (CEC), a novel non-parametric episodic memory algorithm for sequential decision making in problems with a continuous action space.
Results on several sparse-reward continuous control environments show that our proposed method learns faster than state-of-the-art model-free RL and memory-augmented RL algorithms, while maintaining good long-run performance as well.
- Score: 7.021281655855703
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Non-parametric episodic memory can be used to quickly latch onto
high-rewarded experience in reinforcement learning tasks. In contrast to
parametric deep reinforcement learning approaches in which reward signals need
to be back-propagated slowly, these methods only need to discover the solution
once, and may then repeatedly solve the task. However, episodic control
solutions are stored in discrete tables, and this approach has so far only been
applied to discrete action space problems. Therefore, this paper introduces
Continuous Episodic Control (CEC), a novel non-parametric episodic memory
algorithm for sequential decision making in problems with a continuous action
space. Results on several sparse-reward continuous control environments show
that our proposed method learns faster than state-of-the-art model-free RL and
memory-augmented RL algorithms, while maintaining good long-run performance as
well. In short, CEC can be a fast approach for learning in continuous control
tasks.
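To make the episodic-control idea concrete, below is a minimal sketch of a non-parametric memory acting in a continuous action space: it stores (state, action, return) tuples and, for a new state, reuses the action of the best-returning nearby entry, perturbed with Gaussian exploration noise. This is an illustration of the general mechanism under assumed hyperparameters and a simple k-nearest-neighbour lookup, not the authors' exact CEC algorithm.

```python
import numpy as np

class EpisodicMemory:
    """Non-parametric episodic memory over continuous states and actions (sketch).

    Stores (state, action, return) tuples; at decision time it looks up the
    k nearest stored states and reuses the action of the best-returning one.
    """

    def __init__(self, state_dim, action_dim, capacity=50_000, k=5, noise_std=0.1):
        self.states = np.zeros((capacity, state_dim))
        self.actions = np.zeros((capacity, action_dim))
        self.returns = np.full(capacity, -np.inf)
        self.capacity, self.size, self.ptr = capacity, 0, 0
        self.k, self.noise_std = k, noise_std

    def act(self, state, explore=True):
        """Pick the action of the highest-return neighbour, plus exploration noise."""
        if self.size == 0:
            action = np.random.uniform(-1.0, 1.0, self.actions.shape[1])
        else:
            dists = np.linalg.norm(self.states[:self.size] - state, axis=1)
            neighbours = np.argsort(dists)[: min(self.k, self.size)]
            best = neighbours[np.argmax(self.returns[neighbours])]
            action = self.actions[best].copy()
        if explore:
            action = action + np.random.normal(0.0, self.noise_std, size=action.shape)
        return np.clip(action, -1.0, 1.0)

    def store_episode(self, states, actions, rewards, gamma=0.99):
        """Write discounted returns for every step of a finished episode."""
        g = 0.0
        for s, a, r in zip(reversed(states), reversed(actions), reversed(rewards)):
            g = r + gamma * g
            self.states[self.ptr], self.actions[self.ptr], self.returns[self.ptr] = s, a, g
            self.ptr = (self.ptr + 1) % self.capacity   # ring buffer: overwrite oldest
            self.size = min(self.size + 1, self.capacity)
```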
Related papers
- Continuous Control with Coarse-to-fine Reinforcement Learning [15.585706638252441]
We present a framework that trains RL agents to zoom into a continuous action space in a coarse-to-fine manner.
We introduce a concrete, value-based algorithm within the framework called the Coarse-to-fine Q-Network (CQN).
CQN robustly learns to solve real-world manipulation tasks within a few minutes of online training.
arXiv Detail & Related papers (2024-07-10T16:04:08Z)
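The coarse-to-fine zooming described in the entry above can be illustrated with a small sketch for a one-dimensional action: each level evaluates a handful of candidate actions with a value estimate, keeps the best one, and narrows the search interval around it. The `q_fn` callable and the interval schedule are assumptions for illustration, not CQN's actual architecture.

```python
import numpy as np

def coarse_to_fine_action(q_fn, state, levels=3, bins=5, low=-1.0, high=1.0):
    """Select a 1-D continuous action by repeatedly zooming into the best bin.

    q_fn(state, action) is a placeholder scalar value estimate; CQN itself
    uses a learned critic and handles multi-dimensional actions.
    """
    lo, hi = low, high
    action = 0.5 * (lo + hi)
    for _ in range(levels):
        candidates = np.linspace(lo, hi, bins)           # discretize current interval
        values = [q_fn(state, a) for a in candidates]
        action = candidates[int(np.argmax(values))]      # keep the best candidate
        half_width = (hi - lo) / (2 * bins)              # zoom into a narrower interval
        lo, hi = action - half_width, action + half_width
    return float(action)
```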
- MOSEAC: Streamlined Variable Time Step Reinforcement Learning [14.838483990647697]
We introduce the Multi-Objective Soft Elastic Actor-Critic (MOSEAC) method.
MOSEAC features an adaptive reward scheme based on observed trends in task rewards during training.
We validate the MOSEAC method through simulations in a Newtonian kinematics environment.
arXiv Detail & Related papers (2024-06-03T16:51:57Z)
- Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution [51.83951489847344]
In robotics applications, smooth control signals are commonly preferred to reduce system wear and improve energy efficiency.
In this work, we aim to bridge this performance gap by growing discrete action spaces from coarse to fine control resolution.
Our work indicates that an adaptive control resolution in combination with value decomposition yields simple critic-only algorithms with surprisingly strong performance on continuous control tasks.
arXiv Detail & Related papers (2024-04-05T17:58:37Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
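As a rough illustration of the action-quantization idea in the entry above, the snippet below clusters the continuous actions of an offline dataset into a small codebook so that discrete-action offline RL methods can be applied on top. Plain k-means is used here as a stand-in for the adaptive quantization scheme the paper actually proposes; the function name and bin count are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

def quantize_dataset_actions(actions, n_bins=32, seed=0):
    """Map continuous dataset actions to indices of a learned codebook.

    actions: array of shape (num_transitions, action_dim).
    Returns the codebook (index -> continuous action) and one discrete
    action index per transition; at evaluation time a chosen index is
    decoded back through the codebook.
    """
    km = KMeans(n_clusters=n_bins, n_init=10, random_state=seed).fit(np.asarray(actions))
    return km.cluster_centers_, km.labels_
```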
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce BAdam, a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Solving Continuous Control via Q-learning [54.05120662838286]
We show that a simple modification of deep Q-learning largely alleviates issues with actor-critic methods.
By combining bang-bang action discretization with value decomposition, framing single-agent control as cooperative multi-agent reinforcement learning (MARL), this simple critic-only approach matches the performance of state-of-the-art continuous actor-critic methods.
arXiv Detail & Related papers (2022-10-22T22:55:50Z)
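To show what the bang-bang plus value-decomposition recipe from the entry above looks like mechanically, here is a minimal sketch: each action dimension has its own pair of Q-values for the extreme actions {-1, +1} and takes its argmax independently. The `per_dim_q` callable is an assumed interface for illustration, not the paper's network.

```python
import numpy as np

def decoupled_bang_bang_action(per_dim_q, state):
    """Greedy action under a per-dimension (decomposed) bang-bang Q-function.

    per_dim_q(state) is assumed to return an array of shape (action_dim, 2):
    one Q-value per action dimension for each of the two extreme actions
    {-1, +1}. Each dimension takes its own argmax, which is the cooperative
    multi-agent (value decomposition) view of single-agent control.
    """
    q_values = np.asarray(per_dim_q(state))      # (action_dim, 2)
    choices = np.argmax(q_values, axis=1)        # 0 -> -1.0, 1 -> +1.0
    return np.where(choices == 1, 1.0, -1.0)
```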
- Continual Learning with Guarantees via Weight Interval Constraints [18.791232422083265]
We introduce a new training paradigm that enforces interval constraints on neural network parameter space to control forgetting.
We show how to put bounds on forgetting by reformulating continual learning of a model as a continual contraction of its parameter space.
arXiv Detail & Related papers (2022-06-16T08:28:37Z)
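The interval-constraint idea from the entry above can be pictured as a projection step after each optimizer update: every parameter is clamped back into a per-parameter interval retained from earlier tasks. The sketch below assumes PyTorch and bound tensors keyed by parameter name; it illustrates only the constraint, not the paper's full training procedure.

```python
import torch

@torch.no_grad()
def clamp_to_intervals(model, lower, upper):
    """Project parameters back into per-parameter intervals after an update.

    lower / upper map parameter names to tensors with the same shape as the
    parameter; keeping weights inside these intervals is what bounds
    forgetting on previously learned tasks.
    """
    for name, param in model.named_parameters():
        if name in lower:
            param.clamp_(min=lower[name], max=upper[name])
```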
- Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC).
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
arXiv Detail & Related papers (2022-04-14T17:46:26Z)
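A compact way to picture the approach in the entry above is a short-horizon rollout through a differentiable simulator, with a critic bootstrapping the tail so gradients of the return flow directly into the policy. Everything here (the `diff_env.step` interface, `policy`, and `critic`) is an assumed interface for illustration, not the released SHAC implementation.

```python
import torch

def short_horizon_policy_loss(policy, critic, diff_env, state, horizon=16, gamma=0.99):
    """Negative short-horizon return, differentiable end-to-end (sketch).

    diff_env.step(state, action) is assumed to return (next_state, reward)
    as torch tensors produced by a differentiable simulator, so the policy
    gradient is obtained by ordinary backpropagation through the rollout,
    with a smooth critic value bootstrapping beyond the horizon.
    """
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        action = policy(state)
        state, reward = diff_env.step(state, action)
        total = total + discount * reward
        discount *= gamma
    total = total + discount * critic(state)   # bootstrap the truncated tail
    return -total.mean()                       # minimize negative return
```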
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- Learning Memory-Dependent Continuous Control from Demonstrations [13.063093054280948]
This paper builds on the idea of replaying demonstrations for memory-dependent continuous control.
Experiments on several memory-crucial continuous control tasks reveal significantly reduced interaction with the environment.
The algorithm also shows better sample efficiency and learning capabilities than a baseline reinforcement learning algorithm for memory-based control from demonstrations.
arXiv Detail & Related papers (2021-02-18T08:13:42Z)
- Episodic Self-Imitation Learning with Hindsight [7.743320290728377]
Episodic self-imitation learning is a novel self-imitation algorithm with a trajectory selection module and an adaptive loss function.
A selection module is introduced to filter uninformative samples from each episode during the update.
Episodic self-imitation learning has the potential to be applied to real-world problems that have continuous action spaces.
arXiv Detail & Related papers (2020-11-26T20:36:42Z)
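As a concrete example of the sample-selection idea mentioned in the entry above, a common self-imitation filter keeps only transitions whose achieved return exceeds the current value estimate, i.e. those with positive advantage. This is a generic illustration of such a filter, not the paper's specific selection module.

```python
import numpy as np

def select_self_imitation_samples(states, actions, returns, value_fn):
    """Keep only transitions with positive advantage (return beats the value estimate)."""
    advantages = np.asarray(returns, dtype=float) - np.array([value_fn(s) for s in states])
    return [
        (s, a, adv)
        for s, a, adv in zip(states, actions, advantages)
        if adv > 0.0   # uninformative (non-improving) samples are dropped
    ]
```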
This list is automatically generated from the titles and abstracts of the papers on this site.