Dream and Search to Control: Latent Space Planning for Continuous Control
- URL: http://arxiv.org/abs/2010.09832v1
- Date: Mon, 19 Oct 2020 20:10:51 GMT
- Title: Dream and Search to Control: Latent Space Planning for Continuous Control
- Authors: Anurag Koul, Varun V. Kumar, Alan Fern, Somdeb Majumdar
- Abstract summary: We show that the approach realizes the same types of bootstrapping benefits previously demonstrated for discrete action spaces.
In particular, the approach achieves improved sample efficiency and performance on a majority of challenging continuous-control benchmarks.
- Score: 24.991127785736364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning and planning with latent space dynamics has been shown to be useful
for sample efficiency in model-based reinforcement learning (MBRL) for discrete
and continuous control tasks. In particular, recent work, for discrete action
spaces, demonstrated the effectiveness of latent-space planning via Monte-Carlo
Tree Search (MCTS) for bootstrapping MBRL during learning and at test time.
However, the potential gains from latent-space tree search have not yet been
demonstrated for environments with continuous action spaces. In this work, we
propose and explore an MBRL approach for continuous action spaces based on
tree-based planning over learned latent dynamics. We show that this approach
realizes the same types of bootstrapping benefits previously demonstrated for
discrete action spaces. In particular, the approach achieves improved sample
efficiency and performance on a majority of challenging continuous-control
benchmarks compared to the state-of-the-art.
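To make the approach concrete, below is a minimal sketch of Monte-Carlo tree search over a latent dynamics model with continuous actions: candidate actions are sampled under a progressive-widening rule, and leaf nodes are bootstrapped with a learned value estimate. The stand-in heads (latent_step, reward_fn, value_fn), the widening rule, and all constants are illustrative assumptions, not the paper's exact components.
```python
# Hedged sketch: MCTS over a learned latent dynamics model with continuous
# actions via progressive widening. latent_step / reward_fn / value_fn are
# toy stand-ins for the learned model heads, not the paper's networks.
import math
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, ACTION_DIM = 4, 2
GAMMA, C_UCB, PW_ALPHA = 0.99, 1.0, 0.5

def latent_step(z, a):                       # stand-in learned dynamics
    return np.tanh(z + 0.1 * np.pad(a, (0, LATENT_DIM - ACTION_DIM)))

def reward_fn(z, a):                         # stand-in learned reward head
    return -float(z @ z) - 0.01 * float(a @ a)

def value_fn(z):                             # stand-in learned value head
    return -float(z @ z) / (1.0 - GAMMA)

class Node:
    def __init__(self, z):
        self.z = z
        self.children = []                   # entries: [action, child, visits, q]

def plan(root_z, simulations=200, depth=10):
    root = Node(root_z)
    for _ in range(simulations):
        node, path = root, []
        for _ in range(depth):
            n = sum(c[2] for c in node.children) + 1
            # progressive widening: grow the action set while the node is thin
            if len(node.children) < max(1, int(n ** PW_ALPHA)):
                a = rng.uniform(-1.0, 1.0, ACTION_DIM)
                node.children.append([a, Node(latent_step(node.z, a)), 0, 0.0])
            # UCB selection over the sampled continuous actions
            edge = max(node.children,
                       key=lambda c: c[3] + C_UCB * math.sqrt(math.log(n) / (c[2] + 1)))
            path.append((node, edge))
            node = edge[1]
        ret = value_fn(node.z)               # bootstrap the leaf with the value head
        for parent, edge in reversed(path):  # back up discounted returns
            ret = reward_fn(parent.z, edge[0]) + GAMMA * ret
            edge[2] += 1
            edge[3] += (ret - edge[3]) / edge[2]
    return max(root.children, key=lambda c: c[2])[0]   # most-visited root action

print(plan(rng.normal(size=LATENT_DIM)))
```
At decision time, such a search would be rerun from each new latent state and the most-visited root action executed, analogous to how MuZero-style agents use MCTS at test time.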
Related papers
- Mamba-CL: Optimizing Selective State Space Model in Null Space for Continual Learning [54.19222454702032]
Continual Learning aims to equip AI models with the ability to learn a sequence of tasks over time, without forgetting previously learned knowledge.
State Space Models (SSMs) have achieved notable success in computer vision.
We introduce Mamba-CL, a framework that continuously fine-tunes the core SSMs of the large-scale Mamba foundation model.
arXiv Detail & Related papers (2024-11-23T06:36:16Z)
- Efficient Planning with Latent Diffusion [18.678459478837976]
Temporal abstraction and efficient planning pose significant challenges in offline reinforcement learning.
Latent action spaces offer a more flexible paradigm, capturing only possible actions within the behavior policy support.
This paper presents a unified framework for continuous latent action space representation learning and planning by leveraging latent, score-based diffusion models.
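As a rough, hedged illustration of score-based sampling over latent plans, the toy loop below anneals Langevin dynamics from pure noise toward high-likelihood action sequences. The score function here is a stand-in pulling plans toward a fixed "behavior" mean; the paper learns a noise-conditional score from offline data and decodes latent actions with a learned decoder.
```python
# Hedged sketch of score-based sampling over latent plans: start from noise
# and follow a (stand-in) score toward high-likelihood action sequences.
import numpy as np

rng = np.random.default_rng(0)
HORIZON, LATENT_ACT_DIM = 16, 4
behavior_mean = 0.3 * np.ones((HORIZON, LATENT_ACT_DIM))   # toy stand-in

def score(z, sigma):                 # stand-in for a learned noise-conditional score
    return (behavior_mean - z) / (sigma ** 2 + 1e-3)

z = rng.normal(size=(HORIZON, LATENT_ACT_DIM))             # start from pure noise
for sigma in np.geomspace(1.0, 0.01, 30):                  # anneal the noise level
    eps = 0.05 * sigma ** 2
    for _ in range(10):                                    # Langevin steps per level
        z = z + eps * score(z, sigma) + np.sqrt(2 * eps) * rng.normal(size=z.shape)
print(z.mean(0))   # denoised latent plan, to be decoded into actions
```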
arXiv Detail & Related papers (2023-09-30T08:50:49Z)
- Sample-efficient Real-time Planning with Curiosity Cross-Entropy Method and Contrastive Learning [21.995159117991278]
We propose Curiosity CEM, an improved version of the Cross-Entropy Method (CEM) algorithm for encouraging exploration via curiosity.
Our proposed method maximizes the sum of state-action Q values over the planning horizon, in which these Q values estimate the future extrinsic and intrinsic reward.
Experiments on image-based continuous control tasks from the DeepMind Control suite show that CCEM is by a large margin more sample-efficient than previous MBRL algorithms.
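A minimal sketch of the CEM planning loop with an added intrinsic bonus follows; the toy dynamics, extrinsic reward, and novelty bonus are stand-ins, and the actual method scores plans with learned state-action Q values rather than summed rewards.
```python
# Hedged sketch of CEM planning where each candidate plan is scored by
# extrinsic reward plus an intrinsic (curiosity-style) bonus. All models
# below are toy stand-ins, not the paper's learned components.
import numpy as np

rng = np.random.default_rng(0)
HORIZON, ACT_DIM, POP, ELITES, ITERS = 12, 2, 64, 8, 5

def step(s, a):                      # toy dynamics stand-in
    return s + 0.1 * a

def extrinsic(s, a):                 # toy extrinsic reward stand-in
    return -float(s @ s) - 0.01 * float(a @ a)

def intrinsic(s):                    # toy novelty bonus stand-in
    return 0.1 * float(np.linalg.norm(s))

def score(s0, plan):                 # extrinsic + intrinsic over the horizon
    s, total = s0, 0.0
    for a in plan:
        total += extrinsic(s, a) + intrinsic(s)
        s = step(s, a)
    return total

def cem_plan(s0):
    mu = np.zeros((HORIZON, ACT_DIM))
    sigma = np.ones((HORIZON, ACT_DIM))
    for _ in range(ITERS):
        pop = rng.normal(mu, sigma, (POP, HORIZON, ACT_DIM)).clip(-1.0, 1.0)
        scores = np.array([score(s0, p) for p in pop])
        elites = pop[np.argsort(scores)[-ELITES:]]     # refit to the best plans
        mu, sigma = elites.mean(0), elites.std(0) + 1e-3
    return mu[0]                     # execute the first action, then replan

print(cem_plan(rng.normal(size=2)))
```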
arXiv Detail & Related papers (2023-03-07T10:48:20Z)
- Adaptive Discretization using Voronoi Trees for Continuous POMDPs [7.713622698801596]
We propose a new sampling-based online POMDP solver, called Adaptive Discretization using Voronoi Trees (ADVT).
It uses Monte Carlo Tree Search in combination with an adaptive discretization of the action space as well as optimistic optimization to efficiently sample high-dimensional continuous action spaces.
ADVT scales substantially better to high-dimensional continuous action spaces, compared to state-of-the-art methods.
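To illustrate the adaptive-discretization idea in isolation, the toy below reduces it to a one-step bandit: each cell of an implicit Voronoi partition is represented by one sampled action, cells are chosen optimistically with a diameter-aware bonus, and promising cells are split by sampling new representatives. The payoff, bonus, and split rule are simplifications; ADVT nests a hierarchical Voronoi partition inside MCTS for POMDPs.
```python
# Toy illustration of adaptive Voronoi-style discretization of a continuous
# action space, reduced to a one-step bandit. All rules here are
# simplifications of ADVT, which embeds this idea in tree search.
import math
import numpy as np

rng = np.random.default_rng(0)

def payoff(a):                               # unknown noisy reward over [-1, 1]^2
    return -float(np.sum((a - 0.3) ** 2)) + 0.05 * rng.normal()

class Cell:                                  # one Voronoi cell, one representative action
    def __init__(self, rep, diam):
        self.rep, self.diam, self.n, self.q = rep, diam, 0, 0.0

cells = [Cell(np.zeros(2), 2.0)]
for t in range(1, 501):
    # optimistic value: empirical mean + visit bonus + cell-diameter bonus
    cell = max(cells, key=lambda c: c.q
               + math.sqrt(2.0 * math.log(t) / (c.n + 1)) + 0.5 * c.diam)
    r = payoff(cell.rep)
    cell.n += 1
    cell.q += (r - cell.q) / cell.n
    if cell.n >= 5 and cell.diam > 0.05:     # refine: split by sampling a new rep
        new_rep = np.clip(cell.rep + rng.uniform(-0.5, 0.5, 2) * cell.diam, -1, 1)
        cells.append(Cell(new_rep, cell.diam / 2.0))
        cell.diam /= 2.0

best = max(cells, key=lambda c: c.q)
print(best.rep, best.q)
```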
arXiv Detail & Related papers (2023-02-21T04:47:34Z)
- Leveraging Demonstrations with Latent Space Priors [90.56502305574665]
We propose to leverage demonstration datasets by combining skill learning and sequence modeling.
We show how to acquire such priors from state-only motion capture demonstrations and explore several methods for integrating them into policy learning.
Our experimental results confirm that latent space priors provide significant gains in learning speed and final performance in a set of challenging sparse-reward environments.
arXiv Detail & Related papers (2022-10-26T13:08:46Z)
- Continuous Monte Carlo Graph Search [61.11769232283621]
Continuous Monte Carlo Graph Search (CMCGS) is an extension of Monte Carlo Tree Search (MCTS) to online planning in environments with continuous state and action spaces.
CMCGS takes advantage of the insight that, during planning, sharing the same action policy between several states can yield high performance.
It can be scaled up through parallelization, and it outperforms the Cross-Entropy Method (CEM) in continuous control with learned dynamics models.
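A rough sketch of that insight, under toy dynamics and a fixed state clustering: each search depth holds a few state clusters, each with a Gaussian action distribution refit from the best rollouts, so many states share one policy. The real method builds and adapts a search graph; everything below is a simplification.
```python
# Hedged sketch of sharing action distributions across states: one Gaussian
# policy per (depth, cluster), refit from elite rollouts. Toy dynamics,
# reward, and clustering stand in for CMCGS's learned/adaptive components.
import numpy as np

rng = np.random.default_rng(0)
DEPTH, CLUSTERS, ROLLOUTS, ELITE, ITERS = 8, 3, 64, 8, 10
S_DIM = A_DIM = 2

def step(s, a):                      # toy dynamics stand-in
    return s + 0.1 * a

def reward(s, a):                    # toy reward stand-in
    return -float(s @ s)

mu = np.zeros((DEPTH, CLUSTERS, A_DIM))               # one Gaussian policy per
sd = np.ones((DEPTH, CLUSTERS, A_DIM))                # (depth, cluster) pair
centers = rng.normal(size=(DEPTH, CLUSTERS, S_DIM))   # fixed toy clustering

def cluster_of(d, s):                # nearest cluster center at this depth
    return int(np.argmin(np.sum((centers[d] - s) ** 2, axis=1)))

s0 = rng.normal(size=S_DIM)
for _ in range(ITERS):
    paths, returns = [], []
    for _ in range(ROLLOUTS):
        s, path, ret = s0, [], 0.0
        for d in range(DEPTH):
            k = cluster_of(d, s)
            a = np.clip(rng.normal(mu[d, k], sd[d, k]), -1.0, 1.0)
            ret += reward(s, a)
            path.append((d, k, a))
            s = step(s, a)
        paths.append(path)
        returns.append(ret)
    elite = np.argsort(returns)[-ELITE:]
    for d in range(DEPTH):           # refit every shared policy from elite rollouts
        for k in range(CLUSTERS):
            acts = [a for i in elite for dd, kk, a in paths[i] if dd == d and kk == k]
            if len(acts) >= 2:
                acts = np.stack(acts)
                mu[d, k], sd[d, k] = acts.mean(0), acts.std(0) + 1e-2

print(np.clip(mu[0, cluster_of(0, s0)], -1.0, 1.0))   # first recommended action
```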
arXiv Detail & Related papers (2022-10-04T07:34:06Z)
- Adaptive Discretization using Voronoi Trees for Continuous-Action POMDPs [7.713622698801596]
We propose a new sampling-based online POMDP solver, called Adaptive Discretization using Voronoi Trees (ADVT).
ADVT uses Monte Carlo Tree Search in combination with an adaptive discretization of the action space as well as optimistic optimization.
Experiments on simulations of four types of benchmark problems indicate that ADVT outperforms state-of-the-art methods and scales substantially better to high-dimensional continuous action spaces.
arXiv Detail & Related papers (2022-09-13T05:04:49Z)
- Transfer RL across Observation Feature Spaces via Model-Based Regularization [9.660642248872973]
In many reinforcement learning (RL) applications, the observation space is specified by human developers and restricted by physical realizations.
We propose a novel algorithm which extracts the latent-space dynamics in the source task, and transfers the dynamics model to the target task.
Our algorithm works for drastic changes of observation space without any inter-task mapping or any prior knowledge of the target task.
arXiv Detail & Related papers (2022-01-01T22:41:19Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the agent's expected performance by selecting promising trajectories that solve prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Model-Based Reinforcement Learning via Latent-Space Collocation [110.04005442935828]
We argue that it is easier to solve long-horizon tasks by planning sequences of states rather than just actions.
We adapt the idea of collocation, which has shown good results on long-horizon tasks in optimal control literature, to the image-based setting by utilizing learned latent state space models.
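A toy version of the collocation idea, assuming linear dynamics s' = s + 0.1·a in place of a learned latent model: optimize the state sequence directly by gradient descent on goal and effort costs, then read actions off consecutive states. With a learned model, an inverse-dynamics network would recover the actions instead of the closed form below.
```python
# Toy collocation sketch: optimize a sequence of states directly, then infer
# the actions that connect them. Linear toy dynamics stand in for the
# paper's learned latent state space models.
import numpy as np

DT, HORIZON, DIM = 0.1, 20, 2
LAM, LR, STEPS = 0.01, 0.05, 500
goal = np.array([1.0, -0.5])

s = np.zeros((HORIZON + 1, DIM))              # decision variables: the states
for _ in range(STEPS):
    a = (s[1:] - s[:-1]) / DT                 # actions implied by the states
    grad = np.zeros_like(s)
    grad[-1] += 2.0 * (s[-1] - goal)          # terminal goal cost
    grad[1:] += 2.0 * LAM * a / DT            # effort cost, w.r.t. later state
    grad[:-1] -= 2.0 * LAM * a / DT           # effort cost, w.r.t. earlier state
    grad[0] = 0.0                             # the initial state is fixed
    s -= LR * grad

print("final state:", s[-1], "first action:", (s[1] - s[0]) / DT)
```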
arXiv Detail & Related papers (2021-06-24T17:59:18Z)
- Predictive Coding for Locally-Linear Control [92.35650774524399]
High-dimensional observations and unknown dynamics are major challenges when applying optimal control to many real-world decision making tasks.
The Learning Controllable Embedding (LCE) framework addresses these challenges by embedding the observations into a lower dimensional latent space.
We show theoretically that explicit next-observation prediction can be replaced with predictive coding.
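As a hedged sketch of what replacing explicit next-observation prediction with predictive coding can look like, the snippet below computes a generic InfoNCE-style contrastive loss between predicted and encoded next latents; the paper's exact objective differs.
```python
# Hedged sketch of a contrastive (InfoNCE-style) predictive objective: score
# each predicted next latent against the batch, other samples as negatives.
import numpy as np

rng = np.random.default_rng(0)
B, D, TEMP = 32, 8, 0.1

z_pred = rng.normal(size=(B, D))                   # stand-in: predicted next latents
z_next = z_pred + 0.1 * rng.normal(size=(B, D))    # stand-in: encoded next observations

logits = (z_pred @ z_next.T) / TEMP                # prediction-vs-target similarities
logits -= logits.max(axis=1, keepdims=True)        # numerical stability
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -np.mean(log_probs[np.arange(B), np.arange(B)])   # matched pairs on the diagonal
print(loss)
```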
arXiv Detail & Related papers (2020-03-02T18:20:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.