Visual Learning-based Planning for Continuous High-Dimensional POMDPs
- URL: http://arxiv.org/abs/2112.09456v1
- Date: Fri, 17 Dec 2021 11:53:31 GMT
- Title: Visual Learning-based Planning for Continuous High-Dimensional POMDPs
- Authors: Sampada Deglurkar, Michael H. Lim, Johnathan Tucker, Zachary N.
Sunberg, Aleksandra Faust, Claire J. Tomlin
- Abstract summary: Visual Tree Search (VTS) is a learning and planning procedure that combines generative models learned offline with online model-based POMDP planning.
VTS bridges offline model training and online planning by utilizing a set of deep generative observation models to predict and evaluate the likelihood of image observations in a Monte Carlo tree search planner.
We show that VTS is robust to different observation noises and, since it utilizes online, model-based planning, can adapt to different reward structures without the need to re-train.
- Score: 81.16442127503517
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Partially Observable Markov Decision Process (POMDP) is a powerful
framework for capturing decision-making problems that involve state and
transition uncertainty. However, most current POMDP planners cannot effectively
handle the very high-dimensional observations that they often encounter in the
real world (e.g., image observations in robotic domains). In this work, we propose
Visual Tree Search (VTS), a learning and planning procedure that combines
generative models learned offline with online model-based POMDP planning. VTS
bridges offline model training and online planning by utilizing a set of deep
generative observation models to predict and evaluate the likelihood of image
observations in a Monte Carlo tree search planner. We show that VTS is robust
to different observation noises and, since it utilizes online, model-based
planning, can adapt to different reward structures without the need to
re-train. This new approach outperforms a baseline state-of-the-art on-policy
planning algorithm while using significantly less offline training time.
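To make the planner's core step concrete, here is a minimal sketch, assuming a particle-based belief representation: a learned generative observation model scores how likely the received image is under each propagated particle, and those likelihoods reweight the belief inside the search. `transition` and `obs_likelihood` are hypothetical stand-ins for the learned models described above, not the authors' implementation.

```python
# Hypothetical sketch of the belief update at the heart of VTS-style planning:
# particles are propagated through a (learned) transition model, then weighted
# by the likelihood a learned generative model assigns to the observed image.
import numpy as np

def reweight_belief(particles, weights, action, image, transition, obs_likelihood):
    """One belief-update step: propagate particles, then weight by p(image | state)."""
    next_particles = np.array([transition(s, action) for s in particles])
    likelihoods = np.array([obs_likelihood(image, s) for s in next_particles])
    new_weights = weights * likelihoods
    total = new_weights.sum()
    if total == 0.0:  # degenerate case: no particle explains the image
        return next_particles, np.full(len(weights), 1.0 / len(weights))
    return next_particles, new_weights / total
```

In a Monte Carlo tree search, an update like this would run at every observation node, which is what lets the planner evaluate raw image observations without a hand-specified observation model.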
Related papers
- A New View on Planning in Online Reinforcement Learning [19.35031543927374]
This paper investigates a new approach to model-based reinforcement learning using background planning.
We show that our GSP algorithm can propagate value from an abstract space in a manner that helps a variety of base learners learn significantly faster in different domains.
arXiv Detail & Related papers (2024-06-03T17:45:19Z)
- Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach [57.788675205519986]
We learn high-quality traces from POMDP executions generated by any solver.
We exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications.
We show that the learned specifications, expressed in Answer Set Programming (ASP), yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics, within lower computational time.
arXiv Detail & Related papers (2024-02-29T15:36:01Z)
- Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty [56.30846158280031]
Task planning for embodied AI has been one of the most challenging problems.
We propose a task-agnostic method named 'planning as in-painting'.
The proposed framework achieves promising performance in various embodied AI tasks.
arXiv Detail & Related papers (2023-12-02T10:07:17Z)
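The entry above casts planning as completing a partially known trajectory, the way an in-painting model completes a partially known image. Below is a conceptual sketch under that reading; `denoiser` is a hypothetical learned model and the loop is a deliberate simplification, not the paper's architecture.

```python
# Toy 'planning as in-painting' loop: noise-initialize a state trajectory,
# then repeatedly denoise it while clamping the known start and goal states,
# analogous to clamping observed pixels during image in-painting.
import numpy as np

def plan_by_inpainting(start, goal, denoiser, horizon=16, steps=50):
    plan = np.random.randn(horizon, start.shape[0])  # noise-initialized plan
    for t in reversed(range(steps)):
        plan[0], plan[-1] = start, goal              # clamp the "observed" entries
        plan = denoiser(plan, t)                     # one learned denoising step
    plan[0], plan[-1] = start, goal
    return plan
```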
- PDSketch: Integrated Planning Domain Programming and Learning [86.07442931141637]
We present a new domain definition language, named PDSketch.
It allows users to flexibly define high-level structures in the transition models.
Details of the transition model will be filled in by trainable neural networks.
arXiv Detail & Related papers (2023-03-09T18:54:12Z)
- Online learning techniques for prediction of temporal tabular datasets with regime changes [0.0]
We propose a modular machine learning pipeline for ranking predictions on temporal panel datasets.
The modularity of the pipeline allows the use of different models, including Gradient Boosting Decision Trees (GBDTs) and Neural Networks.
Online learning techniques, which require no retraining of models, can be used post-prediction to enhance the results.
arXiv Detail & Related papers (2022-12-30T17:19:00Z)
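One common way to enhance results post-prediction without retraining, as the entry above describes, is to keep the trained models frozen and adapt only their mixture weights online. The sketch below uses a Hedge-style exponentially weighted update; the class and its parameters are illustrative assumptions, not the paper's pipeline.

```python
# Hedge-style online reweighting of frozen models: weight shifts toward
# models with low recent error, with no retraining of the models themselves.
import numpy as np

class HedgeEnsemble:
    def __init__(self, n_models, eta=0.1):
        self.weights = np.full(n_models, 1.0 / n_models)
        self.eta = eta  # learning rate of the exponential update

    def predict(self, model_preds):
        """Weighted combination of the frozen models' predictions."""
        return float(np.dot(self.weights, model_preds))

    def update(self, model_preds, target):
        """Reweight once the true target arrives, using squared error as loss."""
        losses = (np.asarray(model_preds) - target) ** 2
        self.weights *= np.exp(-self.eta * losses)
        self.weights /= self.weights.sum()
```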
- Evaluating model-based planning and planner amortization for continuous control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z)
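For a concrete picture of the hybrid the entry above evaluates, here is a minimal MPC loop over a learned model, using random shooting as the inner optimizer for simplicity; `dynamics` and `reward` are assumed learned callables, and the sampling ranges are illustrative.

```python
# Random-shooting MPC with a learned model: sample action sequences, roll
# them out through the model, and execute the first action of the best one.
import numpy as np

def mpc_action(state, dynamics, reward, action_dim, horizon=10, n_samples=256):
    seqs = np.random.uniform(-1.0, 1.0, size=(n_samples, horizon, action_dim))
    returns = np.zeros(n_samples)
    for i, seq in enumerate(seqs):
        s = state
        for a in seq:
            returns[i] += reward(s, a)
            s = dynamics(s, a)
    return seqs[np.argmax(returns), 0]  # first action of the best sequence
```

Distilling such a planner into a policy, as the paper reports, then amounts to training a network to imitate `mpc_action`, so no search is needed at run time.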
- Predictive Control Using Learned State Space Models via Rolling Horizon Evolution [2.1016374925364616]
In this paper, we explore this theme, combining evolutionary algorithmic planning techniques with models learned via deep learning and variational inference.
We demonstrate the approach with an agent that reliably performs online planning in a set of visual navigation tasks.
arXiv Detail & Related papers (2021-06-25T23:23:42Z)
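Rolling horizon evolution, as used in the entry above, evolves an action sequence against the learned model, executes its first action, then shifts the sequence forward and repeats. The sketch below is a minimal single-step, mutation-only version; all names and hyperparameters are illustrative, not the paper's implementation.

```python
# One rolling-horizon-evolution planning step: mutate the incumbent plan,
# keep the best mutant under the learned model, execute its first action,
# and reuse the shifted remainder as the next incumbent plan.
import numpy as np

def rhe_step(state, plan, dynamics, reward, pop_size=32, sigma=0.3):
    def rollout_return(seq):
        s, total = state, 0.0
        for a in seq:
            total += reward(s, a)
            s = dynamics(s, a)
        return total

    candidates = plan + sigma * np.random.randn(pop_size, *plan.shape)
    candidates = np.clip(candidates, -1.0, 1.0)  # keep actions in bounds
    best = max(candidates, key=rollout_return)
    shifted = np.roll(best, -1, axis=0)          # rolling-horizon shift
    shifted[-1] = 0.0                            # fresh slot at the tail
    return best[0], shifted
```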
- Hallucinative Topological Memory for Zero-Shot Visual Planning [86.20780756832502]
In visual planning (VP), an agent learns to plan goal-directed behavior from observations of a dynamical system obtained offline.
Most previous works on VP approached the problem by planning in a learned latent space, resulting in low-quality visual plans.
Here, we propose a simple VP method that plans directly in image space and displays competitive performance.
arXiv Detail & Related papers (2020-02-27T18:54:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.