Learning Visual Planning Models from Partially Observed Images
- URL: http://arxiv.org/abs/2211.15666v1
- Date: Fri, 25 Nov 2022 07:00:56 GMT
- Title: Learning Visual Planning Models from Partially Observed Images
- Authors: Kebing Jin, Zhanhao Xiao, Hankui Hankz Zhuo, Hai Wan, Jiaran Cai
- Abstract summary: We provide a novel framework, Recplan, for learning a transition model from partially observed raw image traces.
We also propose a neural-network-based approach to learn a heuristic model that estimates the distance toward a given goal observation.
Our approach is more effective than a state-of-the-art approach to learning visual planning models in environments with incomplete observations.
- Score: 17.25694427734666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been increasing attention on planning model learning in classical
planning. Most existing approaches, however, focus on learning planning models
from structured data in symbolic representations. It is often difficult to
obtain such structured data in real-world scenarios. Although a number of
approaches have been developed for learning planning models from fully observed
unstructured data (e.g., images), in many scenarios raw observations are often
incomplete. In this paper, we provide a novel framework, Recplan, for
learning a transition model from partially observed raw image traces. More
specifically, by considering the preceding and subsequent images in a trace, we
learn the latent state representations of raw observations and then build a
transition model based on such representations. Additionally, we propose a
neural-network-based approach to learn a heuristic model that estimates the
distance toward a given goal observation. Based on the learned transition model
and heuristic model, we implement a classical planner for images. We exhibit
empirically that our approach is more effective than a state-of-the-art
approach to learning visual planning models in environments with incomplete
observations.
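The abstract describes a pipeline: encode raw observations into latent states, learn a transition model over those states, learn a heuristic estimating distance to a goal observation, and run a classical planner. The sketch below illustrates only the final planning step on a toy latent space; all names here are illustrative assumptions, not the authors' actual API, and the learned encoder, transition model, and heuristic are stood in by hand-coded functions.

```python
# Hypothetical sketch of heuristic-guided search in a learned latent space,
# in the spirit of the Recplan pipeline described above. The transition and
# heuristic functions stand in for learned neural models.
from typing import Callable, List


def greedy_plan(start, goal,
                transition: Callable,  # latent state x action -> latent state
                heuristic: Callable,   # latent state x goal -> est. distance
                actions: List,
                max_steps: int = 50):
    """Greedy best-first planning using the learned heuristic as a guide."""
    state, plan = start, []
    for _ in range(max_steps):
        if state == goal:
            return plan
        # pick the action whose successor the heuristic scores closest to goal
        best = min(actions, key=lambda a: heuristic(transition(state, a), goal))
        plan.append(best)
        state = transition(state, best)
    return plan


# toy 1-D latent space: states are integers, actions move by -1 or +1
trans = lambda s, a: s + a
heur = lambda s, g: abs(g - s)
print(greedy_plan(0, 3, trans, heur, actions=[-1, 1]))  # -> [1, 1, 1]
```

A real instantiation would replace `trans` and `heur` with the learned transition and heuristic networks, and a stronger search (e.g., A*-style) could reuse the same interfaces.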
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- Sequential Modeling Enables Scalable Learning for Large Vision Models [120.91839619284431]
We introduce a novel sequential modeling approach which enables learning a Large Vision Model (LVM) without making use of any linguistic data.
We define a common format, "visual sentences", in which we can represent raw images and videos as well as annotated data sources.
arXiv Detail & Related papers (2023-12-01T18:59:57Z)
- Compositional Foundation Models for Hierarchical Planning [52.18904315515153]
We propose a foundation model that leverages expert foundation models, trained individually on language, vision, and action data, to solve long-horizon tasks.
We use a large language model to construct symbolic plans that are grounded in the environment through a large video diffusion model.
Generated video plans are then grounded to visual-motor control, through an inverse dynamics model that infers actions from generated videos.
arXiv Detail & Related papers (2023-09-15T17:44:05Z)
- Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories.
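The idea of planning by iteratively denoising trajectories can be illustrated with a toy refinement loop: start from a noisy trajectory with fixed start and goal states and repeatedly smooth it. This is purely illustrative; the actual approach uses a learned diffusion probabilistic model, not the hand-coded neighbor-averaging step below.

```python
# Toy illustration of planning-as-iterative-refinement, in the spirit of
# diffusion-based planners: begin with noisy waypoints and repeatedly
# "denoise" them toward a smooth path between fixed endpoints.
import random


def denoise_trajectory(start, goal, n_points=8, n_iters=200, seed=0):
    rng = random.Random(seed)
    lo, hi = min(start, goal), max(start, goal)
    # initial "noise": random interior waypoints between start and goal
    traj = [start] + [rng.uniform(lo, hi) for _ in range(n_points - 2)] + [goal]
    for _ in range(n_iters):
        # one refinement step: pull each interior waypoint toward its neighbors
        traj = ([traj[0]]
                + [(traj[i - 1] + traj[i + 1]) / 2
                   for i in range(1, n_points - 1)]
                + [traj[-1]])
    return traj


path = denoise_trajectory(0.0, 1.0)
print([round(x, 2) for x in path])  # converges to evenly spaced waypoints
```

With the endpoints pinned, repeated averaging converges to a straight-line interpolation; a learned denoiser would instead pull trajectories toward the data distribution of feasible, high-reward behavior.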
arXiv Detail & Related papers (2022-05-20T07:02:03Z)
- Visual Learning-based Planning for Continuous High-Dimensional POMDPs [81.16442127503517]
Visual Tree Search (VTS) is a learning and planning procedure that combines generative models learned offline with online model-based POMDP planning.
VTS bridges offline model training and online planning by utilizing a set of deep generative observation models to predict and evaluate the likelihood of image observations in a Monte Carlo tree search planner.
We show that VTS is robust to different observation noises and, since it utilizes online, model-based planning, can adapt to different reward structures without the need to re-train.
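The bridging idea here, using generative observation models to score how likely an observation is under each hypothesized hidden state, can be sketched in miniature. The Gaussian likelihood below is a hand-coded stand-in (an assumption for illustration) for a deep generative observation model; VTS plugs such likelihoods into a Monte Carlo tree search planner, which this sketch does not implement.

```python
# Minimal sketch of belief reweighting with an observation model, in the
# spirit of VTS: weight candidate hidden states by the likelihood of the
# current image observation under a generative model p(obs | state).
import math


def obs_likelihood(obs, state, sigma=1.0):
    # Gaussian stand-in for a learned deep generative observation model
    return math.exp(-((obs - state) ** 2) / (2 * sigma ** 2))


def reweight(particles, obs):
    """Normalize observation likelihoods into a belief over candidate states."""
    w = [obs_likelihood(obs, s) for s in particles]
    total = sum(w)
    return [x / total for x in w]


particles = [0.0, 1.0, 2.0, 3.0]
weights = reweight(particles, obs=2.1)
print(max(range(len(weights)), key=lambda i: weights[i]))  # -> 2
```

Because the observation model is a separate, swappable component, the reward structure of the planner can change without retraining it, which is the adaptability property claimed above.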
arXiv Detail & Related papers (2021-12-17T11:53:31Z)
- Recommending Metamodel Concepts during Modeling Activities with Pre-Trained Language Models [0.0]
We propose an approach to assist a modeler in the design of a metamodel by recommending relevant domain concepts in several modeling scenarios.
Our approach does not require extracting knowledge from the domain or hand-designing completion rules.
We evaluate our approach on a test set containing 166 metamodels, unseen during model training, with more than 5000 test samples.
arXiv Detail & Related papers (2021-04-04T16:29:10Z)
- Forethought and Hindsight in Credit Assignment [62.05690959741223]
We study the gains and peculiarities of planning used as forethought, via forward models, or as hindsight, via backward models.
We investigate the best use of models in planning, primarily focusing on the selection of states in which predictions should be (re)-evaluated.
arXiv Detail & Related papers (2020-10-26T16:00:47Z)