Latent Space Roadmap for Visual Action Planning of Deformable and Rigid
Object Manipulation
- URL: http://arxiv.org/abs/2003.08974v1
- Date: Thu, 19 Mar 2020 18:43:26 GMT
- Title: Latent Space Roadmap for Visual Action Planning of Deformable and Rigid
Object Manipulation
- Authors: Martina Lippi, Petra Poklukar, Michael C. Welle, Anastasiia Varava,
Hang Yin, Alessandro Marino and Danica Kragic
- Abstract summary: Planning is performed in a low-dimensional latent state space that embeds images.
Our framework consists of two main components: a Visual Foresight Module (VFM) that generates a visual plan as a sequence of images, and an Action Proposal Network (APN) that predicts the actions between them.
- Score: 74.88956115580388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a framework for visual action planning of complex manipulation
tasks with high-dimensional state spaces such as manipulation of deformable
objects. Planning is performed in a low-dimensional latent state space that
embeds images. We define and implement a Latent Space Roadmap (LSR) which is a
graph-based structure that globally captures the latent system dynamics. Our
framework consists of two main components: a Visual Foresight Module (VFM) that
generates a visual plan as a sequence of images, and an Action Proposal Network
(APN) that predicts the actions between them. We show the effectiveness of the
method on a simulated box stacking task as well as a T-shirt folding task
performed with a real robot.
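To make the pipeline concrete, the following is a minimal Python sketch of how an LSR-style planner could be assembled, not the authors' implementation: latent codes (random vectors here, standing in for VFM embeddings of images) are clustered into graph nodes, observed action transitions become edges, and visual action planning reduces to graph search. The clustering rule, the `build_lsr`/`plan` names, and the toy data are illustrative assumptions.
```python
# Minimal, illustrative sketch of a Latent Space Roadmap (LSR); the latent
# codes below stand in for VAE-style embeddings of images.
import numpy as np
from collections import deque

def build_lsr(latents, transitions, eps=0.5):
    """Cluster latent states into nodes and connect nodes whose member
    states were observed to be linked by an action.

    latents:     (N, d) array of latent codes of individual observations
    transitions: list of (i, j, action) tuples, indices into latents
    eps:         merge radius; states closer than eps share a node
    """
    # Greedy epsilon-clustering: each node is represented by a centroid.
    centroids, assign = [], []
    for z in latents:
        d = [np.linalg.norm(z - c) for c in centroids]
        if d and min(d) < eps:
            assign.append(int(np.argmin(d)))
        else:
            centroids.append(z.copy())
            assign.append(len(centroids) - 1)
    # Edges carry the action that achieved the observed transition.
    edges = {}
    for i, j, action in transitions:
        u, v = assign[i], assign[j]
        if u != v:
            edges.setdefault(u, {})[v] = action
    return centroids, edges

def plan(edges, start, goal):
    """Breadth-first search over the roadmap; returns a node sequence."""
    queue, parent = deque([start]), {start: None}
    while queue:
        u = queue.popleft()
        if u == goal:
            break
        for v in edges.get(u, {}):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if goal not in parent:
        return None
    path = []
    while goal is not None:
        path.append(goal)
        goal = parent[goal]
    return path[::-1]

# Toy usage: 4 latent states and two observed action transitions.
rng = np.random.default_rng(0)
zs = rng.normal(size=(4, 8))
nodes, roadmap = build_lsr(zs, [(0, 1, "stack"), (1, 2, "fold")])
print(plan(roadmap, 0, 2))  # node sequence from start to goal
```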
Related papers
- Path Planning based on 2D Object Bounding-box [8.082514573754954]
We present a path planning method that utilizes 2D bounding boxes of objects, developed through imitation learning in urban driving scenarios.
This is achieved by integrating high-definition (HD) map data with images captured by surrounding cameras.
We evaluate our model on the nuPlan planning task and observe that it performs competitively with existing vision-centric methods.
arXiv Detail & Related papers (2024-02-22T19:34:56Z)
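One plausible reading of the design above, sketched as an assumption rather than the paper's code: HD-map elements and object bounding boxes are rasterized into a bird's-eye-view grid, and a small convolutional network trained by imitation regresses a fixed number of future waypoints. The `BBoxPlanner` architecture and raster layout are hypothetical.
```python
# Hedged sketch of bounding-box-based path planning via imitation learning.
import torch
import torch.nn as nn

class BBoxPlanner(nn.Module):
    """Maps a 2-channel BEV raster (HD-map layer, box layer) to K waypoints."""
    def __init__(self, k_waypoints=10):
        super().__init__()
        self.k = k_waypoints
        self.backbone = nn.Sequential(
            nn.Conv2d(2, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, k_waypoints * 2)  # (x, y) per waypoint

    def forward(self, bev):
        return self.head(self.backbone(bev)).view(-1, self.k, 2)

planner = BBoxPlanner()
bev = torch.zeros(1, 2, 128, 128)   # channel 0: HD map, channel 1: boxes
bev[0, 1, 60:68, 60:68] = 1.0       # one rasterized object bounding box
expert = torch.randn(1, 10, 2)      # expert future waypoints (imitation target)
loss = nn.functional.mse_loss(planner(bev), expert)
loss.backward()                     # one imitation-learning gradient step
```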
- Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty [56.30846158280031]
Task planning for embodied AI has been one of the most challenging problems.
We propose a task-agnostic method named 'planning as in-painting'.
The proposed framework achieves promising performance on various embodied AI tasks.
arXiv Detail & Related papers (2023-12-02T10:07:17Z)
- Compositional Foundation Models for Hierarchical Planning [52.18904315515153]
We propose a foundation model that leverages multiple expert foundation models, each trained individually on language, vision, or action data, together to solve long-horizon tasks.
We use a large language model to construct symbolic plans that are grounded in the environment through a large video diffusion model.
Generated video plans are then grounded in visuo-motor control through an inverse dynamics model that infers actions from the generated videos.
arXiv Detail & Related papers (2023-09-15T17:44:05Z)
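The three-level composition described above can be sketched schematically. In the stub below, each stage is a placeholder for a real pretrained model; the function names and interfaces are hypothetical, chosen only to show how language-level plans, video-level plans, and inverse-dynamics actions chain together.
```python
# Schematic composition of three expert models; every stage is a stub.
from typing import List

def llm_symbolic_plan(goal: str) -> List[str]:
    """Stand-in for a large language model producing subgoal steps."""
    return [f"step {i}: move toward '{goal}'" for i in range(3)]

def video_diffusion(subgoal: str, frame: List[float]) -> List[List[float]]:
    """Stand-in for a video diffusion model; returns a short frame sequence."""
    return [[v + 0.1 * t for v in frame] for t in range(4)]

def inverse_dynamics(frame_a: List[float], frame_b: List[float]) -> str:
    """Stand-in for an inverse dynamics model inferring the action."""
    return "no-op" if frame_a == frame_b else "move"

def hierarchical_plan(goal: str, first_frame: List[float]) -> List[str]:
    actions = []
    for subgoal in llm_symbolic_plan(goal):             # language level
        frames = video_diffusion(subgoal, first_frame)  # video level
        for a, b in zip(frames, frames[1:]):            # control level
            actions.append(inverse_dynamics(a, b))
        first_frame = frames[-1]
    return actions

print(hierarchical_plan("stack the red block", [0.0]))
```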
- Graph-Transporter: A Graph-based Learning Method for Goal-Conditioned Deformable Object Rearranging Task [1.807492010338763]
We present a novel framework, Graph-Transporter, for goal-conditioned deformable object rearranging tasks.
Our framework adopts a fully convolutional network (FCN) architecture to output pixel-wise pick-and-place actions from visual input alone.
arXiv Detail & Related papers (2023-02-21T05:21:04Z)
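A minimal sketch of the pixel-wise action idea, assuming a toy two-channel heatmap head rather than the actual Graph-Transporter network: the FCN scores every pixel for picking and for placing, and the argmax pixel of each map is decoded as the action location.
```python
# Illustrative FCN head producing pixel-wise pick-and-place heatmaps.
import torch
import torch.nn as nn

class PickPlaceFCN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 2, 1),  # channel 0: pick logits, channel 1: place
        )

    def forward(self, rgb):
        return self.net(rgb)  # (B, 2, H, W) per-pixel action logits

def decode_action(logits):
    """Select the argmax pixel of each heatmap as the action location."""
    b, _, h, w = logits.shape
    flat = logits.view(b, 2, -1).argmax(dim=-1)
    return [(divmod(int(flat[i, 0]), w), divmod(int(flat[i, 1]), w))
            for i in range(b)]  # [((pick_y, pick_x), (place_y, place_x)), ...]

model = PickPlaceFCN()
action = decode_action(model(torch.rand(1, 3, 64, 64)))
print(action)  # pick and place pixel coordinates for the batch item
```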
- Long-Horizon Planning and Execution with Functional Object-Oriented Networks [79.94575713911189]
We introduce the idea of exploiting object-level knowledge as a functional object-oriented network (FOON) for task planning and execution.
Our approach automatically transforms a FOON into PDDL and leverages off-the-shelf planners, action contexts, and robot skills.
We demonstrate our approach on long-horizon tasks in CoppeliaSim and show how learned action contexts can be extended to never-before-seen scenarios.
arXiv Detail & Related papers (2022-07-12T19:29:35Z)
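A toy illustration of the FOON-to-PDDL idea above: a single functional unit (input object states, a manipulation, output object states) is emitted as a grounded PDDL action. The unit format and predicate names are simplified assumptions, not the paper's translation rules.
```python
# Toy translation of one FOON functional unit into a grounded PDDL action.
def unit_to_pddl(action, inputs, outputs):
    """inputs/outputs: lists of (object, state) pairs from a functional unit."""
    pre = " ".join(f"(is-{state} {obj})" for obj, state in inputs)
    add = " ".join(f"(is-{state} {obj})" for obj, state in outputs)
    delete = " ".join(f"(not (is-{state} {obj}))" for obj, state in inputs)
    return (f"(:action {action}\n"
            f"  :precondition (and {pre})\n"
            f"  :effect (and {add} {delete}))")

print(unit_to_pddl("pour",
                   [("cup", "full"), ("bowl", "empty")],
                   [("cup", "empty"), ("bowl", "full")]))
```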
- Enabling Visual Action Planning for Object Manipulation through Latent Space Roadmap [72.01609575400498]
We present a framework for visual action planning of complex manipulation tasks with high-dimensional state spaces.
We propose a Latent Space Roadmap (LSR) for task planning, a graph-based structure that globally captures the system dynamics in a low-dimensional latent space.
We present a thorough investigation of our framework on two simulated box stacking tasks and a folding task executed on a real robot.
arXiv Detail & Related papers (2021-03-03T17:48:26Z)
- A Long Horizon Planning Framework for Manipulating Rigid Pointcloud Objects [25.428781562909606]
We present a framework for solving long-horizon planning problems involving manipulation of rigid objects.
Our method plans in the space of object subgoals and frees the planner from reasoning about robot-object interaction dynamics.
arXiv Detail & Related papers (2020-11-16T18:59:33Z)
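The subgoal-space idea can be sketched under strong simplifying assumptions: the planner proposes a sequence of intermediate object poses and checks their feasibility, leaving how the robot realizes each pose transition to a downstream skill. The linear interpolation and `feasible` stub below are hypothetical placeholders for the paper's search and learned models.
```python
# Sketch of subgoal-space planning: search over object poses only, leaving
# robot-object interaction to a downstream skill. Names are hypothetical.
import numpy as np

def feasible(pose):
    """Stand-in feasibility check (e.g., collision or stability test)."""
    return abs(pose[0]) < 5.0 and abs(pose[1]) < 5.0

def plan_object_subgoals(start, goal, n_subgoals=5):
    """Linearly interpolate object poses (x, y, yaw) and keep feasible ones."""
    start, goal = np.asarray(start, float), np.asarray(goal, float)
    poses = [start + t * (goal - start) for t in np.linspace(0, 1, n_subgoals)]
    if not all(feasible(p) for p in poses):
        return None  # a real planner would search alternative subgoals here
    return poses

subgoals = plan_object_subgoals([0, 0, 0], [2, 3, 1.57])
print([p.round(2).tolist() for p in subgoals])
```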
- Hallucinative Topological Memory for Zero-Shot Visual Planning [86.20780756832502]
In visual planning (VP), an agent learns to plan goal-directed behavior from observations of a dynamical system obtained offline.
Most previous works on VP approached the problem by planning in a learned latent space, resulting in low-quality visual plans.
Here, we propose a simple VP method that plans directly in image space and displays competitive performance.
arXiv Detail & Related papers (2020-02-27T18:54:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.