D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation
- URL: http://arxiv.org/abs/2403.12861v1
- Date: Tue, 19 Mar 2024 16:05:51 GMT
- Title: D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation
- Authors: Jun Yamada, Shaohong Zhong, Jack Collins, Ingmar Posner
- Abstract summary: D-Cubed is a novel trajectory optimisation method using a latent diffusion model (LDM) trained from a task-agnostic play dataset.
We demonstrate that D-Cubed outperforms traditional trajectory optimisation and competitive baseline approaches by a significant margin.
- Score: 15.680133621889809
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mastering dexterous robotic manipulation of deformable objects is vital for overcoming the limitations of parallel grippers in real-world applications. Current trajectory optimisation approaches often struggle to solve such tasks due to the large search space and the limited task information available from a cost function. In this work, we propose D-Cubed, a novel trajectory optimisation method using a latent diffusion model (LDM) trained from a task-agnostic play dataset to solve dexterous deformable object manipulation tasks. D-Cubed learns a skill-latent space that encodes short-horizon actions in the play dataset using a VAE and trains an LDM to compose the skill latents into a skill trajectory, representing a long-horizon action trajectory in the dataset. To optimise a trajectory for a target task, we introduce a novel gradient-free guided sampling method that employs the Cross-Entropy method within the reverse diffusion process. In particular, D-Cubed samples a small number of noisy skill trajectories using the LDM for exploration and evaluates the trajectories in simulation. Then, D-Cubed selects the trajectory with the lowest cost for the subsequent reverse process. This effectively explores promising solution areas and optimises the sampled trajectories towards a target task throughout the reverse diffusion process. Through empirical evaluation on a public benchmark of dexterous deformable object manipulation tasks, we demonstrate that D-Cubed outperforms traditional trajectory optimisation and competitive baseline approaches by a significant margin. We further demonstrate that trajectories found by D-Cubed readily transfer to a real-world LEAP hand on a folding task.
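The guided sampling loop described in the abstract (sample a few noisy candidates at each reverse step, score them, keep the lowest-cost one) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: `toy_reverse_step` stands in for the trained LDM's reverse transition, `toy_cost` stands in for a simulator rollout, and the names, noise schedule, and default sizes are hypothetical.

```python
import numpy as np


def toy_reverse_step(z, t, num_steps, rng, max_sigma=0.3):
    """Hypothetical stand-in for one LDM reverse step.

    A trained model would predict the mean with a neural network; here the
    mean is simply the current sample and the noise shrinks as t approaches 0.
    """
    sigma = max_sigma * t / num_steps
    return z + sigma * rng.standard_normal(z.shape)


def toy_cost(z, target):
    """Hypothetical stand-in for the simulator: squared distance to a target."""
    return float(np.sum((z - target) ** 2))


def guided_sampling(target, num_steps=50, num_candidates=8, seed=0):
    """Gradient-free guided sampling over a toy skill trajectory.

    At every reverse step, draw `num_candidates` noisy proposals from the
    current trajectory, evaluate each with the cost function, and carry the
    lowest-cost proposal into the next step. No cost gradients are used.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(target.shape)  # start from pure noise
    for t in range(num_steps, 0, -1):
        candidates = [
            toy_reverse_step(z, t, num_steps, rng) for _ in range(num_candidates)
        ]
        costs = [toy_cost(c, target) for c in candidates]
        z = candidates[int(np.argmin(costs))]
    return z, toy_cost(z, target)


if __name__ == "__main__":
    # A "skill trajectory" of 4 latents with 2 dimensions each (sizes assumed).
    target = np.ones((4, 2))
    _, guided = guided_sampling(target, num_candidates=8)
    _, unguided = guided_sampling(target, num_candidates=1)
    print(f"guided cost: {guided:.4f}, unguided cost: {unguided:.4f}")
```

Setting `num_candidates=1` removes the selection step, which makes it a convenient unguided baseline for comparison. In the method as described, the candidates would be skill latents decoded to actions by the VAE and scored by rolling them out in simulation, which this toy version does not model.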
Related papers
- TraDiffusion: Trajectory-Based Training-Free Image Generation [85.39724878576584]
We propose a training-free, trajectory-based controllable T2I approach, termed TraDiffusion.
This novel method allows users to effortlessly guide image generation via mouse trajectories.
arXiv Detail & Related papers (2024-08-19T07:01:43Z)
- TrajCogn: Leveraging LLMs for Cognizing Movement Patterns and Travel Purposes from Trajectories [24.44686757572976]
Spatio-temporal trajectories are crucial in various data mining tasks.
It is important to develop a versatile trajectory learning method that performs different tasks with high accuracy.
This is challenging due to limitations in model capacity and the quality and scale of trajectory datasets.
arXiv Detail & Related papers (2024-05-21T02:33:17Z)
- Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following [21.81411085058986]
Reward-gradient guided denoising generates trajectories that maximize both a differentiable reward function and the likelihood under the data distribution captured by a diffusion model.
We propose Diffusion-ES, a method that combines gradient-free optimization with trajectory denoising.
We show that Diffusion-ES achieves state-of-the-art performance on nuPlan, an established closed-loop planning benchmark for autonomous driving.
arXiv Detail & Related papers (2024-02-09T17:18:33Z)
- Unsupervised Discovery of Interpretable Directions in h-space of Pre-trained Diffusion Models [63.1637853118899]
We propose the first unsupervised and learning-based method to identify interpretable directions in h-space of pre-trained diffusion models.
We employ a shift control module that works on h-space of pre-trained diffusion models to manipulate a sample into a shifted version of itself.
By jointly optimizing them, the model will spontaneously discover disentangled and interpretable directions.
arXiv Detail & Related papers (2023-10-15T18:44:30Z)
- Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets).
arXiv Detail & Related papers (2023-10-04T09:39:05Z)
- Learning Representative Trajectories of Dynamical Systems via Domain-Adaptive Imitation [0.0]
We propose DATI, a deep reinforcement learning agent designed for domain-adaptive trajectory imitation.
Our experiments show that DATI outperforms baseline methods for imitation learning and optimal control in this setting.
Its generalization to a real-world scenario is shown through the discovery of abnormal motion patterns in maritime traffic.
arXiv Detail & Related papers (2023-04-19T15:53:48Z)
- DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools [96.38972082580294]
DiffSkill is a novel framework that uses a differentiable physics simulator for skill abstraction to solve deformable object manipulation tasks.
In particular, we first obtain short-horizon skills using individual tools from a gradient-based simulator.
We then learn a neural skill abstractor from the demonstration trajectories which takes RGBD images as input.
arXiv Detail & Related papers (2022-03-31T17:59:38Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp [119.69304125647785]
This paper introduces a concise yet powerful method to construct Continuous Transition.
Specifically, we propose to synthesize new transitions for training by linearly interpolating the consecutive transitions.
To keep the constructed transitions authentic, we also develop a discriminator to guide the construction process automatically.
arXiv Detail & Related papers (2020-11-30T01:20:23Z)
- Imitative Planning using Conditional Normalizing Flow [2.8978926857710263]
A popular way to plan trajectories in dynamic urban scenarios for Autonomous Vehicles is to rely on explicitly specified and hand crafted cost functions.
We explore the application of normalizing flows for improving the performance of trajectory planning for autonomous vehicles (AVs).
By modeling the trajectory planner's cost manifold as an energy function, we learn a scene conditioned mapping from the prior to a Boltzmann distribution over the AV control space.
arXiv Detail & Related papers (2020-07-31T16:32:23Z)