Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3)
for Visual Robotic Manipulation
- URL: http://arxiv.org/abs/2309.02685v3
- Date: Tue, 28 Nov 2023 11:28:34 GMT
- Title: Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3)
for Visual Robotic Manipulation
- Authors: Hyunwoo Ryu, Jiwoo Kim, Hyunseok An, Junwoo Chang, Joohwan Seo, Taehan
Kim, Yubin Kim, Chaewon Hwang, Jongeun Choi, Roberto Horowitz
- Abstract summary: Diffusion-EDFs is a novel SE(3)-equivariant diffusion-based approach for visual robotic manipulation tasks.
We show that our proposed method achieves remarkable data efficiency, requiring only 5 to 10 human demonstrations for effective end-to-end training in less than an hour.
- Score: 5.11432473998551
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion generative modeling has become a promising approach for learning
robotic manipulation tasks from stochastic human demonstrations. In this paper,
we present Diffusion-EDFs, a novel SE(3)-equivariant diffusion-based approach
for visual robotic manipulation tasks. We show that our proposed method
achieves remarkable data efficiency, requiring only 5 to 10 human
demonstrations for effective end-to-end training in less than an hour.
Furthermore, our benchmark experiments demonstrate that our approach has
superior generalizability and robustness compared to state-of-the-art methods.
Lastly, we validate our methods with real hardware experiments. Project
Website: https://sites.google.com/view/diffusion-edfs/home
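To make the core idea concrete, here is a minimal, illustrative sketch of score-based denoising on the SE(3) manifold: a pose (R, t) is perturbed with rotation and translation noise, then refined by Langevin-style updates through the exponential map. The score function below is a toy placeholder, not the paper's learned equivariant network.

```python
# Minimal sketch of denoising on SE(3): a pose (R, t) is perturbed with
# rotation/translation noise, then refined by Langevin-style updates using
# a score function. `toy_score` is a stand-in for a learned, equivariant
# score network (not the authors' model).
import numpy as np
from scipy.spatial.transform import Rotation

def perturb_pose(R, t, rot_std, trans_std, rng):
    """Forward diffusion: right-multiply a random rotation, add Gaussian translation noise."""
    dR = Rotation.from_rotvec(rng.normal(0.0, rot_std, 3))
    return R * dR, t + rng.normal(0.0, trans_std, 3)

def toy_score(R, t, target_t):
    """Placeholder score: pulls the translation toward a target; zero rotational score.
    A real model would output an equivariant score on the Lie algebra se(3)."""
    return np.zeros(3), target_t - t

def langevin_step(R, t, score_fn, step, rng, **kw):
    """One reverse (denoising) step on SE(3) via the exponential map."""
    w, v = score_fn(R, t, **kw)
    noise_w = rng.normal(0.0, np.sqrt(2 * step), 3)
    noise_v = rng.normal(0.0, np.sqrt(2 * step), 3)
    R_new = R * Rotation.from_rotvec(step * w + noise_w)
    return R_new, t + step * v + noise_v

rng = np.random.default_rng(0)
R, t = Rotation.identity(), np.zeros(3)
R, t = perturb_pose(R, t, rot_std=0.5, trans_std=0.2, rng=rng)
for _ in range(100):
    R, t = langevin_step(R, t, toy_score, step=0.01, rng=rng, target_t=np.ones(3))
print("final translation:", t)  # a noisy estimate drifting toward the toy target (1, 1, 1)
```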
Related papers
- ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy [11.454229873419697]
We propose ET-SEED, an efficient trajectory-level SE(3) equivariant diffusion model for generating action sequences in complex robot manipulation tasks.
We theoretically extend equivariant Markov kernels and simplify the conditions for an equivariant diffusion process.
Experiments demonstrate superior data efficiency and manipulation proficiency of our proposed method, as well as its ability to generalize to unseen configurations.
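The key property such models rely on can be checked numerically: if the scene point cloud is transformed by a rigid motion g, the predicted pose should transform by the same g. A minimal sketch with a trivially equivariant stand-in predictor (not ET-SEED's network):

```python
# Numerical check of SE(3) equivariance for a toy pose predictor:
# if the scene is transformed by g, the predicted pose transforms by g too.
import numpy as np
from scipy.spatial.transform import Rotation

def equivariant_frame(points):
    """Toy SE(3)-equivariant pose predictor: position = centroid, orientation
    built by Gram-Schmidt on two centered points. Both constructions co-rotate
    with the input, so f(g . x) = g . f(x) holds exactly."""
    c = points.mean(axis=0)
    a = points[0] - c
    b = points[1] - c
    e1 = a / np.linalg.norm(a)
    u = b - (b @ e1) * e1
    e2 = u / np.linalg.norm(u)
    e3 = np.cross(e1, e2)
    return np.stack([e1, e2, e3], axis=1), c  # (R, t)

rng = np.random.default_rng(1)
pts = rng.normal(size=(32, 3))
g_R = Rotation.random(random_state=2).as_matrix()
g_t = np.array([0.3, -0.1, 0.5])

R1, t1 = equivariant_frame(pts)
R2, t2 = equivariant_frame(pts @ g_R.T + g_t)
# Equivariance: f(g . x) == g . f(x)
assert np.allclose(R2, g_R @ R1, atol=1e-8)
assert np.allclose(t2, g_R @ t1 + g_t, atol=1e-8)
print("equivariance check passed")
```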
arXiv Detail & Related papers (2024-11-06T15:30:42Z)
- Training-free Diffusion Model Alignment with Sampling Demons [15.400553977713914]
We propose an optimization approach, dubbed Demon, to guide the denoising process at inference time without backpropagation through reward functions or model retraining.
Our approach works by controlling the noise distribution at each denoising step so as to concentrate density on high-reward regions, via optimization rather than gradients.
To the best of our knowledge, the proposed approach is the first inference-time, backpropagation-free preference alignment method for diffusion models.
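A hedged sketch of the general recipe, with a toy denoiser and black-box reward standing in for the paper's components: at each step, sample several candidate noises, score the resulting next states with the reward, and keep the best, so no gradients ever flow through the reward.

```python
# Sketch of inference-time, backprop-free reward guidance. The denoiser and
# reward below are toy placeholders, not the paper's exact Demon procedure.
import numpy as np

def toy_denoise_step(x, noise, step):
    """Stand-in for one reverse-diffusion step (shrink toward origin + noise)."""
    return (1.0 - step) * x + np.sqrt(step) * noise

def reward(x):
    """Black-box reward: prefers samples near (2, 0). No gradients are used."""
    return -np.sum((x - np.array([2.0, 0.0])) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(size=2) * 3.0  # start from a noisy sample
for _ in range(50):
    candidates = [rng.normal(size=2) for _ in range(16)]
    # Score each candidate's next state with the reward; pick the argmax noise.
    scored = [(reward(toy_denoise_step(x, n, 0.1)), n) for n in candidates]
    best_noise = max(scored, key=lambda s: s[0])[1]
    x = toy_denoise_step(x, best_noise, 0.1)
print("guided sample:", x)  # drifts toward the high-reward region
```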
arXiv Detail & Related papers (2024-10-08T07:33:49Z)
- Distillation-Free One-Step Diffusion for Real-World Image Super-Resolution [81.81748032199813]
We propose a Distillation-Free One-Step Diffusion model.
Specifically, we introduce a noise-aware discriminator (NAD) that participates in adversarial training.
We improve the perceptual loss with edge-aware DISTS (EA-DISTS) to enhance the model's ability to generate fine details.
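As a rough illustration of an edge-aware distance (plain L2 stands in for the DISTS perceptual metric, and the weighting is invented for the example):

```python
# Simplified edge-aware distance in the spirit of EA-DISTS: compare images
# both directly and on their Sobel edge maps. Plain L2 stands in for DISTS.
import numpy as np
from scipy.ndimage import sobel

def edge_map(img):
    """Gradient magnitude via Sobel filters (2-D grayscale float array)."""
    gx = sobel(img, axis=0)
    gy = sobel(img, axis=1)
    return np.hypot(gx, gy)

def edge_aware_distance(a, b, edge_weight=0.5):
    pixel_term = np.mean((a - b) ** 2)
    edge_term = np.mean((edge_map(a) - edge_map(b)) ** 2)
    return pixel_term + edge_weight * edge_term  # illustrative weighting

rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
blurry = 0.5 * (sharp + rng.random((64, 64)))  # degraded copy
print(edge_aware_distance(sharp, blurry))
```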
arXiv Detail & Related papers (2024-10-05T16:41:36Z)
- ManiCM: Real-time 3D Diffusion Policy via Consistency Model for Robotic Manipulation [16.272352213590313]
Diffusion models have proven effective at modeling complex distributions, from natural images to motion trajectories.
Recent methods show impressive performance in 3D robotic manipulation tasks, but suffer from severe runtime inefficiency due to the many denoising steps required.
We propose a real-time robotic manipulation model named ManiCM that imposes the consistency constraint to the diffusion process.
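The consistency constraint can be illustrated on a toy trajectory: a model f(x_t, t) should map any two points on the same diffusion trajectory to the same clean output, which is what enables one- or few-step inference. Everything below is a stand-in, not ManiCM's network or solver.

```python
# Sketch of the consistency constraint behind consistency distillation.
# Toy linear trajectory and model; not ManiCM's architecture.
import numpy as np

def trajectory_point(x0, noise, t):
    """Point on a toy diffusion trajectory interpolating clean action and noise."""
    return (1.0 - t) * x0 + t * noise

def consistency_loss(f, x0, noise, t1, t2):
    """Penalize disagreement between the model's outputs at two times on one trajectory."""
    x_t1 = trajectory_point(x0, noise, t1)
    x_t2 = trajectory_point(x0, noise, t2)
    return np.mean((f(x_t1, t1) - f(x_t2, t2)) ** 2)

# A perfectly consistent toy model inverts the interpolation, so its loss is ~0.
def exact_model(x_t, t, noise):
    return (x_t - t * noise) / (1.0 - t)

rng = np.random.default_rng(0)
x0, noise = rng.normal(size=7), rng.normal(size=7)
f = lambda x, t: exact_model(x, t, noise)
print(consistency_loss(f, x0, noise, t1=0.3, t2=0.8))  # ~0 for the exact model
```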
arXiv Detail & Related papers (2024-06-03T17:59:23Z)
- Unsupervised Discovery of Interpretable Directions in h-space of Pre-trained Diffusion Models [63.1637853118899]
We propose the first unsupervised, learning-based method for identifying interpretable directions in the h-space of pre-trained diffusion models.
We employ a shift control module that operates on the h-space of pre-trained diffusion models to manipulate a sample into a shifted version of itself.
By jointly optimizing these components, the model spontaneously discovers disentangled and interpretable directions.
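Conceptually, the edit amounts to adding a learned direction to the bottleneck (h-space) features during denoising. A toy sketch, with random stand-ins for the features and the discovered direction:

```python
# Sketch of editing in h-space: a direction is added to the U-Net's bottleneck
# activations, shifting the generated sample along an attribute. The
# "bottleneck" here is a toy vector; in practice h is the innermost feature map.
import numpy as np

def shift_h(h, direction, alpha):
    """Apply a semantic shift of strength alpha along a (unit) h-space direction."""
    d = direction / np.linalg.norm(direction)
    return h + alpha * d

rng = np.random.default_rng(0)
h = rng.normal(size=512)          # stand-in bottleneck features
direction = rng.normal(size=512)  # a discovered (here: random) direction
for alpha in (0.0, 1.0, 2.0):
    h_shifted = shift_h(h, direction, alpha)
    print(alpha, np.linalg.norm(h_shifted - h))  # shift magnitude grows linearly with alpha
```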
arXiv Detail & Related papers (2023-10-15T18:44:30Z)
- Diffusion-based 3D Object Detection with Random Boxes [58.43022365393569]
Existing anchor-based 3D detection methods rely on empirically set anchors, which makes these algorithms inelegant.
Our proposed Diff3Det applies diffusion models to proposal generation for 3D object detection by treating detection boxes as generative targets.
In the inference stage, the model progressively refines a set of random boxes to the prediction results.
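A minimal sketch of detection-as-denoising, with a contraction toward a fixed hypothetical target standing in for the learned refinement network:

```python
# Sketch of detection-as-denoising: start from random 3D boxes and refine them
# over several steps. The refiner below just contracts boxes toward a fixed
# target, standing in for Diff3Det's learned denoising network.
import numpy as np

def refine_boxes(boxes, target, step_frac):
    """One 'denoising' step: move box parameters partway toward the target."""
    return boxes + step_frac * (target - boxes)

rng = np.random.default_rng(0)
# Boxes parameterized as (x, y, z, dx, dy, dz); start from pure noise.
boxes = rng.normal(size=(8, 6))
target = np.array([1.0, 2.0, 0.5, 4.0, 2.0, 1.5])  # a hypothetical object box
for _ in range(10):
    boxes = refine_boxes(boxes, target, step_frac=0.3)
print(np.abs(boxes - target).max())  # all random boxes converge toward the target
```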
arXiv Detail & Related papers (2023-09-05T08:49:53Z)
- Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF).
It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
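A hedged sketch of the underlying idea: if a generative model can sample states from the visitation measure, the value is (up to an effective-horizon scale) the expected reward under those samples. The Gaussian sampler below is a toy stand-in for DVF's diffusion model.

```python
# Sketch of value estimation from a state-visitation model. The Gaussian
# sampler is a toy stand-in for a learned generative model of d^pi(. | s).
import numpy as np

def sample_visited_states(s, n, rng):
    """Toy visitation model: future states scatter around the current state."""
    return s + rng.normal(0.0, 1.0, size=(n, s.shape[0]))

def reward(states):
    return -np.linalg.norm(states - np.array([3.0, 0.0]), axis=1)

def value_estimate(s, rng, n=4096, scale=10.0):
    """Monte-Carlo value: mean reward over sampled visited states, times an
    effective-horizon scale (stand-in for the 1 / (1 - gamma) factor)."""
    return scale * reward(sample_visited_states(s, n, rng)).mean()

rng = np.random.default_rng(0)
print(value_estimate(np.array([3.0, 0.0]), rng))   # near-goal state: higher value
print(value_estimate(np.array([-3.0, 0.0]), rng))  # far state: lower value
```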
arXiv Detail & Related papers (2023-06-09T18:40:55Z)
- BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images, but their iterative sampling is slow.
Knowledge distillation has recently been proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT, that overcomes limitations with an efficient data-free distillation algorithm.
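A toy sketch of the bootstrapping idea as described: the single-step student is trained purely from noise by regressing its output at time t onto the teacher's one-step refinement of its own output at the adjacent time, so no real data is required. The linear teacher and student here are invented for illustration.

```python
# Sketch of data-free bootstrapped distillation: the student only ever sees
# noise; its targets come from the teacher refining its own predictions.
import numpy as np

def teacher_step(x, shrink=0.1):
    """Stand-in for one teacher ODE step toward the data manifold (contract toward 0)."""
    return x * (1.0 - shrink)

def student(z, t, w):
    """Toy single-step student: a per-time scalar gain applied to the input noise z."""
    return w[t] * z

rng = np.random.default_rng(0)
T = 10
w = np.ones(T + 1)  # student parameters; w[T] = 1 is the boundary condition g(z, T) = z
for _ in range(200):
    z = rng.normal(size=64)  # training uses noise only; no real data anywhere
    for t in range(T - 1, -1, -1):
        target = teacher_step(student(z, t + 1, w))            # bootstrap target from t + 1
        grad = 2.0 * np.mean((student(z, t, w) - target) * z)  # d/dw[t] of the L2 loss
        w[t] -= 0.05 * grad
print(w[:3])  # gains approach the teacher's contraction schedule 0.9 ** (T - t)
```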
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
- CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models [72.93652777646233]
Camouflaged Object Detection (COD) is a challenging task in computer vision due to the high similarity between camouflaged objects and their surroundings.
We propose a new paradigm that treats COD as a conditional mask-generation task leveraging diffusion models.
Our method, dubbed CamoDiffusion, employs the denoising process of diffusion models to iteratively reduce the noise of the mask.
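A minimal sketch of segmentation as conditional mask denoising: start from pure noise and repeatedly blend the mask toward an image-conditioned prediction. The thresholding "network" is a toy stand-in for CamoDiffusion's conditioned denoiser.

```python
# Sketch of conditional mask denoising for segmentation. The conditional
# "model" is a toy thresholding rule, not CamoDiffusion's network.
import numpy as np

def toy_conditioned_prediction(image):
    """Stand-in conditional model: bright pixels are 'object'."""
    return (image > image.mean()).astype(float)

def denoise_mask(mask, image, step_frac):
    """One denoising step: blend the current noisy mask toward the prediction."""
    return (1.0 - step_frac) * mask + step_frac * toy_conditioned_prediction(image)

rng = np.random.default_rng(0)
image = rng.random((32, 32))
mask = rng.normal(size=(32, 32))  # pure-noise initial mask
for _ in range(10):
    mask = denoise_mask(mask, image, step_frac=0.3)
final = (mask > 0.5).astype(int)
print(final.mean())  # fraction of pixels labeled as object
```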
arXiv Detail & Related papers (2023-05-29T07:49:44Z)
- Equivariant Descriptor Fields: SE(3)-Equivariant Energy-Based Models for End-to-End Visual Robotic Manipulation Learning [2.8388425545775386]
We present end-to-end SE(3)-equivariant models for visual robotic manipulation from a point cloud input.
We show that our models can learn from scratch without prior knowledge, yet are highly sample efficient.
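A hedged sketch of the invariance at the heart of such energy-based pose models: an energy that depends only on pairwise distances between posed gripper points and scene points is unchanged when scene and pose are transformed by the same rigid motion, which is what makes the induced pose distribution equivariant.

```python
# Toy bi-invariant pose energy: Gaussian attraction between posed gripper
# points and scene points. Distance-only, hence rigid-motion invariant.
# A simplified stand-in, not the EDF energy model itself.
import numpy as np
from scipy.spatial.transform import Rotation

def pose_energy(R, t, gripper_pts, scene_pts, sigma=0.5):
    """Energy of a grasp pose: depends only on pairwise distances, so it is
    invariant when scene and pose are transformed by the same rigid motion."""
    placed = gripper_pts @ R.T + t
    d2 = ((placed[:, None, :] - scene_pts[None, :, :]) ** 2).sum(-1)
    return -np.exp(-d2 / (2 * sigma**2)).sum()

rng = np.random.default_rng(0)
scene = rng.normal(size=(50, 3))
gripper = rng.normal(scale=0.2, size=(10, 3))
R, t = Rotation.random(random_state=1).as_matrix(), np.array([0.5, 0.0, 0.2])

g_R = Rotation.random(random_state=2).as_matrix()
g_t = np.array([1.0, -2.0, 0.3])
e0 = pose_energy(R, t, gripper, scene)
# Transform the scene by g and compose g with the pose: the energy is unchanged.
e1 = pose_energy(g_R @ R, g_R @ t + g_t, gripper, scene @ g_R.T + g_t)
assert np.isclose(e0, e1)
print("bi-invariance check passed:", e0)
```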
arXiv Detail & Related papers (2022-06-16T17:26:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.