RaggeDi: Diffusion-based State Estimation of Disordered Rags, Sheets, Towels and Blankets
- URL: http://arxiv.org/abs/2409.11831v1
- Date: Wed, 18 Sep 2024 09:30:03 GMT
- Title: RaggeDi: Diffusion-based State Estimation of Disordered Rags, Sheets, Towels and Blankets
- Authors: Jikai Ye, Wanze Li, Shiraz Khan, Gregory S. Chirikjian
- Abstract summary: Cloth state estimation is an important problem in robotics.
The robot must know the cloth's state accurately in order to manipulate it and execute tasks such as robotic dressing, stitching, and covering/uncovering human beings.
This paper proposes a diffusion model-based pipeline that formulates the cloth state estimation as an image generation problem.
- Score: 8.267469814604311
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cloth state estimation is an important problem in robotics. The robot must know the cloth's state accurately in order to manipulate it and execute tasks such as robotic dressing, stitching, and covering/uncovering human beings. However, estimating cloth state accurately remains challenging due to its high flexibility and self-occlusion. This paper proposes a diffusion model-based pipeline that formulates cloth state estimation as an image generation problem by representing the cloth state as an RGB image that describes the point-wise translation (translation map) between a pre-defined flattened mesh and the deformed mesh in a canonical space. We then train a conditional diffusion-based image generation model to predict the translation map from an observation. Experiments are conducted in both simulation and the real world to validate the performance of our method. Results indicate that our method outperforms two recent methods in both accuracy and speed.
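The translation-map representation described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the pre-defined flattened mesh is a regular H x W vertex grid (so each vertex maps to one pixel) and that displacements lie within a known canonical range; the function names and the normalization bounds are hypothetical.

```python
import numpy as np

def translation_map(flat_verts, deformed_verts, grid_hw=(64, 64), bounds=1.0):
    """Encode per-vertex 3D translations as an RGB image (a 'translation map').

    flat_verts, deformed_verts: (H*W, 3) arrays of vertex positions in a
    canonical space. The xyz displacement of each vertex is clipped to
    [-bounds, bounds], normalized to [0, 255], and stored in the R, G, B
    channels respectively.
    """
    h, w = grid_hw
    disp = (deformed_verts - flat_verts).reshape(h, w, 3)  # per-vertex xyz translation
    disp = np.clip(disp, -bounds, bounds)                  # limit to the canonical range
    rgb = ((disp + bounds) / (2 * bounds) * 255.0).round() # map [-b, b] -> [0, 255]
    return rgb.astype(np.uint8)

def decode_translation_map(rgb, flat_verts, bounds=1.0):
    """Invert the encoding: recover the deformed mesh from a translation map."""
    disp = rgb.astype(np.float64) / 255.0 * (2 * bounds) - bounds
    return flat_verts + disp.reshape(-1, 3)
```

Under these assumptions a conditional diffusion model only needs to generate such an RGB image given the observation; decoding the image then yields the full deformed mesh, up to the 8-bit quantization error of roughly `2 * bounds / 255` per axis.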
Related papers
- Fast constrained sampling in pre-trained diffusion models [77.21486516041391]
Diffusion models have dominated the field of large, generative image models.
We propose an algorithm for fast constrained sampling in large pre-trained diffusion models.
arXiv Detail & Related papers (2024-10-24T14:52:38Z)
- A Two-stage Personalized Virtual Try-on Framework with Shape Control and Texture Guidance [7.302929117437442]
This paper proposes a new personalized virtual try-on model (PE-VITON), which uses two stages (shape control and texture guidance) to decouple the clothing attributes.
The proposed model effectively addresses problems of traditional try-on methods: weak reduction of clothing folds, poor generation under complex human postures, blurred clothing edges, and unclear texture styles.
arXiv Detail & Related papers (2023-12-24T13:32:55Z)
- Prompt-Propose-Verify: A Reliable Hand-Object-Interaction Data Generation Framework using Foundational Models [0.0]
Diffusion models, when conditioned on text prompts, generate realistic-looking images with intricate details.
However, most of these pre-trained models fail to generate accurate images of human features such as hands and teeth.
arXiv Detail & Related papers (2023-12-23T12:59:22Z)
- Exploiting Diffusion Prior for Generalizable Dense Prediction [85.4563592053464]
Recent advanced Text-to-Image (T2I) diffusion models are sometimes too imaginative for existing off-the-shelf dense predictors to estimate.
We introduce DMP, a pipeline utilizing pre-trained T2I models as a prior for dense prediction tasks.
Despite limited-domain training data, the approach yields faithful estimations for arbitrary images, surpassing existing state-of-the-art algorithms.
arXiv Detail & Related papers (2023-11-30T18:59:44Z)
- R&B: Region and Boundary Aware Zero-shot Grounded Text-to-image Generation [74.5598315066249]
We probe into zero-shot grounded T2I generation with diffusion models.
We propose a Region and Boundary (R&B) aware cross-attention guidance approach.
arXiv Detail & Related papers (2023-10-13T05:48:42Z)
- Probabilistic and Semantic Descriptions of Image Manifolds and Their Applications [28.554065677506966]
It is common to say that images lie on a lower-dimensional manifold in the high-dimensional space.
Images are unevenly distributed on the manifold, and our task is to devise ways to model this distribution as a probability distribution.
We show how semantic interpretations are used to describe points on the manifold.
arXiv Detail & Related papers (2023-07-06T09:36:45Z)
- SePaint: Semantic Map Inpainting via Multinomial Diffusion [12.217566404643033]
We propose SePaint, an inpainting model for semantic data based on generative multinomial diffusion.
We propose a novel and efficient condition strategy, Look-Back Condition (LB-Con), which performs one-step look-back operations.
We have conducted extensive experiments on different datasets, showing our proposed model outperforms commonly used methods in various robotic applications.
arXiv Detail & Related papers (2023-03-05T18:04:28Z)
- Transmission-Guided Bayesian Generative Model for Smoke Segmentation [29.74065829663554]
Deep neural networks are prone to overconfidence in smoke segmentation because of smoke's non-rigid shape and transparent appearance.
This stems from both knowledge-level uncertainty, due to limited training data for accurate smoke segmentation, and labeling-level uncertainty, reflecting the difficulty of labeling ground truth.
We introduce a Bayesian generative model to simultaneously estimate the posterior distribution of model parameters and its predictions.
We also contribute a high-quality smoke segmentation dataset, SMOKE5K, consisting of 1,400 real and 4,000 synthetic images with pixel-wise annotation.
arXiv Detail & Related papers (2023-03-02T01:48:05Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
- A Bayesian Treatment of Real-to-Sim for Deformable Object Manipulation [59.29922697476789]
We propose a novel methodology for extracting state information from image sequences via a technique that represents the state of a deformable object as a distribution embedding.
Our experiments confirm that we can estimate posterior distributions of physical properties, such as elasticity, friction and scale of highly deformable objects, such as cloth and ropes.
arXiv Detail & Related papers (2021-12-09T17:50:54Z)
- Non-Homogeneous Haze Removal via Artificial Scene Prior and Bidimensional Graph Reasoning [52.07698484363237]
We propose a Non-Homogeneous Haze Removal Network (NHRN) via artificial scene prior and bidimensional graph reasoning.
Our method achieves superior performance over many state-of-the-art algorithms for both the single image dehazing and hazy image understanding tasks.
arXiv Detail & Related papers (2021-04-05T13:04:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.