Mesh-based Dynamics with Occlusion Reasoning for Cloth Manipulation
- URL: http://arxiv.org/abs/2206.02881v1
- Date: Mon, 6 Jun 2022 20:15:02 GMT
- Title: Mesh-based Dynamics with Occlusion Reasoning for Cloth Manipulation
- Authors: Zixuan Huang, Xingyu Lin, David Held
- Abstract summary: Self-occlusion is challenging for cloth manipulation, as it makes it difficult to estimate the full state of the cloth.
We leverage recent advances in pose estimation for cloth to build a system that uses explicit occlusion reasoning to unfold a crumpled cloth.
- Score: 18.288330275993328
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-occlusion is challenging for cloth manipulation, as it makes it
difficult to estimate the full state of the cloth. Ideally, a robot trying to
unfold a crumpled or folded cloth should be able to reason about the cloth's
occluded regions. We leverage recent advances in pose estimation for cloth to
build a system that uses explicit occlusion reasoning to unfold a crumpled
cloth. Specifically, we first learn a model to reconstruct the mesh of the
cloth. However, the model will likely have errors due to the complexities of
the cloth configurations and due to ambiguities from occlusions. Our main
insight is that we can further refine the predicted reconstruction by
performing test-time finetuning with self-supervised losses. The obtained
reconstructed mesh allows us to use a mesh-based dynamics model for planning
while reasoning about occlusions. We evaluate our system both on cloth
flattening as well as on cloth canonicalization, in which the objective is to
manipulate the cloth into a canonical pose. Our experiments show that our
method significantly outperforms prior methods that do not explicitly account
for occlusions or perform test-time optimization.
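
The test-time refinement idea in the abstract can be illustrated with a toy example. The abstract does not say which self-supervised losses are used, so the sketch below assumes a one-sided Chamfer loss (squared distance from each observed point to its nearest mesh vertex) and optimizes vertex positions directly with plain gradient descent. The names `one_sided_chamfer` and `refine_vertices` are hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def one_sided_chamfer(points, verts):
    """Mean squared distance from each observed point to its nearest vertex.

    Assumed loss for illustration only; returns the loss and the index of
    the nearest vertex for every point.
    """
    diff = points[:, None, :] - verts[None, :, :]   # (P, V, 3)
    d2 = np.sum(diff ** 2, axis=-1)                  # (P, V)
    nearest = np.argmin(d2, axis=1)                  # (P,)
    loss = d2[np.arange(len(points)), nearest].mean()
    return loss, nearest


def refine_vertices(verts, points, steps=100, lr=0.1):
    """Gradient descent on the assumed loss, starting from predicted vertices."""
    verts = verts.copy()
    n_points = len(points)
    for _ in range(steps):
        _, nearest = one_sided_chamfer(points, verts)
        grad = np.zeros_like(verts)
        # d/dv of mean ||p - v||^2 over the points assigned to v
        np.add.at(grad, nearest, -2.0 * (points - verts[nearest]) / n_points)
        verts -= lr * grad
    return verts


# Toy data: random "observed" points and a noisy initial "prediction".
rng = np.random.default_rng(0)
observed = rng.normal(size=(50, 3))
predicted = rng.normal(size=(10, 3))
refined = refine_vertices(predicted, observed)
```

In this toy version, vertices that are not the nearest neighbor of any observed point receive no gradient and stay where the initial prediction placed them, so occluded regions are left to the initial reconstruction.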
Related papers
- RaggeDi: Diffusion-based State Estimation of Disordered Rags, Sheets, Towels and Blankets [8.267469814604311]
Cloth state estimation is an important problem in robotics.
It is essential for the robot to know the accurate state to manipulate cloth and execute tasks such as robotic dressing, stitching, and covering/uncovering human beings.
This paper proposes a diffusion model-based pipeline that formulates the cloth state estimation as an image generation problem.
arXiv Detail & Related papers (2024-09-18T09:30:03Z)
- Mixed Diffusion for 3D Indoor Scene Synthesis [55.94569112629208]
We present MiDiffusion, a novel mixed discrete-continuous diffusion model architecture.
We represent a scene layout by a 2D floor plan and a set of objects, each defined by its category, location, size, and orientation.
Our experimental results demonstrate that MiDiffusion substantially outperforms state-of-the-art autoregressive and diffusion models in floor-conditioned 3D scene synthesis.
arXiv Detail & Related papers (2024-05-31T17:54:52Z)
- PlaNet-ClothPick: Effective Fabric Flattening Based on Latent Dynamic Planning [0.0]
Recent work has attributed this to the blurry prediction of the observation, which makes it difficult to plan directly in the latent space.
We find that the sharp discontinuity of the transition function on the contour of the fabric makes it difficult to learn an accurate latent dynamic model.
Our model exhibits a faster action inference and requires fewer transitional model parameters than the state-of-the-art robotic systems in this domain.
arXiv Detail & Related papers (2023-03-02T15:22:34Z)
- Self-supervised Cloth Reconstruction via Action-conditioned Cloth Tracking [18.288330275993328]
We propose a self-supervised method to finetune a mesh reconstruction model in the real world.
We show that we can improve the quality of the reconstructed mesh without requiring human annotations.
arXiv Detail & Related papers (2023-02-19T07:48:12Z)
- OccluMix: Towards De-Occlusion Virtual Try-on by Semantically-Guided Mixup [79.3118064406151]
Image virtual try-on aims at replacing the cloth on a personal image with a garment image (in-shop clothes).
Prior methods successfully preserve the character of clothing images.
Occlusion remains a pernicious effect for realistic virtual try-on.
arXiv Detail & Related papers (2023-01-03T06:29:11Z)
- SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video [48.23424267130425]
SelfRecon recovers space-time coherent geometries from a monocular self-rotating human video.
Explicit methods require a predefined template mesh for a given sequence, while the template is hard to acquire for a specific subject.
Implicit methods support arbitrary topology and have high quality due to continuous geometric representation.
arXiv Detail & Related papers (2022-01-30T11:49:29Z)
- N-Cloth: Predicting 3D Cloth Deformation with Mesh-Based Networks [69.94313958962165]
We present a novel mesh-based learning approach (N-Cloth) for plausible 3D cloth deformation prediction.
We use graph convolution to transform the cloth and object meshes into a latent space to reduce the non-linearity in the mesh space.
Our approach can handle complex cloth meshes with up to $100$K triangles and scenes with various objects corresponding to SMPL humans, Non-SMPL humans, or rigid bodies.
arXiv Detail & Related papers (2021-12-13T03:13:11Z)
- Probabilistic Modeling for Human Mesh Recovery [73.11532990173441]
This paper focuses on the problem of 3D human reconstruction from 2D evidence.
We recast the problem as learning a mapping from the input to a distribution of plausible 3D poses.
arXiv Detail & Related papers (2021-08-26T17:55:11Z)
- MonoClothCap: Towards Temporally Coherent Clothing Capture from Monocular RGB Video [10.679773937444445]
We present a method to capture temporally coherent dynamic clothing deformation from a monocular RGB video input.
We build statistical deformation models for three types of clothing: T-shirt, short pants and long pants.
Our method produces temporally coherent reconstruction of body and clothing from monocular video.
arXiv Detail & Related papers (2020-09-22T17:54:38Z)
- How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking [70.92463223410225]
DiffMask learns to mask-out subsets of the input while maintaining differentiability.
Decision to include or disregard an input token is made with a simple model based on intermediate hidden layers.
This lets us not only plot attribution heatmaps but also analyze how decisions are formed across network layers.
arXiv Detail & Related papers (2020-04-30T17:36:14Z)