VisuoSpatial Foresight for Physical Sequential Fabric Manipulation
- URL: http://arxiv.org/abs/2102.09754v1
- Date: Fri, 19 Feb 2021 06:06:49 GMT
- Title: VisuoSpatial Foresight for Physical Sequential Fabric Manipulation
- Authors: Ryan Hoque, Daniel Seita, Ashwin Balakrishna, Aditya Ganapathi, Ajay
Kumar Tanwani, Nawid Jamali, Katsu Yamane, Soshi Iba, Ken Goldberg
- Abstract summary: We build upon the Visual Foresight framework to learn fabric dynamics that can be efficiently reused to accomplish different sequential fabric manipulation tasks.
In this work, we vary 4 components of VSF, including data generation, the choice of visual dynamics model, cost function, and optimization procedure.
Results suggest that training visual dynamics models using longer, corner-based actions can improve the efficiency of fabric folding by 76% and enable a physical sequential fabric folding task that VSF could not previously perform with 90% reliability.
- Score: 22.008305401551418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robotic fabric manipulation has applications in home robotics, textiles,
senior care and surgery. Existing fabric manipulation techniques, however, are
designed for specific tasks, making it difficult to generalize across different
but related tasks. We build upon the Visual Foresight framework to learn fabric
dynamics that can be efficiently reused to accomplish different sequential
fabric manipulation tasks with a single goal-conditioned policy. We extend our
earlier work on VisuoSpatial Foresight (VSF), which learns visual dynamics on
domain randomized RGB images and depth maps simultaneously and completely in
simulation. In this earlier work, we evaluated VSF on multi-step fabric
smoothing and folding tasks against 5 baseline methods in simulation and on the
da Vinci Research Kit (dVRK) surgical robot without any demonstrations at train
or test time. A key finding was that depth sensing significantly improves
performance: RGBD data yields an 80% improvement in fabric folding success rate
in simulation over pure RGB data. In this work, we vary 4 components of VSF,
including data generation, the choice of visual dynamics model, cost function,
and optimization procedure. Results suggest that training visual dynamics
models using longer, corner-based actions can improve the efficiency of fabric
folding by 76% and enable a physical sequential fabric folding task that VSF
could not previously perform with 90% reliability. Code, data, videos, and
supplementary material are available at
https://sites.google.com/view/fabric-vsf/.
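The abstract describes the VSF pipeline at a high level: a visual dynamics model trained entirely in simulation on domain-randomized RGBD data, queried inside a sampling-based optimizer that scores candidate action sequences against a goal image. Below is a minimal, hypothetical sketch of such a goal-conditioned planning loop; the `dynamics_model` callable, the pick-and-place action parameterization, the pixel-distance cost, and the cross-entropy-method optimizer are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def vsf_plan(dynamics_model, current_rgbd, goal_rgbd,
             horizon=5, action_dim=4, iters=5,
             samples=200, elite_frac=0.1):
    """Hypothetical goal-conditioned planner in the Visual Foresight style:
    sample action sequences, roll them through a learned visual dynamics
    model, score predicted observations against a goal image, and refit a
    Gaussian to the best (elite) sequences (cross-entropy method)."""
    mu = np.zeros((horizon, action_dim))      # e.g. pick-and-place actions (x, y, dx, dy)
    sigma = np.ones((horizon, action_dim))
    n_elite = max(1, int(samples * elite_frac))

    for _ in range(iters):
        acts = np.random.normal(mu, sigma, size=(samples, horizon, action_dim))
        costs = np.empty(samples)
        for i, seq in enumerate(acts):
            obs = current_rgbd
            for a in seq:                      # roll out the learned model
                obs = dynamics_model(obs, a)   # predicts the next RGBD observation
            # cost: pixel-wise distance between predicted and goal RGBD images
            costs[i] = np.linalg.norm(obs.astype(np.float32) - goal_rgbd.astype(np.float32))
        elite = acts[np.argsort(costs)[:n_elite]]
        mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-6

    return mu[0]  # execute the first action, then replan (model-predictive control)
```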
Related papers
- GarmentLab: A Unified Simulation and Benchmark for Garment Manipulation [12.940189262612677]
GarmentLab is a content-rich benchmark and realistic simulation designed for deformable object and garment manipulation.
Our benchmark encompasses a diverse range of garment types, robotic systems and manipulators.
We evaluate state-of-the-art vision methods, reinforcement learning, and imitation learning approaches on these tasks.
arXiv Detail & Related papers (2024-11-02T10:09:08Z)
- SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation [82.61572106180705]
This paper presents a unified approach using vision-language models (VLMs) to improve keypoint prediction across various garment categories.
We created a large-scale synthetic dataset using advanced simulation techniques, allowing scalable training without extensive real-world data.
Experimental results indicate that the VLM-based method significantly enhances keypoint detection accuracy and task success rates.
arXiv Detail & Related papers (2024-09-26T17:26:16Z)
- A Comparative Survey of Vision Transformers for Feature Extraction in Texture Analysis [9.687982148528187]
Convolutional Neural Networks (CNNs) are currently among the best texture analysis approaches.
Vision Transformers (ViTs) have been surpassing the performance of CNNs on tasks such as object recognition.
This work explores various pre-trained ViT architectures when transferred to tasks that rely on textures.
arXiv Detail & Related papers (2024-06-10T09:48:13Z)
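The survey above studies transferring pre-trained ViT features to texture-centric tasks. A minimal sketch of that kind of transfer is shown below, assuming an off-the-shelf torchvision ViT-B/16 backbone used as a frozen feature extractor feeding a downstream linear probe; the specific backbone, weights, and probe are illustrative choices, not the paper's evaluation protocol.

```python
import torch
import torch.nn as nn
from torchvision import models

# Frozen pre-trained ViT used as a texture feature extractor (illustrative choice).
weights = models.ViT_B_16_Weights.IMAGENET1K_V1
vit = models.vit_b_16(weights=weights)
vit.heads = nn.Identity()          # drop the classification head; output is the CLS embedding
vit.eval()

preprocess = weights.transforms()  # resize/normalize exactly as the backbone expects

@torch.no_grad()
def texture_features(pil_images):
    """Return (N, 768) ViT features for a list of PIL images; feed these to a
    linear probe or SVM trained on the texture dataset of interest."""
    x = torch.stack([preprocess(img) for img in pil_images])
    return vit(x)
```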
- Neural LerPlane Representations for Fast 4D Reconstruction of Deformable Tissues [52.886545681833596]
LerPlane is a novel method for fast and accurate reconstruction of surgical scenes under a single-viewpoint setting.
LerPlane treats surgical procedures as 4D volumes and factorizes them into explicit 2D planes of static and dynamic fields.
LerPlane shares static fields, significantly reducing the workload of dynamic tissue modeling.
arXiv Detail & Related papers (2023-05-31T14:38:35Z)
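LerPlane's factorization of a dynamic scene into explicit 2D feature planes, as summarized above, can be illustrated with a toy lookup: a 4D point (x, y, z, t) gathers features from three static space planes and three dynamic space-time planes, which a small decoder network would then map to color and density. The fusion-by-concatenation rule and the plane layout below are assumptions for illustration, not the paper's exact design.

```python
import numpy as np

def bilerp(plane, a, b):
    """Bilinearly sample a 2D feature plane of shape (H, W, D) at normalized
    coordinates a, b in [0, 1]."""
    h, w = plane.shape[0] - 1, plane.shape[1] - 1
    x, y = a * h, b * w
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, h), min(y0 + 1, w)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * plane[x0, y0] + wx * (1 - wy) * plane[x1, y0]
            + (1 - wx) * wy * plane[x0, y1] + wx * wy * plane[x1, y1])

def plane_factorized_feature(planes, x, y, z, t):
    """Toy plane-factorized 4D feature: three static space planes (xy, xz, yz)
    plus three dynamic space-time planes (xt, yt, zt); fusion by concatenation
    is an assumption made for this sketch."""
    static = [bilerp(planes['xy'], x, y), bilerp(planes['xz'], x, z), bilerp(planes['yz'], y, z)]
    dynamic = [bilerp(planes['xt'], x, t), bilerp(planes['yt'], y, t), bilerp(planes['zt'], z, t)]
    return np.concatenate(static + dynamic)
```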
- Robotic Fabric Flattening with Wrinkle Direction Detection [9.822493398088127]
Perception is considered one of the major challenges in deformable object manipulation (DOM) due to the complex dynamics and high degrees of freedom of deformable objects.
We develop a novel image-processing algorithm based on Gabor filters to extract useful features from cloth.
Our algorithm can determine the direction of wrinkles on the cloth accurately in simulation as well as in real robot experiments.
arXiv Detail & Related papers (2023-03-08T21:55:15Z)
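The wrinkle-direction idea summarized above can be sketched with a standard Gabor filter bank: convolve the cloth image at several orientations and report the orientation whose response energy is largest. The kernel parameters and the max-energy selection rule below are assumptions for illustration; the paper's actual algorithm may differ.

```python
import cv2
import numpy as np

def dominant_wrinkle_orientation(gray, n_orientations=16,
                                 ksize=31, sigma=4.0, lambd=10.0, gamma=0.5):
    """Estimate the dominant orientation of a grayscale cloth image by
    convolving with a bank of Gabor filters and picking the angle whose
    response has the largest energy (illustrative parameters)."""
    best_theta, best_energy = 0.0, -1.0
    for theta in np.linspace(0, np.pi, n_orientations, endpoint=False):
        kernel = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma, 0,
                                    ktype=cv2.CV_32F)
        response = cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, kernel)
        energy = float((response ** 2).mean())
        if energy > best_energy:
            best_theta, best_energy = theta, energy
    return best_theta  # orientation (radians) of the strongest-responding Gabor filter
```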
- Reconfigurable Data Glove for Reconstructing Physical and Virtual Grasps [100.72245315180433]
We present a reconfigurable data glove design to capture different modes of human hand-object interactions.
The glove operates in three modes for various downstream tasks with distinct features.
We evaluate the system's three modes by (i) recording hand gestures and associated forces, (ii) improving manipulation fluency in VR, and (iii) producing realistic simulation effects of various tool uses.
arXiv Detail & Related papers (2023-01-14T05:35:50Z)
- Task2Sim : Towards Effective Pre-training and Transfer from Synthetic Data [74.66568380558172]
We study the transferability of pre-trained models based on synthetic data generated by graphics simulators to downstream tasks.
We introduce Task2Sim, a unified model mapping downstream task representations to optimal simulation parameters.
It learns this mapping by training to find the set of best parameters on a set of "seen" tasks.
Once trained, it can then be used to predict best simulation parameters for novel "unseen" tasks in one shot.
arXiv Detail & Related papers (2021-11-30T19:25:27Z)
- PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics [89.81550748680245]
We introduce a new differentiable physics benchmark called PlasticineLab.
In each task, the agent uses manipulators to deform the plasticine into the desired configuration.
We evaluate several existing reinforcement learning (RL) methods and gradient-based methods on this benchmark.
arXiv Detail & Related papers (2021-04-07T17:59:23Z)
- Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection [91.43066633305662]
A central question in RGB-D salient object detection (SOD) is how to better integrate and utilize cross-modal fusion information.
In this paper, we explore these issues from a new perspective.
We implement a more flexible and efficient multi-scale cross-modal feature processing scheme.
arXiv Detail & Related papers (2020-07-13T07:59:55Z)
- Learning Dense Visual Correspondences in Simulation to Smooth and Fold Real Fabrics [35.84249614544505]
We learn visual correspondences for deformable fabrics across different configurations in simulation.
The learned correspondences can be used to compute geometrically equivalent actions in a new fabric configuration.
Results also suggest generalization to fabrics of various colors, sizes, and shapes.
arXiv Detail & Related papers (2020-03-28T04:06:20Z)
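Computing "geometrically equivalent actions" from dense correspondences, as described in the entry above, can be illustrated with a nearest-neighbor lookup in descriptor space: a pick pixel chosen on a source fabric image is mapped to the pixel in the new configuration whose learned descriptor is closest. The descriptor maps are assumed to come from a separately trained network; this is a sketch of the general technique, not the authors' code.

```python
import numpy as np

def transfer_pick_point(desc_src, desc_tgt, pick_uv):
    """Map a pick pixel from a source fabric image to the corresponding pixel
    in a target image using dense descriptor maps of shape (H, W, D)."""
    u, v = pick_uv
    query = desc_src[v, u]                       # descriptor at the source pick point (row v, col u)
    h, w, d = desc_tgt.shape
    dists = np.linalg.norm(desc_tgt.reshape(-1, d) - query, axis=1)
    idx = int(dists.argmin())                    # nearest neighbor in descriptor space
    return idx % w, idx // w                     # (u, v) in the target image
```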
- VisuoSpatial Foresight for Multi-Step, Multi-Task Fabric Manipulation [24.262746504997683]
We extend the Visual Foresight framework to learn fabric dynamics that can be efficiently reused to accomplish different fabric manipulation tasks.
We experimentally evaluate VSF on multi-step fabric smoothing and folding tasks against 5 baseline methods in simulation and on the da Vinci Research Kit (dVRK) surgical robot without any demonstrations at train or test time.
arXiv Detail & Related papers (2020-03-19T23:12:10Z)