VisuoSpatial Foresight for Multi-Step, Multi-Task Fabric Manipulation
- URL: http://arxiv.org/abs/2003.09044v3
- Date: Thu, 18 Feb 2021 07:10:49 GMT
- Title: VisuoSpatial Foresight for Multi-Step, Multi-Task Fabric Manipulation
- Authors: Ryan Hoque, Daniel Seita, Ashwin Balakrishna, Aditya Ganapathi, Ajay
Kumar Tanwani, Nawid Jamali, Katsu Yamane, Soshi Iba, Ken Goldberg
- Abstract summary: We extend the Visual Foresight framework to learn fabric dynamics that can be efficiently reused to accomplish different fabric manipulation tasks.
We experimentally evaluate VSF on multi-step fabric smoothing and folding tasks against 5 baseline methods in simulation and on the da Vinci Research Kit (dVRK) surgical robot without any demonstrations at train or test time.
- Score: 24.262746504997683
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robotic fabric manipulation has applications in home robotics, textiles,
senior care and surgery. Existing fabric manipulation techniques, however, are
designed for specific tasks, making it difficult to generalize across different
but related tasks. We extend the Visual Foresight framework to learn fabric
dynamics that can be efficiently reused to accomplish different fabric
manipulation tasks with a single goal-conditioned policy. We introduce
VisuoSpatial Foresight (VSF), which builds on prior work by learning visual
dynamics on domain randomized RGB images and depth maps simultaneously and
completely in simulation. We experimentally evaluate VSF on multi-step fabric
smoothing and folding tasks against 5 baseline methods in simulation and on the
da Vinci Research Kit (dVRK) surgical robot without any demonstrations at train
or test time. Furthermore, we find that leveraging depth significantly improves
performance. RGBD data yields an 80% improvement in fabric folding success rate
over pure RGB data. Code, data, videos, and supplementary material are
available at https://sites.google.com/view/fabric-vsf/.
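For intuition, below is a minimal sketch of the goal-conditioned visual MPC loop that Visual Foresight-style methods such as VSF build on; the dynamics-model interface, pick-and-place action parameterization, and pixel-wise cost are simplifying assumptions, not the paper's exact implementation.

```python
import numpy as np

def plan_action(dynamics_model, current_rgbd, goal_rgbd,
                n_samples=200, n_elite=20, n_iters=3, horizon=5):
    """Cross-entropy method over pick-and-place actions, visual-foresight style.
    `dynamics_model(obs, actions)` is assumed to roll out a learned RGBD video
    prediction model and return the predicted frames."""
    # Each action: (pick_x, pick_y, delta_x, delta_y), normalized to [-1, 1].
    mean = np.zeros((horizon, 4))
    std = np.ones((horizon, 4))
    for _ in range(n_iters):
        samples = np.clip(mean + std * np.random.randn(n_samples, horizon, 4), -1, 1)
        costs = []
        for seq in samples:
            pred_frames = dynamics_model(current_rgbd, seq)   # (horizon, H, W, 4)
            # Goal-conditioned cost: pixel-wise L2 between the final predicted
            # RGBD frame and the goal RGBD image.
            costs.append(np.mean((pred_frames[-1] - goal_rgbd) ** 2))
        elite = samples[np.argsort(costs)[:n_elite]]
        mean, std = elite.mean(axis=0), elite.std(axis=0)
    return mean[0]  # execute only the first action, then replan
```

In this reading, the observations and goals would be the domain-randomized RGBD images described above, and the robot would replan after every executed pick-and-place.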
Related papers
- GarmentLab: A Unified Simulation and Benchmark for Garment Manipulation [12.940189262612677]
GarmentLab is a content-rich benchmark and realistic simulation designed for deformable object and garment manipulation.
Our benchmark encompasses a diverse range of garment types, robotic systems and manipulators.
We evaluate state-of-the-art vision methods, reinforcement learning, and imitation learning approaches on these tasks.
arXiv Detail & Related papers (2024-11-02T10:09:08Z)
- SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation [82.61572106180705]
This paper presents a unified approach using vision-language models (VLMs) to improve keypoint prediction across various garment categories.
We created a large-scale synthetic dataset using advanced simulation techniques, allowing scalable training without extensive real-world data.
Experimental results indicate that the VLM-based method significantly enhances keypoint detection accuracy and task success rates.
arXiv Detail & Related papers (2024-09-26T17:26:16Z)
- LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning [50.99807031490589]
We introduce LLARVA, a model trained with a novel instruction tuning method to unify a range of robotic learning tasks, scenarios, and environments.
We generate 8.5M image-visual trace pairs from the Open X-Embodiment dataset in order to pre-train our model.
Experiments yield strong performance, demonstrating that LLARVA performs well compared to several contemporary baselines.
arXiv Detail & Related papers (2024-06-17T17:55:29Z)
- Universal Visual Decomposer: Long-Horizon Manipulation Made Easy [54.93745986073738]
Real-world robotic tasks stretch over extended horizons and encompass multiple stages.
Prior task decomposition methods require task-specific knowledge, are computationally intensive, and cannot readily be applied to new tasks.
We propose Universal Visual Decomposer (UVD), an off-the-shelf task decomposition method for visual long-horizon manipulation.
We extensively evaluate UVD on both simulation and real-world tasks, and in all cases, UVD substantially outperforms baselines across imitation and reinforcement learning settings.
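For a sense of what off-the-shelf decomposition can look like, here is a rough sketch of subgoal discovery from the frame embeddings of a demonstration; the monotonicity criterion and reset rule are simplified assumptions rather than UVD's exact procedure.

```python
import numpy as np

def decompose_demo(frame_embeddings, tol=0.0):
    """Greedy backward pass over a demo's frame embeddings (e.g., from a frozen
    pretrained visual encoder). A new subgoal is emitted whenever the distance
    to the current subgoal stops growing monotonically while walking backward."""
    subgoals = [len(frame_embeddings) - 1]          # last frame is always a goal
    goal = frame_embeddings[-1]
    prev_dist = np.linalg.norm(frame_embeddings[-1] - goal)
    for t in range(len(frame_embeddings) - 2, -1, -1):
        dist = np.linalg.norm(frame_embeddings[t] - goal)
        if dist + tol < prev_dist:                  # monotonicity broken
            subgoals.append(t + 1)                  # frame t+1 closes a stage
            goal = frame_embeddings[t + 1]
            dist = np.linalg.norm(frame_embeddings[t] - goal)
        prev_dist = dist
    return sorted(subgoals)
```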
arXiv Detail & Related papers (2023-10-12T17:59:41Z)
- Robust Visual Sim-to-Real Transfer for Robotic Manipulation [79.66851068682779]
Learning visuomotor policies in simulation is much safer and cheaper than in the real world.
However, due to discrepancies between the simulated and real data, simulator-trained policies often fail when transferred to real robots.
One common approach to bridging the visual sim-to-real domain gap is domain randomization (DR).
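A generic sketch of what a domain-randomization pass over a simulator's visual parameters might look like follows; the attribute names and ranges are placeholders, not any specific simulator's API.

```python
import random

def randomize_sim_visuals(sim):
    """Perturb rendering parameters each episode so a policy trained in
    simulation never overfits to one visual appearance. The attributes on
    `sim` are placeholders for whatever the simulator actually exposes."""
    sim.light_intensity = random.uniform(0.3, 1.5)                      # lighting
    sim.light_direction = [random.uniform(-1, 1) for _ in range(3)]
    sim.camera_pose_noise = [random.gauss(0, 0.02) for _ in range(6)]   # extrinsics jitter
    sim.fabric_rgb = [random.uniform(0, 1) for _ in range(3)]           # texture color
    sim.background_texture = random.choice(["wood", "checker", "noise"])
    return sim
```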
arXiv Detail & Related papers (2023-07-28T05:47:24Z)
- Robotic Fabric Flattening with Wrinkle Direction Detection [9.822493398088127]
Perception is considered one of the major challenges in deformable object manipulation (DOM) due to the complex dynamics and high degree of freedom of deformable objects.
We develop a novel image-processing algorithm based on Gabor filters to extract useful features from cloth.
Our algorithm can determine the direction of wrinkles on the cloth accurately in simulation as well as in real robot experiments.
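A minimal sketch of estimating a dominant wrinkle orientation with a Gabor filter bank in OpenCV is shown below; the kernel parameters are illustrative, not the paper's tuned values.

```python
import cv2
import numpy as np

def dominant_wrinkle_angle(gray_image, n_orientations=8):
    """Apply a bank of Gabor filters at evenly spaced orientations and return
    the orientation whose filtered response has the highest energy."""
    best_angle, best_energy = 0.0, -np.inf
    for i in range(n_orientations):
        theta = i * np.pi / n_orientations
        kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=theta,
                                    lambd=10.0, gamma=0.5, psi=0)
        response = cv2.filter2D(gray_image, cv2.CV_32F, kernel)
        energy = float(np.sum(response ** 2))
        if energy > best_energy:
            best_angle, best_energy = theta, energy
    return best_angle  # radians, in [0, pi)
```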
arXiv Detail & Related papers (2023-03-08T21:55:15Z)
- Learning Fabric Manipulation in the Real World with Human Videos [10.608723220309678]
Fabric manipulation is a long-standing challenge in robotics due to the enormous state space and complex dynamics.
Most prior methods rely heavily on simulation, which is still limited by the large sim-to-real gap of deformable objects.
A promising alternative is to learn fabric manipulation directly from watching humans perform the task.
arXiv Detail & Related papers (2022-11-05T07:09:15Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z)
- VisuoSpatial Foresight for Physical Sequential Fabric Manipulation [22.008305401551418]
We build upon the Visual Foresight framework to learn fabric dynamics that can be efficiently reused to accomplish different sequential fabric manipulation tasks.
In this work, we vary 4 components of VSF: data generation, the choice of visual dynamics model, the cost function, and the optimization procedure.
Results suggest that training visual dynamics models using longer, corner-based actions can improve the efficiency of fabric folding by 76% and enable a physical sequential fabric folding task that VSF could not previously perform with 90% reliability.
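One way to read "corner-based actions" is to snap sampled pick points onto detected fabric corners before rolling out the dynamics model; the helper below is an interpretation of that idea, not the authors' data-generation code.

```python
import numpy as np

def snap_to_corner(pick_xy, corner_xys, max_snap_dist=0.1):
    """Replace a sampled pick point with the nearest fabric corner if one lies
    within `max_snap_dist` (all coordinates normalized to [0, 1])."""
    corners = np.asarray(corner_xys, dtype=float)
    dists = np.linalg.norm(corners - np.asarray(pick_xy, dtype=float), axis=1)
    i = int(np.argmin(dists))
    return tuple(corners[i]) if dists[i] <= max_snap_dist else tuple(pick_xy)
```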
arXiv Detail & Related papers (2021-02-19T06:06:49Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)
- Learning Dense Visual Correspondences in Simulation to Smooth and Fold Real Fabrics [35.84249614544505]
We learn visual correspondences for deformable fabrics across different configurations in simulation.
The learned correspondences can be used to compute geometrically equivalent actions in a new fabric configuration.
Results also suggest that the learned correspondences generalize to fabrics of various colors, sizes, and shapes.
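A minimal sketch of how dense descriptors can transfer a pick point between fabric configurations follows; the descriptor-map shapes and brute-force nearest-neighbor lookup are assumptions about the general approach.

```python
import numpy as np

def transfer_pick_point(src_descriptors, tgt_descriptors, src_pick_uv):
    """Given per-pixel descriptor maps of shape (H, W, D) for a source and a
    target fabric image, map a pick pixel in the source to the target pixel
    with the closest descriptor, giving a geometrically equivalent action."""
    u, v = src_pick_uv
    query = src_descriptors[v, u]                          # (D,)
    h, w, d = tgt_descriptors.shape
    flat = tgt_descriptors.reshape(-1, d)
    idx = int(np.argmin(np.linalg.norm(flat - query, axis=1)))
    return (idx % w, idx // w)                             # (u, v) in target image
```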
arXiv Detail & Related papers (2020-03-28T04:06:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.