CMG-Net: An End-to-End Contact-Based Multi-Finger Dexterous Grasping
Network
- URL: http://arxiv.org/abs/2303.13182v1
- Date: Thu, 23 Mar 2023 11:29:31 GMT
- Title: CMG-Net: An End-to-End Contact-Based Multi-Finger Dexterous Grasping
Network
- Authors: Mingze Wei, Yaomin Huang, Zhiyuan Xu, Ning Liu, Zhengping Che, Xinyu
Zhang, Chaomin Shen, Feifei Feng, Chun Shan, Jian Tang
- Abstract summary: We present an effective end-to-end network, CMG-Net, for grasping unknown objects in a cluttered environment.
We create a synthetic grasp dataset that consists of five thousand cluttered scenes, 80 object categories, and 20 million annotations.
Our work significantly outperforms the state-of-the-art for three-finger robotic hands.
- Score: 25.879649629474212
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a novel representation for grasping using contacts
between multi-finger robotic hands and objects to be manipulated. This
representation significantly reduces the prediction dimensions and accelerates
the learning process. We present an effective end-to-end network, CMG-Net, for
grasping unknown objects in a cluttered environment by efficiently predicting
multi-finger grasp poses and hand configurations from a single-shot point
cloud. Moreover, we create a synthetic grasp dataset that consists of five
thousand cluttered scenes, 80 object categories, and 20 million annotations. We
perform a comprehensive empirical study and demonstrate the effectiveness of
our grasping representation and CMG-Net. Our work significantly outperforms the
state-of-the-art for three-finger robotic hands. We also demonstrate that the
model trained using synthetic data performs very well for real robots.
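To make the contact-based idea concrete, below is a minimal, hypothetical PyTorch sketch of a network that maps a single point cloud to per-fingertip contact points, a wrist pose, and a hand joint configuration. The encoder, layer sizes, and output parameterization (three fingertip contacts, a 7-D wrist pose, six joint angles) are illustrative assumptions only and do not reflect CMG-Net's actual architecture or training setup.

```python
# Hypothetical sketch of a contact-based grasp prediction network.
# All names, dimensions, and the architecture are illustrative assumptions,
# not the authors' implementation of CMG-Net.
import torch
import torch.nn as nn


class PointEncoder(nn.Module):
    """Shared per-point MLP followed by max-pooling (PointNet-style)."""

    def __init__(self, in_dim: int = 3, feat_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, feat_dim),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (B, N, 3) -> global feature: (B, feat_dim)
        per_point = self.mlp(points)
        return per_point.max(dim=1).values


class ContactGraspHead(nn.Module):
    """Predicts fingertip contacts plus a wrist pose and joint angles."""

    def __init__(self, num_fingers: int = 3, num_joints: int = 6, feat_dim: int = 256):
        super().__init__()
        self.num_fingers = num_fingers
        self.encoder = PointEncoder(feat_dim=feat_dim)
        self.contacts = nn.Linear(feat_dim, num_fingers * 3)  # xyz per fingertip
        self.wrist = nn.Linear(feat_dim, 7)    # translation + (unnormalized) quaternion
        self.joints = nn.Linear(feat_dim, num_joints)  # hand configuration

    def forward(self, points: torch.Tensor):
        feat = self.encoder(points)
        contacts = self.contacts(feat).view(-1, self.num_fingers, 3)
        wrist = self.wrist(feat)
        joints = self.joints(feat)
        return contacts, wrist, joints


if __name__ == "__main__":
    model = ContactGraspHead()
    cloud = torch.rand(2, 1024, 3)  # batch of two synthetic point clouds
    contacts, wrist, joints = model(cloud)
    print(contacts.shape, wrist.shape, joints.shape)  # (2, 3, 3) (2, 7) (2, 6)
```

The sketch only illustrates why a contact-based output is low-dimensional: predicting a handful of fingertip contacts and a hand configuration is far smaller than predicting a dense grasp map, which is the dimensionality-reduction argument made in the abstract.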
Related papers
- Synatra: Turning Indirect Knowledge into Direct Demonstrations for Digital Agents at Scale [97.21851531607811]
LLMs can now act as autonomous agents that interact with digital environments and complete specific objectives.
Their accuracy is still far from satisfactory, partly due to a lack of large-scale, direct demonstrations for digital tasks.
We present Synatra, an approach that effectively transforms this indirect knowledge into direct supervision at scale.
arXiv Detail & Related papers (2024-09-24T00:51:45Z) - Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models [53.22792173053473]
We introduce an interactive robotic manipulation framework called Polaris.
Polaris integrates perception and interaction by utilizing GPT-4 alongside grounded vision models.
We propose a novel Synthetic-to-Real (Syn2Real) pose estimation pipeline.
arXiv Detail & Related papers (2024-08-15T06:40:38Z) - HRP: Human Affordances for Robotic Pre-Training [15.92416819748365]
We present a framework for pre-training representations on hand, object, and contact.
We experimentally demonstrate (using 3000+ robot trials) that this affordance pre-training scheme boosts performance by a minimum of 15% on 5 real-world tasks.
arXiv Detail & Related papers (2024-07-26T17:59:52Z) - MimicGen: A Data Generation System for Scalable Robot Learning using
Human Demonstrations [55.549956643032836]
MimicGen is a system for automatically synthesizing large-scale, rich datasets from only a small number of human demonstrations.
We show that robot agents can be effectively trained on this generated dataset by imitation learning to achieve strong performance in long-horizon and high-precision tasks.
arXiv Detail & Related papers (2023-10-26T17:17:31Z) - DMFC-GraspNet: Differentiable Multi-Fingered Robotic Grasp Generation in
Cluttered Scenes [22.835683657191936]
Multi-fingered robotic hands can potentially enable complex object manipulation.
Current techniques for multi-fingered robotic grasping frequently predict only a single grasp per inference.
This paper proposes a differentiable multi-fingered grasp generation network (DMFC-GraspNet) with three main contributions to address this challenge.
arXiv Detail & Related papers (2023-08-01T11:21:07Z) - MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic
Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z) - Where is my hand? Deep hand segmentation for visual self-recognition in
humanoid robots [129.46920552019247]
We propose the use of a Convolutional Neural Network (CNN) to segment the robot hand from an image in an egocentric view.
We fine-tuned the Mask R-CNN network for the specific task of segmenting the hand of the humanoid robot Vizzy.
arXiv Detail & Related papers (2021-02-09T10:34:32Z) - Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)