Grasp-Anything: Large-scale Grasp Dataset from Foundation Models
- URL: http://arxiv.org/abs/2309.09818v1
- Date: Mon, 18 Sep 2023 14:39:26 GMT
- Title: Grasp-Anything: Large-scale Grasp Dataset from Foundation Models
- Authors: An Dinh Vuong, Minh Nhat Vu, Hieu Le, Baoru Huang, Binh Huynh, Thieu
Vo, Andreas Kugi, Anh Nguyen
- Abstract summary: Foundation models possess an extensive repository of real-world knowledge, including objects we encounter in our daily lives.
We present Grasp-Anything, a new large-scale grasp dataset synthesized from foundation models to implement this solution.
We show that Grasp-Anything successfully facilitates zero-shot grasp detection on vision-based tasks and real-world robotic experiments.
- Score: 15.17542697393971
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Foundation models such as ChatGPT have made significant strides in robotic
tasks due to their universal representation of real-world domains. In this
paper, we leverage foundation models to tackle grasp detection, a persistent
challenge in robotics with broad industrial applications. Despite numerous
grasp datasets, their object diversity remains limited compared to real-world
figures. Fortunately, foundation models possess an extensive repository of
real-world knowledge, including objects we encounter in our daily lives. As a
consequence, a promising solution to the limited representation in previous
grasp datasets is to harness the universal knowledge embedded in these
foundation models. We present Grasp-Anything, a new large-scale grasp dataset
synthesized from foundation models to implement this solution. Grasp-Anything
excels in diversity and magnitude, boasting 1M samples with text descriptions
and more than 3M objects, surpassing prior datasets. Empirically, we show that
Grasp-Anything successfully facilitates zero-shot grasp detection on
vision-based tasks and real-world robotic experiments. Our dataset and code are
available at https://grasp-anything-2023.github.io.
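To make the synthesis idea above concrete, here is a minimal sketch of how a foundation model's scene descriptions could seed grasp-data generation. This is not the authors' actual pipeline: the helper names, the stubbed LLM call, and the rectangle-grasp fields (a common 2D grasp parameterization) are illustrative assumptions.

```python
# A minimal sketch of the "synthesize grasp data from a foundation model" idea.
# Helper names and the rectangle-grasp fields are illustrative assumptions,
# not the authors' actual pipeline.
from dataclasses import dataclass

@dataclass
class RectGrasp:
    # A common 2D grasp parameterization: center, size, in-plane rotation.
    x: float
    y: float
    width: float
    height: float
    theta: float  # gripper rotation in radians

@dataclass
class GraspSample:
    description: str         # scene text produced by the foundation model
    objects: list[str]       # object names mentioned in the description
    grasps: list[RectGrasp]  # one or more grasp labels per object

def generate_scene_descriptions(n: int) -> list[str]:
    """Placeholder for a foundation-model call (hypothetical): prompt an LLM
    for diverse everyday-scene descriptions, e.g. 'a mug and scissors on a
    wooden desk'. Stubbed here so the sketch stays self-contained."""
    return [f"scene {i}: everyday objects on a table" for i in range(n)]

def synthesize_dataset(n_scenes: int) -> list[GraspSample]:
    samples = []
    for text in generate_scene_descriptions(n_scenes):
        # In a full pipeline, an image-generation model would render the
        # scene and a grasp proposer would label each object; stubbed here.
        objects = ["mug", "scissors"]  # assumed parse of the description
        grasps = [RectGrasp(0.5, 0.5, 0.1, 0.05, 0.0) for _ in objects]
        samples.append(GraspSample(text, objects, grasps))
    return samples

if __name__ == "__main__":
    data = synthesize_dataset(3)
    print(f"{len(data)} samples; first description: {data[0].description}")
```

In the released dataset, each of the 1M samples pairs a text description with grasp annotations over roughly 3M objects; the sketch above only mirrors that sample structure, not the generation machinery.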
Related papers
- Plain-Det: A Plain Multi-Dataset Object Detector [22.848784430833835]
Plain-Det offers flexibility to accommodate new datasets, robustness in performance across diverse datasets, and training efficiency.
We conduct extensive experiments on 13 downstream datasets and Plain-Det demonstrates strong generalization capability.
arXiv Detail & Related papers (2024-07-14T05:18:06Z)
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to perform well only on similar data, while underperforming on real-world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset [75.9621305227523]
We introduce LMSYS-Chat-1M, a large-scale dataset containing one million real-world conversations with 25 state-of-the-art large language models (LLMs).
This dataset is collected from 210K IP addresses in the wild on our Vicuna demo and Arena website.
We demonstrate its versatility through four use cases: developing content moderation models that perform similarly to GPT-4, building a safety benchmark, training instruction-following models that perform similarly to Vicuna, and creating challenging benchmark questions.
arXiv Detail & Related papers (2023-09-21T12:13:55Z)
- RT-1: Robotics Transformer for Real-World Control at Scale [98.09428483862165]
We present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties.
We verify our conclusions in a study of how different model classes generalize as a function of data size, model size, and data diversity, based on a large-scale data collection from real robots performing real-world tasks.
arXiv Detail & Related papers (2022-12-13T18:55:15Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware Ambidextrous Bin Picking via Physics-based Metaverse Synthesis [72.85526892440251]
We introduce MetaGraspNet, a large-scale photo-realistic bin picking dataset constructed via physics-based metaverse synthesis.
The proposed dataset contains 217k RGBD images across 82 different article types, with full annotations for object detection, amodal perception, keypoint detection, manipulation order and ambidextrous grasp labels for a parallel-jaw and vacuum gripper.
We also provide a real dataset consisting of over 2.3k fully annotated high-quality RGBD images, divided into 5 difficulty levels and an unseen object set to evaluate different object and layout properties.
arXiv Detail & Related papers (2022-08-08T08:15:34Z)
- A Review of Deep Learning Techniques for Markerless Human Motion on Synthetic Datasets [0.0]
Estimating human posture has recently gained increasing attention in the computer vision community.
We present a model that can predict the skeleton of an animation based solely on 2D images.
The implementation uses DeepLabCut on its own dataset to perform many of the necessary steps.
arXiv Detail & Related papers (2022-01-07T15:42:50Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z)
- REGRAD: A Large-Scale Relational Grasp Dataset for Safe and Object-Specific Robotic Grasping in Clutter [52.117388513480435]
We present a new dataset named REGRAD to sustain the modeling of relationships among objects and grasps.
Our dataset is collected in the form of both 2D images and 3D point clouds.
Users are free to import their own object models to generate as much data as they want.
arXiv Detail & Related papers (2021-04-29T05:31:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.