ARMBench: An Object-centric Benchmark Dataset for Robotic Manipulation
- URL: http://arxiv.org/abs/2303.16382v1
- Date: Wed, 29 Mar 2023 01:42:54 GMT
- Title: ARMBench: An Object-centric Benchmark Dataset for Robotic Manipulation
- Authors: Chaitanya Mitash, Fan Wang, Shiyang Lu, Vikedo Terhuja, Tyler Garaas,
Felipe Polido, Manikantan Nambi
- Abstract summary: ARMBench is a large-scale, object-centric benchmark dataset for robotic manipulation in the context of a warehouse.
We present a large-scale dataset collected in an Amazon warehouse using a robotic manipulator performing object singulation.
- Score: 9.551453254490125
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces Amazon Robotic Manipulation Benchmark (ARMBench), a
large-scale, object-centric benchmark dataset for robotic manipulation in the
context of a warehouse. Automation of operations in modern warehouses requires
a robotic manipulator to deal with a wide variety of objects, unstructured
storage, and dynamically changing inventory. Such settings pose challenges in
perceiving the identity, physical characteristics, and state of objects during
manipulation. Existing datasets for robotic manipulation consider a limited set
of objects or utilize 3D models to generate synthetic scenes, which limits their
ability to capture the variety of object properties, clutter, and interactions. We
present a large-scale dataset collected in an Amazon warehouse using a robotic
manipulator performing object singulation from containers with heterogeneous
contents. ARMBench contains images, videos, and metadata corresponding to
235K+ pick-and-place activities on 190K+ unique objects. The data is captured
at different stages of manipulation, i.e., pre-pick, during transfer, and after
placement. Benchmark tasks are proposed on the basis of high-quality annotations,
and baseline performance evaluations are presented for three visual perception
challenges, namely 1) object segmentation in clutter, 2) object identification,
and 3) defect detection. ARMBench can be accessed at http://armbench.com
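As a minimal, hedged sketch of the kind of evaluation used for the object segmentation in clutter task (this is not the official ARMBench toolkit; the file layout, mask format, and matching protocol here are assumptions for illustration), the snippet below scores predicted instance masks against ground-truth masks using IoU-based greedy matching:

```python
# Hypothetical evaluation sketch for segmentation-in-clutter benchmarks.
# Not ARMBench's official code; mask format and matching rule are assumed.
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-union of two boolean masks of the same shape."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / float(union) if union > 0 else 0.0

def match_instances(pred_masks, gt_masks, iou_thresh=0.5):
    """Greedily match predictions to ground truth; return precision and recall."""
    matched_gt = set()
    true_pos = 0
    for pm in pred_masks:
        ious = [mask_iou(pm, gm) if i not in matched_gt else 0.0
                for i, gm in enumerate(gt_masks)]
        if ious and max(ious) >= iou_thresh:
            matched_gt.add(int(np.argmax(ious)))
            true_pos += 1
    precision = true_pos / len(pred_masks) if pred_masks else 0.0
    recall = true_pos / len(gt_masks) if gt_masks else 0.0
    return precision, recall

if __name__ == "__main__":
    # Toy example: one ground-truth object, one overlapping prediction.
    gt = [np.zeros((8, 8), dtype=bool)]
    gt[0][2:6, 2:6] = True
    pred = [np.zeros((8, 8), dtype=bool)]
    pred[0][3:6, 3:6] = True
    print(match_instances(pred, gt))  # -> (1.0, 1.0) at IoU threshold 0.5
```

A real benchmark evaluation would typically sweep the IoU threshold and aggregate average precision over all images, but the per-image matching step would look similar to the above.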
Related papers
- PickScan: Object discovery and reconstruction from handheld interactions [99.99566882133179]
We develop an interaction-guided and class-agnostic method to reconstruct 3D representations of scenes.
Our main contribution is a novel approach to detecting user-object interactions and extracting the masks of manipulated objects.
Compared to Co-Fusion, the only comparable interaction-based and class-agnostic baseline, this corresponds to a 73% reduction in Chamfer distance.
arXiv Detail & Related papers (2024-11-17T23:09:08Z)
- Spot-Compose: A Framework for Open-Vocabulary Object Retrieval and Drawer Manipulation in Point Clouds [45.87961177297602]
This work aims to integrate recent methods into a comprehensive framework for robotic interaction and manipulation in human-centric environments.
Specifically, we leverage 3D reconstructions from a commodity 3D scanner for open-vocabulary instance segmentation.
We show the performance and robustness of our model in two sets of real-world experiments including dynamic object retrieval and drawer opening.
arXiv Detail & Related papers (2024-04-18T18:01:15Z)
- Multi-task real-robot data with gaze attention for dual-arm fine manipulation [4.717749411286867]
This paper introduces a dataset of diverse object manipulations that includes dual-arm tasks and/or tasks requiring fine manipulation.
We have generated a dataset with 224k episodes (150 hours, 1,104 language instructions) that includes dual-arm fine tasks such as bowl moving, pencil-case opening, and banana peeling.
The dataset includes visual attention signals as well as dual-action labels, a signal that separates each action into a robust reaching trajectory and a precise interaction with objects, and language instructions to achieve robust and precise object manipulation.
arXiv Detail & Related papers (2024-01-15T11:20:34Z)
- M2T2: Multi-Task Masked Transformer for Object-centric Pick and Place [44.303123422422246]
M2T2 is a single model that supplies different types of low-level actions that work robustly on arbitrary objects in cluttered scenes.
M2T2 is trained on a large-scale synthetic dataset with 128K scenes and achieves zero-shot sim2real transfer on the real robot.
arXiv Detail & Related papers (2023-11-02T01:42:52Z)
- GAMMA: Generalizable Articulation Modeling and Manipulation for Articulated Objects [53.965581080954905]
We propose a novel framework of Generalizable Articulation Modeling and Manipulating for Articulated Objects (GAMMA).
GAMMA learns both articulation modeling and grasp pose affordance from diverse articulated objects with different categories.
Results show that GAMMA significantly outperforms SOTA articulation modeling and manipulation algorithms on unseen and cross-category articulated objects.
arXiv Detail & Related papers (2023-09-28T08:57:14Z)
- ROAM: Robust and Object-Aware Motion Generation Using Neural Pose Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object.
We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object.
We demonstrate substantial improvements in 3D virtual character motion and interaction quality and robustness to scenarios with unseen objects.
arXiv Detail & Related papers (2023-08-24T17:59:51Z)
- HANDAL: A Dataset of Real-World Manipulable Object Categories with Pose Annotations, Affordances, and Reconstructions [17.9178233068395]
We present the HANDAL dataset for category-level object pose estimation and affordance prediction.
The dataset consists of 308k annotated image frames from 2.2k videos of 212 real-world objects in 17 categories.
We outline the usefulness of our dataset for 6-DoF category-level pose+scale estimation and related tasks.
arXiv Detail & Related papers (2023-08-02T23:59:59Z)
- SupeRGB-D: Zero-shot Instance Segmentation in Cluttered Indoor Environments [67.34330257205525]
In this work, we explore zero-shot instance segmentation (ZSIS) from RGB-D data to identify unseen objects in a semantic category-agnostic manner.
We present a method that uses annotated objects to learn the "objectness" of pixels and generalize to unseen object categories in cluttered indoor environments.
arXiv Detail & Related papers (2022-12-22T17:59:48Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z)
- Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation [75.83319382105894]
We present Neural Descriptor Fields (NDFs), an object representation that encodes both points and relative poses between an object and a target.
NDFs are trained in a self-supervised fashion via a 3D auto-encoding task that does not rely on expert-labeled keypoints.
Our performance generalizes across both object instances and 6-DoF object poses, and significantly outperforms a recent baseline that relies on 2D descriptors.
arXiv Detail & Related papers (2021-12-09T18:57:15Z)