ICGNet: A Unified Approach for Instance-Centric Grasping
- URL: http://arxiv.org/abs/2401.09939v2
- Date: Thu, 9 May 2024 21:59:09 GMT
- Title: ICGNet: A Unified Approach for Instance-Centric Grasping
- Authors: René Zurbrügg, Yifan Liu, Francis Engelmann, Suryansh Kumar, Marco Hutter, Vaishakh Patil, Fisher Yu,
- Abstract summary: We introduce an end-to-end architecture for object-centric grasping.
We show the effectiveness of the proposed method by extensively evaluating it against state-of-the-art methods on synthetic datasets.
- Score: 42.92991092305974
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Accurate grasping is the key to several robotic tasks including assembly and household robotics. Executing a successful grasp in a cluttered environment requires multiple levels of scene understanding: First, the robot needs to analyze the geometric properties of individual objects to find feasible grasps. These grasps need to be compliant with the local object geometry. Second, for each proposed grasp, the robot needs to reason about the interactions with other objects in the scene. Finally, the robot must compute a collision-free grasp trajectory while taking into account the geometry of the target object. Most grasp detection algorithms directly predict grasp poses in a monolithic fashion, which does not capture the composability of the environment. In this paper, we introduce an end-to-end architecture for object-centric grasping. The method uses pointcloud data from a single arbitrary viewing direction as an input and generates an instance-centric representation for each partially observed object in the scene. This representation is further used for object reconstruction and grasp detection in cluttered table-top scenes. We show the effectiveness of the proposed method by extensively evaluating it against state-of-the-art methods on synthetic datasets, indicating superior performance for grasping and reconstruction. Additionally, we demonstrate real-world applicability by decluttering scenes with varying numbers of objects.
Related papers
- ShapeGrasp: Zero-Shot Task-Oriented Grasping with Large Language Models through Geometric Decomposition [8.654140442734354]
Task-oriented grasping of unfamiliar objects is a necessary skill for robots in dynamic in-home environments.
We present a novel zero-shot task-oriented grasping method leveraging a geometric decomposition of the target object into simple convex shapes.
Our approach employs minimal essential information - the object's name and the intended task - to facilitate zero-shot task-oriented grasping.
arXiv Detail & Related papers (2024-03-26T19:26:53Z) - LocaliseBot: Multi-view 3D object localisation with differentiable
rendering for robot grasping [9.690844449175948]
We focus on object pose estimation.
Our approach relies on three pieces of information: multiple views of the object, the camera's parameters at those viewpoints, and 3D CAD models of objects.
We show that the estimated object pose results in 99.65% grasp accuracy with the ground truth grasp candidates.
arXiv Detail & Related papers (2023-11-14T14:27:53Z) - Graphical Object-Centric Actor-Critic [55.2480439325792]
We propose a novel object-centric reinforcement learning algorithm combining actor-critic and model-based approaches.
We use a transformer encoder to extract object representations and graph neural networks to approximate the dynamics of an environment.
Our algorithm performs better in a visually complex 3D robotic environment and a 2D environment with compositional structure than the state-of-the-art model-free actor-critic algorithm.
arXiv Detail & Related papers (2023-10-26T06:05:12Z) - You Only Look at One: Category-Level Object Representations for Pose
Estimation From a Single Example [26.866356430469757]
We present a method for achieving category-level pose estimation by inspection of just a single object from a desired category.
We demonstrate that our method runs in real-time, enabling a robot manipulator equipped with an RGBD sensor to perform online 6D pose estimation for novel objects.
arXiv Detail & Related papers (2023-05-22T01:32:24Z) - Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z) - IFOR: Iterative Flow Minimization for Robotic Object Rearrangement [92.97142696891727]
IFOR, Iterative Flow Minimization for Robotic Object Rearrangement, is an end-to-end method for the problem of object rearrangement for unknown objects.
We show that our method applies to cluttered scenes, and in the real world, while training only on synthetic data.
arXiv Detail & Related papers (2022-02-01T20:03:56Z) - MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic
Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z) - DemoGrasp: Few-Shot Learning for Robotic Grasping with Human
Demonstration [42.19014385637538]
We propose to teach a robot how to grasp an object with a simple and short human demonstration.
We first present a small sequence of RGB-D images displaying a human-object interaction.
This sequence is then leveraged to build associated hand and object meshes that represent the interaction.
arXiv Detail & Related papers (2021-12-06T08:17:12Z) - Semantically Grounded Object Matching for Robust Robotic Scene
Rearrangement [21.736603698556042]
We present a novel approach to object matching that uses a large pre-trained vision-language model to match objects in a cross-instance setting.
We demonstrate that this provides considerably improved matching performance in cross-instance settings.
arXiv Detail & Related papers (2021-11-15T18:39:43Z) - RICE: Refining Instance Masks in Cluttered Environments with Graph
Neural Networks [53.15260967235835]
We propose a novel framework that refines the output of such methods by utilizing a graph-based representation of instance masks.
We train deep networks capable of sampling smart perturbations to the segmentations, and a graph neural network, which can encode relations between objects, to evaluate the segmentations.
We demonstrate an application that uses uncertainty estimates generated by our method to guide a manipulator, leading to efficient understanding of cluttered scenes.
arXiv Detail & Related papers (2021-06-29T20:29:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.