RICE: Refining Instance Masks in Cluttered Environments with Graph
Neural Networks
- URL: http://arxiv.org/abs/2106.15711v1
- Date: Tue, 29 Jun 2021 20:29:29 GMT
- Title: RICE: Refining Instance Masks in Cluttered Environments with Graph
Neural Networks
- Authors: Christopher Xie, Arsalan Mousavian, Yu Xiang, Dieter Fox
- Abstract summary: We propose a novel framework that refines the output of existing instance segmentation methods by utilizing a graph-based representation of instance masks.
We train deep networks capable of sampling smart perturbations to the segmentations, and a graph neural network, which can encode relations between objects, to evaluate the segmentations.
We demonstrate an application that uses uncertainty estimates generated by our method to guide a manipulator, leading to efficient understanding of cluttered scenes.
- Score: 53.15260967235835
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Segmenting unseen object instances in cluttered environments is an important
capability that robots need when functioning in unstructured environments.
While previous methods have exhibited promising results, they still tend to
provide incorrect results in highly cluttered scenes. We postulate that a
network architecture that encodes relations between objects at a high level can
be beneficial. Thus, in this work, we propose a novel framework that refines
the output of such methods by utilizing a graph-based representation of
instance masks. We train deep networks capable of sampling smart perturbations
to the segmentations, and a graph neural network, which can encode relations
between objects, to evaluate the perturbed segmentations. Our proposed method
is orthogonal to previous works and achieves state-of-the-art performance when
combined with them. We demonstrate an application that uses uncertainty
estimates generated by our method to guide a manipulator, leading to efficient
understanding of cluttered scenes. Code, models, and video can be found at
https://github.com/chrisdxie/rice .
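The sample-and-score idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the perturbations (merge_two, delete_one), the scorer (toy_score), and the loop (refine) are hypothetical stand-ins. In particular, toy_score compares against a known reference segmentation purely for illustration, whereas the paper uses a learned graph neural network to evaluate candidates, and the paper's perturbations are sampled by trained networks rather than uniformly at random.

```python
import random
from typing import Callable, FrozenSet, Sequence, Tuple

Pixel = Tuple[int, int]
Mask = FrozenSet[Pixel]
Segmentation = Tuple[Mask, ...]
Perturbation = Callable[[Segmentation, random.Random], Segmentation]


def merge_two(seg: Segmentation, rng: random.Random) -> Segmentation:
    """Hypothetical perturbation: merge two randomly chosen masks into one."""
    if len(seg) < 2:
        return seg
    i, j = rng.sample(range(len(seg)), 2)
    rest = tuple(m for k, m in enumerate(seg) if k not in (i, j))
    return rest + (seg[i] | seg[j],)


def delete_one(seg: Segmentation, rng: random.Random) -> Segmentation:
    """Hypothetical perturbation: drop one randomly chosen mask."""
    if not seg:
        return seg
    i = rng.randrange(len(seg))
    return seg[:i] + seg[i + 1:]


def toy_score(seg: Segmentation, reference: Segmentation) -> float:
    """Stand-in for a learned scorer (assumption): mean best IoU per
    reference mask, normalized so extra or missing masks lower the score."""
    if not seg:
        return 0.0

    def iou(a: Mask, b: Mask) -> float:
        return len(a & b) / len(a | b)

    matched = sum(max(iou(ref, m) for m in seg) for ref in reference)
    return matched / max(len(seg), len(reference))


def refine(
    initial: Segmentation,
    score: Callable[[Segmentation], float],
    perturbations: Sequence[Perturbation],
    iterations: int = 10,
    samples: int = 8,
    seed: int = 0,
) -> Tuple[Segmentation, float]:
    """Repeatedly sample perturbed candidates and keep the highest-scoring one."""
    rng = random.Random(seed)
    best, best_score = initial, score(initial)
    for _ in range(iterations):
        # All candidates in one round are perturbations of the same segmentation.
        candidates = [rng.choice(perturbations)(best, rng) for _ in range(samples)]
        for cand in candidates:
            cand_score = score(cand)
            if cand_score > best_score:
                best, best_score = cand, cand_score
    return best, best_score


# Toy scene: object A is over-segmented into two halves, plus one spurious mask.
reference: Segmentation = (
    frozenset({(0, 0), (0, 1), (0, 2), (0, 3)}),
    frozenset({(2, 0), (2, 1)}),
)
initial: Segmentation = (
    frozenset({(0, 0), (0, 1)}),
    frozenset({(0, 2), (0, 3)}),
    frozenset({(2, 0), (2, 1)}),
    frozenset({(4, 4)}),
)
refined, refined_score = refine(
    initial, lambda s: toy_score(s, reference), [merge_two, delete_one]
)
```

Because a candidate is accepted only when it scores strictly higher than the current best, the refined score can never fall below the initial score. How good the result is depends entirely on the quality of the scorer, which is the role the paper assigns to the graph neural network.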
Related papers
- RISeg: Robot Interactive Object Segmentation via Body Frame-Invariant
Features [6.358423536732677]
We introduce a novel approach to correct inaccurate segmentation by using robot interaction and a designed body frame-invariant feature.
We demonstrate the effectiveness of our proposed interactive perception pipeline in accurately segmenting cluttered scenes by achieving an average object segmentation accuracy rate of 80.7%.
arXiv Detail & Related papers (2024-03-04T05:03:24Z)
- ICGNet: A Unified Approach for Instance-Centric Grasping [42.92991092305974]
We introduce an end-to-end architecture for object-centric grasping.
We show the effectiveness of the proposed method by extensively evaluating it against state-of-the-art methods on synthetic datasets.
arXiv Detail & Related papers (2024-01-18T12:41:41Z)
- ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent camouflaged object detection (COD) attempts to segment objects visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios.
We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images of camouflaged objects, i.e., zooming in and out.
Our framework consistently outperforms existing state-of-the-art methods in image and video COD benchmarks.
arXiv Detail & Related papers (2023-10-31T06:11:23Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- Spatiotemporal Graph Neural Network based Mask Reconstruction for Video
Object Segmentation [70.97625552643493]
This paper addresses the task of segmenting class-agnostic objects in a semi-supervised setting.
We propose a novel graph neural network (STG-Net) which captures the local contexts by utilizing all proposals.
arXiv Detail & Related papers (2020-12-10T07:57:44Z)
- Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
- Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning.
Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector.
We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z)
- Instance Segmentation of Visible and Occluded Regions for Finding and
Picking Target from a Pile of Objects [25.836334764387498]
We present a robotic system for picking a target from a pile of objects that is capable of finding and grasping the target object.
We extend an existing instance segmentation model with a novel 'relook' architecture, in which the model explicitly learns the inter-instance relationship.
Also, by using image synthesis, we make the system capable of handling new objects without human annotations.
arXiv Detail & Related papers (2020-01-21T12:28:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.