SupeRGB-D: Zero-shot Instance Segmentation in Cluttered Indoor
Environments
- URL: http://arxiv.org/abs/2212.11922v2
- Date: Thu, 25 May 2023 12:25:24 GMT
- Title: SupeRGB-D: Zero-shot Instance Segmentation in Cluttered Indoor
Environments
- Authors: Evin Pınar Örnek, Aravindhan K Krishnan, Shreekant Gayaka,
Cheng-Hao Kuo, Arnie Sen, Nassir Navab, Federico Tombari
- Abstract summary: In this work, we explore zero-shot instance segmentation (ZSIS) from RGB-D data to identify unseen objects in a semantic category-agnostic manner.
We present a method that uses annotated objects to learn the "objectness" of pixels and generalize to unseen object categories in cluttered indoor environments.
- Score: 67.34330257205525
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Object instance segmentation is a key challenge for indoor robots navigating
cluttered environments with many small objects. Limitations in 3D sensing
capabilities often make it difficult to detect every possible object. While
deep learning approaches may be effective for this problem, manually annotating
3D data for supervised learning is time-consuming. In this work, we explore
zero-shot instance segmentation (ZSIS) from RGB-D data to identify unseen
objects in a semantic category-agnostic manner. We introduce a zero-shot split
for Tabletop Objects Dataset (TOD-Z) to enable this study and present a method
that uses annotated objects to learn the "objectness" of pixels and
generalize to unseen object categories in cluttered indoor environments. Our
method, SupeRGB-D, groups pixels into small patches based on geometric cues and
learns to merge the patches in a deep agglomerative clustering fashion.
SupeRGB-D outperforms existing baselines on unseen objects while achieving
similar performance on seen objects. We further show competitive results on the
real dataset OCID. With its lightweight design (0.4 MB memory requirement), our
method is well suited to mobile and robotic applications. Additional
DINO features can increase performance with a higher memory requirement. The
dataset split and code are available at https://github.com/evinpinar/supergb-d.
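The abstract's pipeline (group pixels into small patches from geometric cues, then merge the patches agglomeratively) can be sketched as follows. This is a minimal stand-in: the learned merge network is replaced by a fixed depth-similarity rule, and the toy depth map, patch size, and 5 cm threshold are all illustrative assumptions, not the paper's actual features or parameters.

```python
import numpy as np

# Toy 8x8 depth map (meters): a flat table plane at 1.00 m
# with two closer objects resting on it.
depth = np.full((8, 8), 1.00)
depth[2:4, 2:6] = 0.80   # object A
depth[6:8, 2:6] = 0.85   # object B

# Step 1: over-segment into 2x2 patches; mean patch depth stands in
# for the richer geometric cues used by the paper.
P = 2
rows, cols = depth.shape[0] // P, depth.shape[1] // P
patch_depth = depth.reshape(rows, P, cols, P).mean(axis=(1, 3))

# Step 2: agglomerative merging of 4-adjacent patches. SupeRGB-D learns
# the merge decision with a network; a fixed depth-similarity threshold
# is a hand-crafted stand-in here.
label = np.arange(rows * cols).reshape(rows, cols)
THRESH = 0.05  # merge patches whose mean depths differ by < 5 cm
changed = True
while changed:
    changed = False
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbors
                r2, c2 = r + dr, c + dc
                if (r2 < rows and c2 < cols
                        and label[r, c] != label[r2, c2]
                        and abs(patch_depth[r, c] - patch_depth[r2, c2]) < THRESH):
                    old = label[r2, c2]
                    label[label == old] = label[r, c]
                    changed = True

n_segments = len(np.unique(label))
print(n_segments)  # table plane + object A + object B -> 3
```

Because merging is driven by depth similarity rather than category labels, the same rule applies to object categories never seen during training, which is the core idea behind the zero-shot setting.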
Related papers
- Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast
Contrastive Fusion [110.84357383258818]
We propose a novel approach to lift 2D segments to 3D and fuse them by means of a neural field representation.
The core of our approach is a slow-fast clustering objective function, which is scalable and well-suited for scenes with a large number of objects.
Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets.
arXiv Detail & Related papers (2023-06-07T17:57:45Z)
- 3D Small Object Detection with Dynamic Spatial Pruning [62.72638845817799]
We propose an efficient feature pruning strategy for 3D small object detection.
We present a multi-level 3D detector named DSPDet3D which benefits from high spatial resolution.
It takes less than 2 s to directly process a whole building of more than 4500k points while detecting almost all objects.
arXiv Detail & Related papers (2023-05-05T17:57:04Z)
- Learning Object-level Point Augmentor for Semi-supervised 3D Object Detection [85.170578641966]
We propose an object-level point augmentor (OPA) that performs local transformations for semi-supervised 3D object detection.
In this way, the resultant augmentor is derived to emphasize object instances rather than irrelevant backgrounds.
Experiments on the ScanNet and SUN RGB-D datasets show that the proposed OPA performs favorably against the state-of-the-art methods.
arXiv Detail & Related papers (2022-12-19T06:56:14Z)
- FewSOL: A Dataset for Few-Shot Object Learning in Robotic Environments [21.393674766169543]
We introduce the Few-Shot Object Learning dataset for object recognition with a few images per object.
We captured 336 real-world objects with 9 RGB-D images per object from different views.
The evaluation results show that there remains a large margin for improvement in few-shot object classification in robotic environments.
arXiv Detail & Related papers (2022-07-06T05:57:24Z)
- Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation [67.88276573341734]
We propose a new method for unseen object instance segmentation by learning RGB-D feature embeddings from synthetic data.
A metric learning loss function is utilized to learn to produce pixel-wise feature embeddings.
We further improve the segmentation accuracy with a new two-stage clustering algorithm.
arXiv Detail & Related papers (2020-07-30T00:23:07Z)
- Unseen Object Instance Segmentation for Robotic Environments [67.88276573341734]
We propose a method to segment unseen object instances in tabletop environments.
UOIS-Net comprises two stages: first, it operates only on depth to produce object instance center votes in 2D or 3D.
Surprisingly, our framework is able to learn from synthetic RGB-D data where the RGB is non-photorealistic.
arXiv Detail & Related papers (2020-07-16T01:59:13Z)
- Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning [5.699350798684963]
We propose a simple yet efficient algorithm for 3D instance segmentation using deep metric learning.
3D instance segmentation recognizes individual object instances, a prerequisite for high-level intelligent tasks on large-scale scenes.
We demonstrate the state-of-the-art performance of our algorithm on the ScanNet 3D instance segmentation benchmark in terms of AP score.
arXiv Detail & Related papers (2020-07-07T02:17:44Z)
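Two of the entries above (learning RGB-D feature embeddings for unseen objects; deep metric learning for 3D instance segmentation) rely on metric-learning losses over per-pixel or per-point embeddings. A minimal numpy sketch of one common discriminative (pull/push) embedding loss follows; the margins delta_v and delta_d and the toy data are illustrative assumptions, not the exact formulation of either paper.

```python
import numpy as np

# Toy pixel embeddings (150 pixels, 2-D) for three ground-truth instances.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 50)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
emb = centers[labels] + 0.1 * rng.standard_normal((150, 2))

# Pull term: pixels farther than delta_v from their instance mean are
# penalized. Push term: instance means closer than delta_d repel each other.
delta_v, delta_d = 0.5, 3.0
means = np.stack([emb[labels == k].mean(axis=0) for k in range(3)])
pull = np.mean([
    np.mean(np.maximum(
        np.linalg.norm(emb[labels == k] - means[k], axis=1) - delta_v, 0.0) ** 2)
    for k in range(3)
])
push = np.mean([
    np.maximum(delta_d - np.linalg.norm(means[i] - means[j]), 0.0) ** 2
    for i in range(3) for j in range(3) if i < j
])
loss = pull + push
print(loss)  # near zero when the embeddings already separate the instances
```

At inference time such embeddings are grouped by a clustering stage (for example mean shift), which is the role of the "two-stage clustering algorithm" mentioned in the RGB-D feature embedding summary above.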
This list is automatically generated from the titles and abstracts of the papers in this site.