DoUnseen: Tuning-Free Class-Adaptive Object Detection of Unseen Objects for Robotic Grasping
- URL: http://arxiv.org/abs/2304.02833v2
- Date: Mon, 27 Nov 2023 12:10:09 GMT
- Title: DoUnseen: Tuning-Free Class-Adaptive Object Detection of Unseen Objects for Robotic Grasping
- Authors: Anas Gouda, Moritz Roidl
- Abstract summary: We develop an object detector that requires no fine-tuning and can add any object as a class just by capturing a few images of the object.
We evaluate our class-adaptive object detector on unseen datasets and compare it to a trained Mask R-CNN on those datasets.
- Score: 1.6317061277457001
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How can we segment varying numbers of objects where each specific object
represents its own separate class? To make the problem even more realistic, how
can we add and delete classes on the fly without retraining or fine-tuning?
This is the case in robotic applications where no datasets of the objects exist,
or in applications that involve thousands of objects (e.g., in logistics), where
it is impossible to train a single model to learn all of them. Most current
research on object segmentation for robotic grasping focuses on class-level
object segmentation (e.g., box, cup, bottle), closed sets (the specific objects
of a dataset, for example the YCB dataset), or deep learning-based template matching.
In this work, we are interested in open sets where the number of classes is
unknown and varying, and where there is no prior knowledge of the objects' types. We
consider each specific object as its own separate class. Our goal is to develop
an object detector that requires no fine-tuning and can add any object as a
class just by capturing a few images of the object. Our main idea is to break
the segmentation pipeline into two steps: an unseen object
segmentation network cascaded with a class-adaptive classifier. We evaluate our
class-adaptive object detector on unseen datasets and compare it to a trained
Mask R-CNN on those datasets. The results show that the performance varies from
practical to unsuitable depending on the environment setup and the objects
being handled. The code is available in our DoUnseen library repository.
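The abstract outlines the method at a high level; the sketch below illustrates that two-step idea in code. It is a minimal, hypothetical rendering, not the DoUnseen implementation: it assumes torchvision's pretrained Mask R-CNN as a stand-in class-agnostic proposal generator, a frozen ResNet-50 as the embedding backbone, and cosine matching against per-class prototypes built from a few reference images; all function and variable names are illustrative.

```python
# Sketch of a two-step pipeline: class-agnostic segmentation, then a
# class-adaptive classifier that matches each detected crop against a small
# gallery of reference images. Model choices are assumptions, not DoUnseen's.
import torch
import torchvision
from torchvision import transforms
from torchvision.models.detection import maskrcnn_resnet50_fpn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Step 1: class-agnostic instance proposals (stand-in for an unseen-object
# segmentation network).
segmenter = maskrcnn_resnet50_fpn(weights="DEFAULT").eval().to(device)

# Step 2 backbone: ResNet-50 with the classification head removed, giving a
# 2048-d embedding per crop.
backbone = torchvision.models.resnet50(weights="DEFAULT")
backbone.fc = torch.nn.Identity()
backbone = backbone.eval().to(device)

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ConvertImageDtype(torch.float),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(crops):
    """Embed a list of CHW uint8 crops and L2-normalize the features."""
    batch = torch.stack([preprocess(c) for c in crops]).to(device)
    return torch.nn.functional.normalize(backbone(batch), dim=1)

# "Registering" a class requires no training: embed a few reference images.
gallery = {}  # class name -> (k, 2048) tensor of reference embeddings

def add_class(name, reference_crops):
    gallery[name] = embed(reference_crops)

def remove_class(name):
    gallery.pop(name, None)

@torch.no_grad()
def detect(image, score_thr=0.7, match_thr=0.5):
    """Segment class-agnostically, then label each crop by its nearest gallery class."""
    outputs = segmenter([(image.float() / 255.0).to(device)])[0]
    boxes = outputs["boxes"][outputs["scores"] > score_thr].round().int()
    crops = [image[:, y1:y2, x1:x2] for x1, y1, x2, y2 in boxes.tolist()]
    if not crops or not gallery:
        return []
    feats = embed(crops)                                   # (n, 2048)
    names = list(gallery.keys())
    protos = torch.nn.functional.normalize(
        torch.stack([gallery[n].mean(dim=0) for n in names]), dim=1)
    sims = feats @ protos.T                                # cosine similarities
    best_sim, best_idx = sims.max(dim=1)
    return [(names[int(i)], float(s), box.tolist())
            for s, i, box in zip(best_sim, best_idx, boxes) if s > match_thr]
```

Under these assumptions, adding or deleting a class is just inserting or removing a gallery entry, which mirrors the "on the fly, without retraining or fine-tuning" requirement stated in the abstract.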
Related papers
- 1st Place Solution for MOSE Track in CVPR 2024 PVUW Workshop: Complex Video Object Segmentation [72.54357831350762]
We propose a semantic embedding video object segmentation model and use the salient features of objects as query representations.
We trained our model on a large-scale video object segmentation dataset.
Our model achieves first place (84.45%) in the test set of the Complex Video Object Challenge.
arXiv Detail & Related papers (2024-06-07T03:13:46Z)
- Object-Centric Multiple Object Tracking [124.30650395969126]
This paper proposes a video object-centric model for multiple-object tracking pipelines.
It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module.
Benefiting from object-centric learning, we require only sparse detection labels for object localization and feature binding.
arXiv Detail & Related papers (2023-09-01T03:34:12Z)
- A Unified Object Counting Network with Object Occupation Prior [32.32999623924954]
Existing object counting tasks are designed for a single object class.
In the real world, it is inevitable to encounter new data containing new classes.
We build the first evolving object counting dataset and propose a unified object counting network.
arXiv Detail & Related papers (2022-12-29T06:42:51Z)
- SupeRGB-D: Zero-shot Instance Segmentation in Cluttered Indoor Environments [67.34330257205525]
In this work, we explore zero-shot instance segmentation (ZSIS) from RGB-D data to identify unseen objects in a semantic category-agnostic manner.
We present a method that uses annotated objects to learn the "objectness" of pixels and generalize to unseen object categories in cluttered indoor environments.
arXiv Detail & Related papers (2022-12-22T17:59:48Z)
- Image Segmentation-based Unsupervised Multiple Objects Discovery [1.7674345486888503]
Unsupervised object discovery aims to localize objects in images.
We propose a fully unsupervised, bottom-up approach, for multiple objects discovery.
We provide state-of-the-art results for both unsupervised class-agnostic object detection and unsupervised image segmentation.
arXiv Detail & Related papers (2022-12-20T09:48:24Z)
- FewSOL: A Dataset for Few-Shot Object Learning in Robotic Environments [21.393674766169543]
We introduce the Few-Shot Object Learning dataset for object recognition with a few images per object.
We captured 336 real-world objects with 9 RGB-D images per object from different views.
The evaluation results show that there is still a large margin to be improved for few-shot object classification in robotic environments.
arXiv Detail & Related papers (2022-07-06T05:57:24Z)
- Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z)
- Learning Open-World Object Proposals without Learning to Classify [110.30191531975804]
We propose a classification-free Object Localization Network (OLN) which estimates the objectness of each region purely by how well the location and shape of a region overlaps with any ground-truth object.
This simple strategy learns generalizable objectness and outperforms existing proposals on cross-category generalization.
arXiv Detail & Related papers (2021-08-15T14:36:02Z)
- DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping [72.84991726271024]
We describe an unsupervised method to detect and segment portions of images of live scenes that are seen moving as a coherent whole.
Our method first partitions the motion field by minimizing the mutual information between segments.
It uses the segments to learn object models that can be used for detection in a static image.
arXiv Detail & Related papers (2020-08-16T22:05:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.