Enhancing Object Discovery for Unsupervised Instance Segmentation and Object Detection
- URL: http://arxiv.org/abs/2508.02386v1
- Date: Mon, 04 Aug 2025 13:10:39 GMT
- Title: Enhancing Object Discovery for Unsupervised Instance Segmentation and Object Detection
- Authors: Xingyu Feng, Hebei Gao, Hong Li
- Abstract summary: COLER is a zero-shot unsupervised model that outperforms previous state-of-the-art methods on multiple benchmarks. We have designed several novel yet simple modules that allow CutOnce to fully leverage the object discovery capabilities of self-supervised models.
- Score: 2.0306212295074366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose Cut-Once-and-LEaRn (COLER), a simple approach for unsupervised instance segmentation and object detection. COLER first uses our developed CutOnce to generate coarse pseudo labels, then enables the detector to learn from these masks. CutOnce applies Normalized Cut only once and does not rely on any clustering methods, but it can generate multiple object masks in an image. We have designed several novel yet simple modules that not only allow CutOnce to fully leverage the object discovery capabilities of self-supervised models, but also free it from reliance on mask post-processing. During training, COLER achieves strong performance without requiring specially designed loss functions for pseudo labels, and its performance is further improved through self-training. COLER is a zero-shot unsupervised model that outperforms previous state-of-the-art methods on multiple benchmarks. We believe our method can help advance the field of unsupervised object localization.
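To make the "cut once" step concrete, here is a minimal sketch of a single Normalized Cut over a patch-affinity graph built from frozen self-supervised ViT (DINO-style) patch features. The threshold value, the foreground heuristic, and the function names are illustrative assumptions, not the paper's actual CutOnce modules, which additionally extract multiple object masks from one cut without clustering or mask post-processing.

```python
# Minimal sketch of "cut once": a single Normalized Cut over a patch-affinity
# graph built from frozen self-supervised ViT (DINO-style) patch features.
# The threshold tau and the foreground heuristic are illustrative assumptions;
# CutOnce's modules for producing multiple masks from one cut are not shown.
import numpy as np
from scipy.linalg import eigh

def ncut_once(patch_feats, tau=0.2):
    """patch_feats: (N, D) patch embeddings from a frozen self-supervised ViT."""
    # Cosine-similarity affinity between patches.
    f = patch_feats / np.linalg.norm(patch_feats, axis=1, keepdims=True)
    W = f @ f.T
    W = np.where(W > tau, 1.0, 1e-5)  # sparsify while keeping the graph connected

    # Normalized Cut: solve (D - W) x = lambda * D x and take the eigenvector
    # with the second-smallest eigenvalue (the Fiedler vector).
    D = np.diag(W.sum(axis=1))
    _, vecs = eigh(D - W, D)
    fiedler = vecs[:, 1]

    # Bipartition the patches; one side is treated as foreground.
    return fiedler > fiedler.mean()

# Usage with hypothetical 14 x 14 = 196 patch tokens of dimension 768:
# mask = ncut_once(vit_patch_features).reshape(14, 14)  # vit_patch_features: (196, 768)
```

This is the same generalized eigenvalue formulation used by earlier NCut-based object discovery work; the abstract's claim is that one such cut, combined with CutOnce's added modules, is enough to produce coarse multi-object pseudo masks.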
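The second stage, a detector learning from the pseudo masks followed by self-training, could look roughly like the sketch below. The detector choice (an off-the-shelf torchvision Mask R-CNN), optimizer, and confidence threshold are assumptions for illustration; the abstract does not specify COLER's detector or training schedule.

```python
# Rough sketch (not COLER's actual training code) of learning a detector from
# pseudo masks and then self-training on its own confident predictions.
# Pseudo masks are assumed to be non-empty boolean (K, H, W) tensors per image.
import torch
import torchvision

def masks_to_targets(pseudo_masks):
    """Turn binary pseudo masks (K, H, W) into a torchvision-style detection target."""
    boxes = torchvision.ops.masks_to_boxes(pseudo_masks)
    return {
        "boxes": boxes,
        "labels": torch.ones(len(pseudo_masks), dtype=torch.int64),  # one class-agnostic "object" class
        "masks": pseudo_masks.to(torch.uint8),
    }

def train_round(model, images, pseudo_masks_per_image, epochs=1, lr=1e-4):
    """One round of ordinary detection/segmentation training on pseudo labels."""
    optim = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for img, masks in zip(images, pseudo_masks_per_image):
            losses = model([img], [masks_to_targets(masks)])  # standard Mask R-CNN losses
            total = sum(losses.values())
            optim.zero_grad()
            total.backward()
            optim.step()
    return model

@torch.no_grad()
def regenerate_pseudo_labels(model, images, score_thresh=0.7):
    """Self-training: the detector's confident predictions become the next round's labels."""
    model.eval()
    new_labels = []
    for img in images:
        pred = model([img])[0]
        keep = pred["scores"] > score_thresh
        new_labels.append(pred["masks"][keep, 0] > 0.5)  # (K, H, W) boolean masks
    return new_labels
```

A class-agnostic setup with two classes (background plus "object"), e.g. torchvision.models.detection.maskrcnn_resnet50_fpn(weights=None, num_classes=2), would match the zero-shot nature of the task; a self-training round then simply swaps the CutOnce masks for the detector's own confident predictions.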
Related papers
- ProMerge: Prompt and Merge for Unsupervised Instance Segmentation [4.297070083645049]
Unsupervised instance segmentation aims to segment distinct object instances in an image without relying on human-labeled data.
Recent state-of-the-art approaches use self-supervised features to represent images as graphs and solve a generalized eigenvalue system to generate foreground masks.
We propose Prompt and Merge (ProMerge), which leverages self-supervised visual features to obtain initial groupings of patches and applies a strategic merging to these segments.
arXiv Detail & Related papers (2024-09-27T17:59:42Z)
- Unsupervised Universal Image Segmentation [59.0383635597103]
We propose an Unsupervised Universal model (U2Seg) adept at performing various image segmentation tasks.
U2Seg generates pseudo semantic labels for these segmentation tasks via leveraging self-supervised models.
We then self-train the model on these pseudo semantic labels, yielding substantial performance gains.
arXiv Detail & Related papers (2023-12-28T18:59:04Z)
- Segment, Select, Correct: A Framework for Weakly-Supervised Referring Segmentation [63.13635858586001]
Referring Image Segmentation (RIS) is the problem of identifying objects in images through natural language sentences.
We propose a novel weakly-supervised framework that tackles RIS by decomposing it into three steps.
Using only the first two steps (zero-shot segment and select) outperforms other zero-shot baselines by as much as 16.5%.
arXiv Detail & Related papers (2023-10-20T13:20:17Z)
- Object-Centric Multiple Object Tracking [124.30650395969126]
This paper proposes a video object-centric model for multiple-object tracking pipelines.
It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module.
Benefiting from object-centric learning, it requires only sparse detection labels for object localization and feature binding.
arXiv Detail & Related papers (2023-09-01T03:34:12Z)
- Cut and Learn for Unsupervised Object Detection and Instance Segmentation [65.43627672225624]
Cut-and-LEaRn (CutLER) is a simple approach for training unsupervised object detection and segmentation models.
CutLER is a zero-shot unsupervised detector and improves detection performance AP50 by over 2.7 times on 11 benchmarks.
arXiv Detail & Related papers (2023-01-26T18:57:13Z)
- Unsupervised Object Localization: Observing the Background to Discover Objects [4.870509580034194]
In this work, we take a different approach and propose to look for the background instead.
This way, the salient objects emerge as a by-product without any strong assumption on what an object should be.
We propose FOUND, a simple model made of a single $1\times 1$ convolution initialized with coarse background masks extracted from self-supervised patch-based representations.
arXiv Detail & Related papers (2022-12-15T13:43:11Z)
- Object-wise Masked Autoencoders for Fast Pre-training [13.757095663704858]
We show that current masked image encoding models learn the underlying relationship between all objects in the whole scene, instead of a single object representation.
We introduce a novel object selection and division strategy that drops non-object patches and learns object-wise representations via selective reconstruction with region-of-interest masks.
Experiments on four commonly-used datasets demonstrate the effectiveness of our model in reducing the compute cost by 72% while achieving competitive performance.
arXiv Detail & Related papers (2022-05-28T05:13:45Z)
- FreeSOLO: Learning to Segment Objects without Annotations [191.82134817449528]
We present FreeSOLO, a self-supervised instance segmentation framework built on top of the simple instance segmentation method SOLO.
Our method also presents a novel localization-aware pre-training framework, where objects can be discovered from complicated scenes in an unsupervised manner.
arXiv Detail & Related papers (2022-02-24T16:31:44Z)
- Weakly-Supervised Saliency Detection via Salient Object Subitizing [57.17613373230722]
We introduce saliency subitizing as weak supervision since it is class-agnostic.
This keeps the supervision aligned with the nature of saliency detection.
We conduct extensive experiments on five benchmark datasets.
arXiv Detail & Related papers (2021-01-04T12:51:45Z)