Related papers: Boosting Unsupervised Semantic Segmentation with Principal Mask Proposals

URL: http://arxiv.org/abs/2404.16818v1
Date: Thu, 25 Apr 2024 17:58:09 GMT
Title: Boosting Unsupervised Semantic Segmentation with Principal Mask Proposals
Authors: Oliver Hahn, Nikita Araslanov, Simone Schaub-Meyer, Stefan Roth
Abstract summary: Unsupervised semantic segmentation aims to automatically partition images into semantically meaningful regions by identifying global categories within an image corpus without any form of annotation. We present PriMaPs - Principal Mask Proposals - which decompose images into semantically meaningful masks based on their feature representation. This allows us to realize unsupervised semantic segmentation by fitting class prototypes to PriMaPs with an expectation-maximization algorithm, PriMaPs-EM. PriMaPs-EM leads to competitive results across various pre-trained backbone models, including DINO and DINOv2, and across datasets, such as Cityscapes, COCO-
Score: 15.258631373740686
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Unsupervised semantic segmentation aims to automatically partition images into semantically meaningful regions by identifying global categories within an image corpus without any form of annotation. Building upon recent advances in self-supervised representation learning, we focus on how to leverage these large pre-trained models for the downstream task of unsupervised segmentation. We present PriMaPs - Principal Mask Proposals - decomposing images into semantically meaningful masks based on their feature representation. This allows us to realize unsupervised semantic segmentation by fitting class prototypes to PriMaPs with a stochastic expectation-maximization algorithm, PriMaPs-EM. Despite its conceptual simplicity, PriMaPs-EM leads to competitive results across various pre-trained backbone models, including DINO and DINOv2, and across datasets, such as Cityscapes, COCO-Stuff, and Potsdam-3. Importantly, PriMaPs-EM is able to boost results when applied orthogonally to current state-of-the-art unsupervised semantic segmentation pipelines.
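The abstract describes fitting class prototypes to mask proposals with a stochastic expectation-maximization algorithm. The following is a minimal sketch of that general idea only, not the authors' PriMaPs-EM implementation: it assumes each mask has already been reduced to one pooled feature vector, and it uses a generic mini-batch hard-EM loop (in the style of spherical k-means) with a moving-average prototype update. The function names, the momentum update, and all hyperparameters are assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products act as cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def fit_prototypes(features, num_classes, steps=200, batch_size=64,
                   momentum=0.9, seed=0):
    """Fit class prototypes to feature vectors with mini-batch hard EM.

    `features` is an (N, D) array; here it stands in for hypothetical
    per-mask pooled embeddings. This is a generic sketch, not the method
    from the paper.
    """
    rng = np.random.default_rng(seed)
    feats = l2_normalize(np.asarray(features, dtype=np.float64))

    # Initialize prototypes from distinct randomly chosen samples.
    init_idx = rng.choice(len(feats), size=num_classes, replace=False)
    prototypes = feats[init_idx].copy()

    size = min(batch_size, len(feats))
    for _ in range(steps):
        batch = feats[rng.choice(len(feats), size=size, replace=False)]

        # E-step: assign each sample to its most similar prototype.
        assign = np.argmax(batch @ prototypes.T, axis=1)

        # M-step: move each prototype toward the mean of its members.
        for k in range(num_classes):
            members = batch[assign == k]
            if len(members) == 0:
                continue  # leave prototypes without members unchanged
            target = l2_normalize(members.mean(axis=0))
            prototypes[k] = l2_normalize(
                momentum * prototypes[k] + (1.0 - momentum) * target
            )
    return prototypes


def assign_classes(features, prototypes):
    """Label each feature vector with the index of its nearest prototype."""
    feats = l2_normalize(np.asarray(features, dtype=np.float64))
    return np.argmax(feats @ prototypes.T, axis=1)
```

Because the classes are discovered without annotation, the resulting indices are arbitrary cluster IDs; evaluating against ground truth would require a separate matching step between cluster IDs and labeled categories.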

Related papers

SOHES: Self-supervised Open-world Hierarchical Entity Segmentation [82.45303116125021]
This work presents Self-supervised Open-world Hierarchical Entity Segmentation (SOHES), a novel approach that eliminates the need for human annotations. We produce abundant high-quality pseudo-labels through visual feature clustering, and rectify the noise in the pseudo-labels via a teacher-student mutual-learning procedure. Using raw images as the sole training data, our method achieves unprecedented performance in self-supervised open-world segmentation.
arXiv Detail & Related papers (2024-04-18T17:59:46Z)
Unsupervised Universal Image Segmentation [59.0383635597103]
We propose an Unsupervised Universal model (U2Seg) adept at performing various image segmentation tasks. U2Seg generates pseudo semantic labels for these segmentation tasks by leveraging self-supervised models. We then self-train the model on these pseudo semantic labels, yielding substantial performance gains.
arXiv Detail & Related papers (2023-12-28T18:59:04Z)
MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation [110.09800389100599]
We propose MixReorg, a novel and straightforward pre-training paradigm for semantic segmentation. Our approach involves generating fine-grained patch-text pair data by mixing image patches while preserving the correspondence between patches and text. With MixReorg as a mask learner, conventional text-supervised semantic segmentation models can achieve highly generalizable pixel-semantic alignment ability.
arXiv Detail & Related papers (2023-08-09T09:35:16Z)
Location-Aware Self-Supervised Transformers [74.76585889813207]
We propose to pretrain networks for semantic segmentation by predicting the relative location of image parts. We control the difficulty of the task by masking a subset of the reference patch features visible to those of the query. Our experiments show that this location-aware pretraining leads to representations that transfer competitively to several challenging semantic segmentation benchmarks.
arXiv Detail & Related papers (2022-12-05T16:24:29Z)
Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation [75.00151934315967]
MaskDistill is a novel framework for unsupervised semantic segmentation. Our framework does not latch onto low-level image cues and is not limited to object-centric datasets.
arXiv Detail & Related papers (2022-06-13T17:59:43Z)
TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation [44.75300205362518]
Unsupervised semantic segmentation aims to obtain high-level semantic representation on low-level visual features without manual annotations. We propose the first top-down unsupervised semantic segmentation framework for fine-grained segmentation in extremely complicated scenarios. Our results show that our top-down unsupervised segmentation is robust to both object-centric and scene-centric datasets.
arXiv Detail & Related papers (2021-12-02T18:59:03Z)
Per-Pixel Classification is Not All You Need for Semantic Segmentation [184.2905747595058]
Mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks. We propose MaskFormer, a simple mask classification model which predicts a set of binary masks. Our method outperforms both current state-of-the-art semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO) models.
arXiv Detail & Related papers (2021-07-13T17:59:50Z)
Unsupervised Image Segmentation by Mutual Information Maximization and Adversarial Regularization [7.165364364478119]
We propose a novel fully unsupervised semantic segmentation method, the so-called Information Maximization and Adversarial Regularization (InMARS). Inspired by human perception, which parses a scene into perceptual groups, our proposed approach first partitions an input image into meaningful regions (also known as superpixels). Next, it utilizes Mutual-Information-Maximization followed by an adversarial training strategy to cluster these regions into semantically meaningful classes. Our experiments demonstrate that our method achieves state-of-the-art performance on two commonly used unsupervised semantic segmentation datasets.
arXiv Detail & Related papers (2021-07-01T18:36:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
