Clustering as Attention: Unified Image Segmentation with Hierarchical
Clustering
- URL: http://arxiv.org/abs/2205.09949v1
- Date: Fri, 20 May 2022 03:53:56 GMT
- Title: Clustering as Attention: Unified Image Segmentation with Hierarchical
Clustering
- Authors: Teppei Suzuki
- Abstract summary: We propose a hierarchical clustering-based image segmentation scheme for deep neural networks, called HCFormer.
We interpret image segmentation, including semantic, instance, and panoptic segmentation, as a pixel clustering problem, and accomplish it by bottom-up, hierarchical clustering with deep neural networks.
- Score: 11.696069523681178
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a hierarchical clustering-based image segmentation scheme for deep
neural networks, called HCFormer. We interpret image segmentation, including
semantic, instance, and panoptic segmentation, as a pixel clustering problem,
and accomplish it by bottom-up, hierarchical clustering with deep neural
networks. Our hierarchical clustering removes the pixel decoder from
conventional segmentation models and simplifies the segmentation pipeline,
resulting in improved segmentation accuracies and interpretability. HCFormer
can address semantic, instance, and panoptic segmentation with the same
architecture because the pixel clustering is a common approach for various
image segmentation. In experiments, HCFormer achieves comparable or superior
segmentation accuracies compared to baseline methods on semantic segmentation
(55.5 mIoU on ADE20K), instance segmentation (47.1 AP on COCO), and panoptic
segmentation (55.7 PQ on COCO).
Related papers
- Unsupervised Universal Image Segmentation [59.0383635597103]
We propose an Unsupervised Universal model (U2Seg) adept at performing various image segmentation tasks.
U2Seg generates pseudo semantic labels for these segmentation tasks via leveraging self-supervised models.
We then self-train the model on these pseudo semantic labels, yielding substantial performance gains.
arXiv Detail & Related papers (2023-12-28T18:59:04Z) - A Lightweight Clustering Framework for Unsupervised Semantic
Segmentation [28.907274978550493]
Unsupervised semantic segmentation aims to categorize each pixel in an image into a corresponding class without the use of annotated data.
We propose a lightweight clustering framework for unsupervised semantic segmentation.
Our framework achieves state-of-the-art results on PASCAL VOC and MS COCO datasets.
arXiv Detail & Related papers (2023-11-30T15:33:42Z) - SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation [91.91385816767057]
Open-vocabulary semantic segmentation strives to distinguish pixels into different semantic groups from an open set of categories.
We propose a simple encoder-decoder, named SED, for open-vocabulary semantic segmentation.
Our SED method achieves mIoU score of 31.6% on ADE20K with 150 categories at 82 millisecond ($ms$) per image on a single A6000.
arXiv Detail & Related papers (2023-11-27T05:00:38Z) - Hierarchical Dense Correlation Distillation for Few-Shot Segmentation [46.696051965252934]
Few-shot semantic segmentation (FSS) aims to form class-agnostic models segmenting unseen classes with only a handful of annotations.
We design Hierarchically Decoupled Matching Network (HDMNet) mining pixel-level support correlation based on the transformer architecture.
We propose a matching module to reduce train-set overfitting and introduce correlation distillation leveraging semantic correspondence from coarse resolution to boost fine-grained segmentation.
arXiv Detail & Related papers (2023-03-26T08:13:12Z) - DeepCut: Unsupervised Segmentation using Graph Neural Networks
Clustering [6.447863458841379]
This study introduces a lightweight Graph Neural Network (GNN) to replace classical clustering methods.
Unlike existing methods, our GNN takes both the pair-wise affinities between local image features and the raw features as input.
We demonstrate how classical clustering objectives can be formulated as self-supervised loss functions for training an image segmentation GNN.
arXiv Detail & Related papers (2022-12-12T12:31:46Z) - Open-world Semantic Segmentation via Contrasting and Clustering
Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any efforts on dense annotations.
Our method can directly segment objects of arbitrary categories, outperforming zero-shot segmentation methods that require data labeling on three benchmark datasets.
arXiv Detail & Related papers (2022-07-18T09:20:04Z) - Distilling Ensemble of Explanations for Weakly-Supervised Pre-Training
of Image Segmentation Models [54.49581189337848]
We propose a method to enable the end-to-end pre-training for image segmentation models based on classification datasets.
The proposed method leverages a weighted segmentation learning procedure to pre-train the segmentation network en masse.
Experiment results show that, with ImageNet accompanied by PSSL as the source dataset, the proposed end-to-end pre-training strategy successfully boosts the performance of various segmentation models.
arXiv Detail & Related papers (2022-07-04T13:02:32Z) - Unsupervised Hierarchical Semantic Segmentation with Multiview
Cosegmentation and Clustering Transformers [47.45830503277631]
Grouping naturally has levels of granularity, creating ambiguity in unsupervised segmentation.
We deliver the first data-driven unsupervised hierarchical semantic segmentation method called Hierarchical Segment Grouping (HSG)
arXiv Detail & Related papers (2022-04-25T04:40:46Z) - Deep Hierarchical Semantic Segmentation [76.40565872257709]
hierarchical semantic segmentation (HSS) aims at structured, pixel-wise description of visual observation in terms of a class hierarchy.
HSSN casts HSS as a pixel-wise multi-label classification task, only bringing minimal architecture change to current segmentation models.
With hierarchy-induced margin constraints, HSSN reshapes the pixel embedding space, so as to generate well-structured pixel representations.
arXiv Detail & Related papers (2022-03-27T15:47:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.