HGFormer: Hierarchical Grouping Transformer for Domain Generalized
Semantic Segmentation
- URL: http://arxiv.org/abs/2305.13031v1
- Date: Mon, 22 May 2023 13:33:41 GMT
- Title: HGFormer: Hierarchical Grouping Transformer for Domain Generalized
Semantic Segmentation
- Authors: Jian Ding, Nan Xue, Gui-Song Xia, Bernt Schiele, Dengxin Dai
- Abstract summary: This work studies semantic segmentation under the domain generalization setting.
We propose a novel hierarchical grouping transformer (HGFormer) to explicitly group pixels to form part-level masks and then whole-level masks.
Experiments show that HGFormer yields more robust semantic segmentation results than per-pixel classification methods and flat grouping transformers.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current semantic segmentation models have achieved great success under the
independent and identically distributed (i.i.d.) condition. However, in
real-world applications, test data might come from a different domain than
training data. Therefore, it is important to improve model robustness against
domain differences. This work studies semantic segmentation under the domain
generalization setting, where a model is trained only on the source domain and
tested on the unseen target domain. Existing works show that Vision
Transformers are more robust than CNNs and attribute this robustness to the
visual grouping property of self-attention. In this work, we propose a novel
hierarchical grouping transformer (HGFormer) to explicitly group pixels to form
part-level masks and then whole-level masks. The masks at different scales aim
to segment out both the parts and the whole of each class. HGFormer combines mask
classification results at both scales for class label prediction. We assemble
multiple interesting cross-domain settings by using seven public semantic
segmentation datasets. Experiments show that HGFormer yields more robust
semantic segmentation results than per-pixel classification methods and flat
grouping transformers, and outperforms previous methods significantly. Code
will be available at https://github.com/dingjiansw101/HGFormer.
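The combination step described in the abstract, where mask classification results at the part and whole scales are fused into a per-pixel class prediction, can be sketched as follows. This is a minimal NumPy illustration, not the paper's exact formulation: the tensor shapes and the simple averaging fusion are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mask_classification(masks, class_logits):
    """Per-pixel class scores from soft masks and per-mask class logits.

    masks:        (N, H, W) soft pixel-to-mask assignments (sum to 1 over N)
    class_logits: (N, C) class logits for each mask
    returns:      (C, H, W) per-pixel class probabilities
    """
    probs = softmax(class_logits, axis=-1)          # (N, C)
    return np.einsum("nhw,nc->chw", masks, probs)   # (C, H, W)

def hierarchical_prediction(part_masks, part_logits, whole_masks, whole_logits):
    # Fuse mask-classification results from the part and whole scales.
    # Simple averaging here; the actual fusion scheme is a design choice.
    part_scores = mask_classification(part_masks, part_logits)
    whole_scores = mask_classification(whole_masks, whole_logits)
    return 0.5 * (part_scores + whole_scores)

# Toy example: 8 part masks, 3 whole masks, 4 classes, a 16x16 image
rng = np.random.default_rng(0)
H = W = 16
part_masks = softmax(rng.normal(size=(8, H, W)), axis=0)
whole_masks = softmax(rng.normal(size=(3, H, W)), axis=0)
scores = hierarchical_prediction(part_masks, rng.normal(size=(8, 4)),
                                 whole_masks, rng.normal(size=(3, 4)))
labels = scores.argmax(axis=0)  # per-pixel class prediction
```

Because each scale's soft masks sum to one over masks and the class probabilities sum to one over classes, the fused scores remain a valid per-pixel distribution.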
Related papers
- A Lightweight Clustering Framework for Unsupervised Semantic
Segmentation [28.907274978550493]
Unsupervised semantic segmentation aims to categorize each pixel in an image into a corresponding class without the use of annotated data.
We propose a lightweight clustering framework for unsupervised semantic segmentation.
Our framework achieves state-of-the-art results on PASCAL VOC and MS COCO datasets.
arXiv Detail & Related papers (2023-11-30T15:33:42Z) - High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation [17.804090651425955]
Image-level weakly-supervised segmentation (WSSS) reduces the usually vast data annotation cost by relying on surrogate segmentation masks during training.
Our work is based on two techniques for improving CAMs; importance sampling, which is a substitute for GAP, and the feature similarity loss.
We reformulate both techniques based on binomial posteriors of multiple independent binary problems.
This has two benefits: performance is improved, and the techniques become more general, resulting in an add-on method that can boost virtually any WSSS method.
arXiv Detail & Related papers (2023-04-05T17:43:57Z) - Mean Shift Mask Transformer for Unseen Object Instance Segmentation [12.371855276852195]
Mean Shift Mask Transformer (MSMFormer) is a new transformer architecture that simulates the von Mises-Fisher (vMF) mean shift clustering algorithm.
Our experiments show that MSMFormer achieves competitive performance compared to state-of-the-art methods for unseen object instance segmentation.
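The von Mises-Fisher mean shift clustering that MSMFormer simulates can be sketched on unit-normalized features as below. This is a plain NumPy sketch of the underlying clustering algorithm, not the transformer architecture itself; the kernel concentration `kappa` and the iteration count are illustrative assumptions.

```python
import numpy as np

def vmf_mean_shift(X, kappa=20.0, iters=10):
    """Mean shift on the unit hypersphere with a von Mises-Fisher kernel.

    X: (N, D) feature vectors; returns a unit-norm cluster mode per point.
    """
    Z = X / np.linalg.norm(X, axis=1, keepdims=True)
    for _ in range(iters):
        # vMF kernel weights from cosine similarity between all pairs
        W = np.exp(kappa * (Z @ Z.T))                  # (N, N)
        Z = W @ Z                                      # weighted mean shift
        Z /= np.linalg.norm(Z, axis=1, keepdims=True)  # back onto the sphere
    return Z

# Two well-separated directions on the unit circle
rng = np.random.default_rng(1)
a = rng.normal([1.0, 0.0], 0.05, size=(20, 2))
b = rng.normal([-1.0, 0.0], 0.05, size=(20, 2))
modes = vmf_mean_shift(np.vstack([a, b]))
```

Points drawn around the same direction converge to a shared mode, so clusters can be read off by comparing modes; for unseen object instances this means no fixed number of clusters has to be specified in advance.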
arXiv Detail & Related papers (2022-11-21T17:47:48Z) - Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From
Learned Pairwise Affinity [59.1823948436411]
We propose a novel approach for mask proposals, Generic Grouping Networks (GGNs).
Our approach combines a local measure of pixel affinity with instance-level mask supervision, producing a training regimen designed to make the model as generic as the data diversity allows.
arXiv Detail & Related papers (2022-04-12T22:37:49Z) - Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings [81.09026586111811]
We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting.
This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class.
The resulting merged semantic segmentation dataset of over 2 million images enables training a model that achieves performance equal to that of state-of-the-art supervised methods on 7 benchmark datasets.
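The core idea of replacing each class label with an embedding of a descriptive paragraph can be sketched as follows. The random-projection encoder is a stand-in for a real sentence encoder, and the cosine-similarity decision rule is an assumption for illustration.

```python
import numpy as np

def embed_descriptions(descriptions, dim=32, seed=0):
    # Stand-in for a sentence encoder: the approach replaces each class
    # label with an embedding of a short paragraph describing the class.
    rng = np.random.default_rng(seed)
    E = rng.normal(size=(len(descriptions), dim))
    return E / np.linalg.norm(E, axis=1, keepdims=True)

def segment_with_embeddings(pixel_features, class_embeddings):
    """Label each pixel by cosine similarity to class-description embeddings.

    pixel_features:   (H, W, D) unit-normalized per-pixel features
    class_embeddings: (C, D) unit-normalized class embeddings
    returns:          (H, W) predicted class indices
    """
    sims = np.einsum("hwd,cd->hwc", pixel_features, class_embeddings)
    return sims.argmax(axis=-1)

descriptions = ["a paved road surface", "a leafy tree", "a parked car"]
E = embed_descriptions(descriptions)
feats = E[np.zeros((4, 4), dtype=int)]   # toy image: every pixel matches class 0
labels = segment_with_embeddings(feats, E)
```

Because classes live in a shared embedding space rather than a fixed label set, datasets with different label vocabularies can be merged, and unseen classes can be queried zero-shot by embedding a new description.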
arXiv Detail & Related papers (2022-02-04T07:19:09Z) - SPCL: A New Framework for Domain Adaptive Semantic Segmentation via
Semantic Prototype-based Contrastive Learning [6.705297811617307]
Domain adaptation can help in transferring knowledge from a labeled source domain to an unlabeled target domain.
We propose a novel semantic prototype-based contrastive learning framework for fine-grained class alignment.
Our method is easy to implement and attains superior results compared to state-of-the-art approaches.
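Prototype-based contrastive learning of the kind SPCL describes can be sketched with an InfoNCE-style loss that pulls each pixel feature toward its class prototype. This is a generic illustration of the technique, not SPCL's exact objective; the temperature `tau` and the toy prototypes are assumptions.

```python
import numpy as np

def prototype_contrastive_loss(features, labels, prototypes, tau=0.1):
    """InfoNCE-style loss pulling each pixel feature to its class prototype.

    features:   (N, D) unit-normalized pixel features
    labels:     (N,) class index per feature
    prototypes: (C, D) unit-normalized class prototypes
    """
    logits = features @ prototypes.T / tau         # (N, C) similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Toy check: features aligned with their own prototype give a low loss,
# features assigned to the wrong prototype give a high loss.
prototypes = np.eye(3)
labels = np.array([0, 1, 2])
loss_match = prototype_contrastive_loss(prototypes[labels], labels, prototypes)
loss_wrong = prototype_contrastive_loss(prototypes[labels],
                                        np.array([1, 2, 0]), prototypes)
```

Minimizing such a loss on source and target features draws same-class pixels toward a shared prototype, which is one way to realize the fine-grained class alignment the abstract refers to.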
arXiv Detail & Related papers (2021-11-24T09:26:07Z) - Segmenter: Transformer for Semantic Segmentation [79.9887988699159]
We introduce Segmenter, a transformer model for semantic segmentation.
We build on the recent Vision Transformer (ViT) and extend it to semantic segmentation.
It outperforms the state of the art on the challenging ADE20K dataset and performs on par with it on Pascal Context and Cityscapes.
arXiv Detail & Related papers (2021-05-12T13:01:44Z) - Semantic Distribution-aware Contrastive Adaptation for Semantic
Segmentation [50.621269117524925]
Domain adaptive semantic segmentation refers to making predictions on a certain target domain with only annotations of a specific source domain.
We present a semantic distribution-aware contrastive adaptation algorithm that enables pixel-wise representation alignment.
We evaluate SDCA on multiple benchmarks, achieving considerable improvements over existing algorithms.
arXiv Detail & Related papers (2021-05-11T13:21:25Z) - Deep Domain-Adversarial Image Generation for Domain Generalisation [115.21519842245752]
Machine learning models typically suffer from the domain shift problem when trained on a source dataset and evaluated on a target dataset of different distribution.
To overcome this problem, domain generalisation (DG) methods aim to leverage data from multiple source domains so that a trained model can generalise to unseen domains.
We propose a novel DG approach based on Deep Domain-Adversarial Image Generation (DDAIG).
arXiv Detail & Related papers (2020-03-12T23:17:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.