Class-level Multiple Distributions Representation are Necessary for
Semantic Segmentation
- URL: http://arxiv.org/abs/2303.08029v1
- Date: Tue, 14 Mar 2023 16:10:36 GMT
- Title: Class-level Multiple Distributions Representation are Necessary for
Semantic Segmentation
- Authors: Jianjian Yin, Zhichao Zheng, Yanhui Gu, Junsheng Zhou, Yi Chen
- Abstract summary: For the first time, we describe intra-class variations with multiple distributions.
We also propose a class multiple distributions consistency strategy to construct discriminative multiple distribution representations of embedded pixels.
Our approach can be seamlessly integrated into the popular segmentation frameworks FCN/PSPNet/CCNet and achieves 5.61%/1.75%/0.75% mIoU improvements on ADE20K, respectively.
- Score: 9.796689408601775
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing approaches focus on using class-level features to improve semantic
segmentation performance. How to characterize the relationships among intra-class
pixels and inter-class pixels is the key to extracting discriminative,
representative class-level features. In this paper, for the first time, we
describe intra-class variations with multiple distributions. We then propose
multiple distributions representation learning (MDRL) to
augment the pixel representations for semantic segmentation. Meanwhile, we
design a class multiple distributions consistency strategy to construct
discriminative multiple distribution representations of embedded pixels.
Moreover, we put forward a multiple distribution semantic aggregation module that
aggregates the multiple distributions of the corresponding class to enhance pixel
semantic information. Our approach can be seamlessly integrated into the popular
segmentation frameworks FCN/PSPNet/CCNet and achieves 5.61%/1.75%/0.75% mIoU
improvements on ADE20K, respectively. Extensive experiments on the Cityscapes and ADE20K
datasets show that our method brings significant performance
improvements.
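The improvements above are reported in mIoU (mean intersection-over-union). As a reading aid, here is a minimal sketch of how that metric is commonly computed from flattened label lists. It is not taken from the paper: the function name `mean_iou` and the `ignore_index=255` default are assumptions for illustration (255 is a common "void" label convention, for example in Cityscapes), and real benchmark code may differ in details such as how absent classes are handled.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in the predictions or the ground truth.

    pred and target are flat sequences of integer class ids of equal length,
    e.g. all pixels of all evaluation images concatenated.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # pixels labelled as "ignore" do not count at all
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[t] += 1  # missed pixel of class t
            if 0 <= p < num_classes:
                union[p] += 1  # false positive for class p
    # Average only over classes with a non-empty union.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    # Per-class IoU: 1/2, 2/3, 1/1, so the mean is 13/18 (about 0.7222).
    print(round(mean_iou(pred, target, num_classes=3), 4))
```

Under this definition, a gain of 5.61% mIoU means the class-averaged IoU rose by 5.61 percentage points, so gains on rare classes count as much as gains on frequent ones.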
Related papers
- Relation-Aware Distribution Representation Network for Person Clustering
with Multiple Modalities [17.569843539515734]
Person clustering with multi-modal clues, including faces, bodies, and voices, is critical for various tasks.
We propose a Relation-Aware Distribution representation Network (RAD-Net) to generate a distribution representation for multi-modal clues.
Our method achieves substantial improvements of +6% and +8.2% in F-score on the Video Person-Clustering dataset.
arXiv Detail & Related papers (2023-08-01T15:04:56Z)
- MCIBI++: Soft Mining Contextual Information Beyond Image for Semantic
Segmentation [29.458735435545048]
We propose a novel soft mining contextual information beyond image paradigm named MCIBI++.
We generate a class probability distribution for each pixel representation and conduct the dataset-level context aggregation.
In the inference phase, we additionally design a coarse-to-fine iterative inference strategy to further boost the segmentation results.
arXiv Detail & Related papers (2022-09-09T18:03:52Z)
- Beyond the Prototype: Divide-and-conquer Proxies for Few-shot
Segmentation [63.910211095033596]
Few-shot segmentation aims to segment unseen-class objects given only a handful of densely labeled samples.
We propose a simple yet versatile framework in the spirit of divide-and-conquer.
Our proposed approach, named divide-and-conquer proxies (DCP), allows for the development of appropriate and reliable information.
arXiv Detail & Related papers (2022-04-21T06:21:14Z)
- Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings [81.09026586111811]
We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting.
This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class.
The resulting merged semantic segmentation dataset of over 2 million images enables training a model whose performance matches that of state-of-the-art supervised methods on 7 benchmark datasets.
arXiv Detail & Related papers (2022-02-04T07:19:09Z)
- Dual Prototypical Contrastive Learning for Few-shot Semantic
Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task.
The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in the prototype feature space.
We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
arXiv Detail & Related papers (2021-11-09T08:14:50Z)
- Mining Contextual Information Beyond Image for Semantic Segmentation [37.783233906684444]
The paper studies the context aggregation problem in semantic image segmentation.
It proposes to mine the contextual information beyond individual images to further augment the pixel representations.
The proposed method could be effortlessly incorporated into existing segmentation frameworks.
arXiv Detail & Related papers (2021-08-26T14:34:23Z)
- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
To better model the relationships among images and classes from different datasets, we extend the pixel-level embeddings via cross-dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
- Semantic Distribution-aware Contrastive Adaptation for Semantic
Segmentation [50.621269117524925]
Domain adaptive semantic segmentation refers to making predictions on a certain target domain with only annotations of a specific source domain.
We present a semantic distribution-aware contrastive adaptation algorithm that enables pixel-wise representation alignment.
We evaluate SDCA on multiple benchmarks, achieving considerable improvements over existing algorithms.
arXiv Detail & Related papers (2021-05-11T13:21:25Z)
- Part-aware Prototype Network for Few-shot Semantic Segmentation [50.581647306020095]
We propose a novel few-shot semantic segmentation framework based on the prototype representation.
Our key idea is to decompose the holistic class representation into a set of part-aware prototypes.
We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes.
arXiv Detail & Related papers (2020-07-13T11:03:09Z)