Boosting Semantic Segmentation from the Perspective of Explicit Class
Embeddings
- URL: http://arxiv.org/abs/2308.12894v1
- Date: Thu, 24 Aug 2023 16:16:10 GMT
- Title: Boosting Semantic Segmentation from the Perspective of Explicit Class
Embeddings
- Authors: Yuhe Liu, Chuanjian Liu, Kai Han, Quan Tang, Zengchang Qin
- Abstract summary: We explore the mechanism of class embeddings and have an insight that more explicit and meaningful class embeddings can be generated based on class masks purposely.
We propose ECENet, a new segmentation paradigm, in which class embeddings are obtained and enhanced explicitly during interacting with multi-stage image features.
Our ECENet outperforms its counterparts on the ADE20K dataset with much less computational cost and achieves new state-of-the-art results on PASCAL-Context dataset.
- Score: 19.997929884477628
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation is a computer vision task that associates a label with
each pixel in an image. Modern approaches tend to introduce class embeddings
into semantic segmentation for deeply utilizing category semantics, and regard
supervised class masks as final predictions. In this paper, we explore the
mechanism of class embeddings and have an insight that more explicit and
meaningful class embeddings can be generated based on class masks purposely.
Following this observation, we propose ECENet, a new segmentation paradigm, in
which class embeddings are obtained and enhanced explicitly during interacting
with multi-stage image features. Based on this, we revisit the traditional
decoding process and explore inverted information flow between segmentation
masks and class embeddings. Furthermore, to ensure the discriminability and
informativity of features from backbone, we propose a Feature Reconstruction
module, which combines intrinsic and diverse branches together to ensure the
concurrence of diversity and redundancy in features. Experiments show that our
ECENet outperforms its counterparts on the ADE20K dataset with much less
computational cost and achieves new state-of-the-art results on PASCAL-Context
dataset. The code will be released at https://gitee.com/mindspore/models and
https://github.com/Carol-lyh/ECENet.
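The abstract's central idea, deriving class embeddings from class masks and then using those embeddings to produce masks again, can be illustrated with a small sketch. This is not the ECENet architecture: it only shows mask-weighted pooling of pixel features followed by a dot-product re-prediction, assuming a single feature stage and no learned parameters. All function names (`class_embeddings_from_masks`, `masks_from_embeddings`, `predict_labels`) are hypothetical and chosen for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def class_embeddings_from_masks(mask_logits, features):
    """Pool pixel features into one embedding per class (illustrative only).

    mask_logits: K rows of N per-pixel logits (one row per class).
    features:    N rows of C-dimensional pixel features.
    Returns K embeddings of dimension C, each a mask-weighted average.
    """
    dim = len(features[0])
    embeddings = []
    for logits in mask_logits:
        weights = softmax(logits)  # spatial softmax over the N pixels
        emb = [sum(w * f[c] for w, f in zip(weights, features)) for c in range(dim)]
        embeddings.append(emb)
    return embeddings


def masks_from_embeddings(embeddings, features):
    """Re-predict K x N mask logits as embedding/pixel dot products."""
    return [[sum(e * x for e, x in zip(emb, f)) for f in features] for emb in embeddings]


def predict_labels(mask_logits):
    """Assign each pixel the class with the highest logit."""
    num_pixels = len(mask_logits[0])
    return [
        max(range(len(mask_logits)), key=lambda k: mask_logits[k][p])
        for p in range(num_pixels)
    ]


# Toy input (assumed data): 4 pixels with 2-dimensional features, 2 classes.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
coarse_masks = [[4.0, 4.0, -4.0, -4.0], [-4.0, -4.0, 4.0, 4.0]]

embeddings = class_embeddings_from_masks(coarse_masks, features)
refined_masks = masks_from_embeddings(embeddings, features)
print(predict_labels(refined_masks))  # prints [0, 0, 1, 1]
```

In this toy setting each class embedding ends up close to the feature shared by the pixels its mask selects, so the re-predicted masks agree with the coarse ones. The paper's method additionally uses multi-stage features and a Feature Reconstruction module, neither of which is modeled here.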
Related papers
- Unlocking the Multi-modal Potential of CLIP for Generalized Category Discovery [50.564146730579424]
We propose a Text Embedding Synthesizer (TES) to generate pseudo text embeddings for unlabelled samples.
Our method unlocks the multi-modal potentials of CLIP and outperforms the baseline methods by a large margin on all GCD benchmarks.
arXiv Detail & Related papers (2024-03-15T02:40:13Z)
- CoBra: Complementary Branch Fusing Class and Semantic Knowledge for Robust Weakly Supervised Semantic Segmentation [3.4248731707266264]
We propose a novel dual branch framework consisting of two distinct architectures which provide valuable complementary knowledge of class (from CNN) and semantic (from ViT) to each branch.
Our model, through CoBra, fuses CNN and ViT's complementary outputs to create robust pseudo masks that integrate both class and semantic information effectively.
arXiv Detail & Related papers (2024-02-05T12:33:37Z)
- DiffusePast: Diffusion-based Generative Replay for Class Incremental Semantic Segmentation [73.54038780856554]
Class Incremental Semantic Segmentation (CISS) extends the traditional segmentation task by incrementally learning newly added classes.
Previous work has introduced generative replay, which involves replaying old class samples generated from a pre-trained GAN.
We propose DiffusePast, a novel framework featuring a diffusion-based generative replay module that generates semantically accurate images with more reliable masks guided by different instructions.
arXiv Detail & Related papers (2023-08-02T13:13:18Z)
- RaSP: Relation-aware Semantic Prior for Weakly Supervised Incremental Segmentation [28.02204928717511]
We propose a weakly supervised approach to transfer objectness prior from the previously learned classes into the new ones.
We show how even a simple pairwise interaction between classes can significantly improve the segmentation mask quality of both old and new classes.
arXiv Detail & Related papers (2023-05-31T14:14:21Z)
- A Joint Framework Towards Class-aware and Class-agnostic Alignment for Few-shot Segmentation [11.47479526463185]
Few-shot segmentation aims to segment objects of unseen classes given only a few annotated support images.
Most existing methods simply stitch query features with independent support prototypes and segment the query image by feeding the mixed features to a decoder.
We propose a joint framework that combines more valuable class-aware and class-agnostic alignment guidance to facilitate the segmentation.
arXiv Detail & Related papers (2022-11-02T17:33:25Z)
- Weak-shot Semantic Segmentation via Dual Similarity Transfer [33.18870478560099]
We propose SimFormer, which performs dual similarity transfer upon MaskFormer.
Proposal segmentation allows proposal-pixel similarity transfer from base classes to novel classes.
We also learn pixel-pixel similarity from base classes and distill such class-agnostic semantic similarity to the semantic masks of novel classes.
arXiv Detail & Related papers (2022-10-05T13:54:34Z)
- Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any efforts on dense annotations.
Our method can directly segment objects of arbitrary categories, outperforming zero-shot segmentation methods that require data labeling on three benchmark datasets.
arXiv Detail & Related papers (2022-07-18T09:20:04Z)
- Segmenter: Transformer for Semantic Segmentation [79.9887988699159]
We introduce Segmenter, a transformer model for semantic segmentation.
We build on the recent Vision Transformer (ViT) and extend it to semantic segmentation.
It outperforms the state of the art on the challenging ADE20K dataset and performs on par on Pascal Context and Cityscapes.
arXiv Detail & Related papers (2021-05-12T13:01:44Z)
- Semantically Meaningful Class Prototype Learning for One-Shot Image Semantic Segmentation [58.96902899546075]
One-shot semantic image segmentation aims to segment the object regions for the novel class with only one annotated image.
Recent works adopt the episodic training strategy to mimic the expected situation at testing time.
We propose to leverage the multi-class label information during the episodic training. It will encourage the network to generate more semantically meaningful features for each category.
arXiv Detail & Related papers (2021-02-22T12:07:35Z)
- Exploring Cross-Image Pixel Contrast for Semantic Segmentation [130.22216825377618]
We propose a pixel-wise contrastive framework for semantic segmentation in the fully supervised setting.
The core idea is to enforce pixel embeddings belonging to a same semantic class to be more similar than embeddings from different classes.
Our method can be effortlessly incorporated into existing segmentation frameworks without extra overhead during testing.
arXiv Detail & Related papers (2021-01-28T11:35:32Z)