Space Engage: Collaborative Space Supervision for Contrastive-based
Semi-Supervised Semantic Segmentation
- URL: http://arxiv.org/abs/2307.09755v1
- Date: Wed, 19 Jul 2023 05:39:15 GMT
- Title: Space Engage: Collaborative Space Supervision for Contrastive-based
Semi-Supervised Semantic Segmentation
- Authors: Changqi Wang, Haoyu Xie, Yuhui Yuan, Chong Fu, Xiangyu Yue
- Abstract summary: Semi-Supervised Semantic Segmentation (S4) aims to train a segmentation model with limited labeled images and a substantial volume of unlabeled images.
Prior contrastive-based methods introduce a pixel-wise contrastive learning approach in latent space (i.e., representation space) that aggregates representations to their prototypes in a fully supervised manner; we instead obtain supervision collaboratively from both logit space and representation space.
Results on two public benchmarks demonstrate the competitive performance of our method compared with state-of-the-art methods.
- Score: 11.136170940699163
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-Supervised Semantic Segmentation (S4) aims to train a segmentation model
with limited labeled images and a substantial volume of unlabeled images. To
improve the robustness of representations, powerful methods introduce a
pixel-wise contrastive learning approach in latent space (i.e., representation
space) that aggregates the representations to their prototypes in a fully
supervised manner. However, previous contrastive-based S4 methods merely rely
on the supervision from the model's output (logits) in logit space during
unlabeled training. In contrast, we utilize the outputs in both logit space and
representation space to obtain supervision in a collaborative way. The
supervision from the two spaces plays two roles: 1) it reduces the risk of
over-fitting to incorrect semantic information in the logits with the help of
representations; 2) it enhances knowledge exchange between the two spaces.
Furthermore, unlike previous approaches, we use the similarity between
representations and prototypes as a new indicator to tilt training toward
under-performing representations, achieving a more efficient contrastive
learning process. Results on two public benchmarks demonstrate the competitive
performance of our method compared with state-of-the-art methods.
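To make the two ideas above concrete, here is a minimal PyTorch sketch of (1) an agreement mask derived from pseudo-labels in logit space and representation space, and (2) a prototype-based pixel-wise contrastive loss that up-weights pixels whose representation is still far from its class prototype. The function names, the agreement heuristic, and the (1 - similarity) weighting are illustrative assumptions, not the paper's exact formulation; `prototypes` is assumed to be an L2-normalized (num_classes, dim) tensor of class centroids.

```python
import torch
import torch.nn.functional as F

def collaborative_pseudo_labels(logits, reps, prototypes):
    """Sketch: pseudo-labels from logit space and representation space,
    kept only where the two spaces agree (a hypothetical reading of
    'collaborative space supervision')."""
    # logits: (B, C, H, W); reps: (B, D, H, W); prototypes: (C, D), L2-normalized
    logit_label = logits.argmax(dim=1)                        # (B, H, W)
    reps_n = F.normalize(reps, dim=1)                         # unit-norm per pixel
    # cosine similarity of every pixel representation to every class prototype
    sim = torch.einsum('bdhw,cd->bchw', reps_n, prototypes)   # (B, C, H, W)
    rep_label = sim.argmax(dim=1)                             # (B, H, W)
    agree = logit_label.eq(rep_label)                         # bool agreement mask
    return logit_label, agree

def weighted_prototype_contrastive_loss(reps, prototypes, labels, mask, tau=0.1):
    """Sketch: prototype-based pixel-wise contrastive (InfoNCE-style) loss,
    weighted so that under-performing pixels (low similarity to their
    assigned prototype) contribute more to training."""
    reps_n = F.normalize(reps, dim=1)
    sim = torch.einsum('bdhw,cd->bchw', reps_n, prototypes)         # (B, C, H, W)
    sim_flat = sim.permute(0, 2, 3, 1).reshape(-1, sim.size(1))     # (B*H*W, C)
    labels_flat = labels.reshape(-1)                                # (B*H*W,)
    mask_flat = mask.reshape(-1).float()                            # (B*H*W,)
    # per-pixel cross-entropy pulling each representation toward its prototype
    ce = F.cross_entropy(sim_flat / tau, labels_flat, reduction='none')
    # hypothetical weighting: 1 - cosine similarity to the assigned prototype
    pos_sim = sim_flat.gather(1, labels_flat.unsqueeze(1)).squeeze(1)
    weight = (1.0 - pos_sim).clamp(min=0.0)
    return (weight * ce * mask_flat).sum() / mask_flat.sum().clamp(min=1.0)
```

On unlabeled images, one would call `collaborative_pseudo_labels` first and pass `logit_label` and `agree` as `labels` and `mask` to the loss, so that only pixels on which the two spaces agree drive the contrastive update; under the stated assumptions, this mirrors the collaborative supervision described in the abstract.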
Related papers
- Dual-Level Cross-Modal Contrastive Clustering [4.083185193413678]
We propose a novel image clustering framework, named Dual-level Cross-Modal Contrastive Clustering (DXMC).
External textual information is introduced to construct a semantic space, which is adopted to generate image-text pairs.
The image-text pairs are respectively sent to pre-trained image and text encoders to obtain image and text embeddings, which are subsequently fed into four well-designed networks.
arXiv Detail & Related papers (2024-09-06T18:49:45Z) - Self-Supervised Representation Learning with Spatial-Temporal Consistency for Sign Language Recognition [96.62264528407863]
We propose a self-supervised contrastive learning framework to excavate rich context via spatial-temporal consistency.
Inspired by the complementary property of motion and joint modalities, we first introduce first-order motion information into sign language modeling.
Our method is evaluated with extensive experiments on four public benchmarks, and achieves new state-of-the-art performance with a notable margin.
arXiv Detail & Related papers (2024-06-15T04:50:19Z) - LEAF: Unveiling Two Sides of the Same Coin in Semi-supervised Facial Expression Recognition [56.22672276092373]
Semi-supervised learning has emerged as a promising approach to tackle the challenge of label scarcity in facial expression recognition.
We propose a unified framework termed hierarchicaL dEcoupling And Fusing (LEAF) to coordinate expression-relevant representations and pseudo-labels for semi-supervised FER.
arXiv Detail & Related papers (2024-04-23T13:43:33Z) - Associating Spatially-Consistent Grouping with Text-supervised Semantic
Segmentation [117.36746226803993]
We introduce self-supervised spatially-consistent grouping with text-supervised semantic segmentation.
Considering the part-like grouped results, we further adapt a text-supervised model from image-level to region-level recognition.
Our method achieves 59.2% mIoU and 32.4% mIoU on the Pascal VOC and Pascal Context benchmarks, respectively.
arXiv Detail & Related papers (2023-04-03T16:24:39Z) - Exploring Feature Representation Learning for Semi-supervised Medical
Image Segmentation [30.608293915653558]
We present a two-stage framework for semi-supervised medical image segmentation.
The key insight is to explore feature representation learning with labeled and unlabeled (i.e., pseudo-labeled) images.
A stage-adaptive contrastive learning method is proposed, containing a boundary-aware contrastive loss.
We present an aleatoric uncertainty-aware method, namely AUA, to generate higher-quality pseudo labels.
arXiv Detail & Related papers (2021-11-22T05:06:12Z) - Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast [43.40192909920495]
Cross-view feature semantic consistency, intra-class compactness, and inter-class dispersion are explored.
We propose two novel pixel-to-prototype contrast regularization terms that are conducted across different views and within each single view of an image.
Our method can be seamlessly incorporated into existing WSSS models without any changes to the base network.
arXiv Detail & Related papers (2021-10-14T01:44:57Z) - Semi-supervised Semantic Segmentation with Directional Context-aware
Consistency [66.49995436833667]
We focus on the semi-supervised segmentation problem where only a small set of labeled data is provided with a much larger collection of totally unlabeled images.
A preferred high-level representation should capture the contextual information while not losing self-awareness.
We present the Directional Contrastive Loss (DC Loss) to accomplish the consistency in a pixel-to-pixel manner.
arXiv Detail & Related papers (2021-06-27T03:42:40Z) - Margin Preserving Self-paced Contrastive Learning Towards Domain
Adaptation for Medical Image Segmentation [51.93711960601973]
We propose a novel margin-preserving self-paced contrastive learning (MPSCL) model for cross-modal medical image segmentation.
With the guidance of progressively refined semantic prototypes, a novel margin-preserving contrastive loss is proposed to boost the discriminability of the embedded representation space.
Experiments on cross-modal cardiac segmentation tasks demonstrate that MPSCL significantly improves semantic segmentation performance.
arXiv Detail & Related papers (2021-03-15T15:23:10Z) - Spatially Consistent Representation Learning [12.120041613482558]
We propose a spatially consistent representation learning algorithm (SCRL) for multi-object and location-specific tasks.
We devise a novel self-supervised objective that tries to produce coherent spatial representations of a randomly cropped local region.
On various downstream localization tasks with benchmark datasets, the proposed SCRL shows significant performance improvements.
arXiv Detail & Related papers (2021-03-10T15:23:45Z) - Attribute-Induced Bias Eliminating for Transductive Zero-Shot Learning [144.94728981314717]
We propose a novel Attribute-Induced Bias Eliminating (AIBE) module for Transductive ZSL.
For the visual bias between the two domains, a Mean-Teacher module is first leveraged to bridge the visual representation discrepancy.
An attentional graph attribute embedding is proposed to reduce the semantic bias between seen and unseen categories.
Finally, for the semantic-visual bias in the unseen domain, an unseen semantic alignment constraint is designed to align visual and semantic space in an unsupervised manner.
arXiv Detail & Related papers (2020-05-31T02:08:01Z)