Leveraging Hidden Positives for Unsupervised Semantic Segmentation
- URL: http://arxiv.org/abs/2303.15014v1
- Date: Mon, 27 Mar 2023 08:57:28 GMT
- Title: Leveraging Hidden Positives for Unsupervised Semantic Segmentation
- Authors: Hyun Seok Seong, WonJun Moon, SuBeen Lee, Jae-Pil Heo
- Abstract summary: We leverage contrastive learning by excavating hidden positives to learn rich semantic relationships.
We introduce a gradient propagation strategy to learn semantic consistency between adjacent patches.
- Our proposed method achieves new state-of-the-art (SOTA) results on the COCO-Stuff, Cityscapes, and Potsdam-3 datasets.
- Score: 5.937673383513695
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The enormous demand for manpower to produce pixel-level annotations has
triggered the advent of unsupervised semantic segmentation. Although recent work
employing a vision transformer (ViT) backbone shows exceptional performance,
it still lacks task-specific training guidance and
local semantic consistency. To tackle these issues, we leverage contrastive
learning by excavating hidden positives to learn rich semantic relationships
and to ensure semantic consistency in local regions. Specifically, we first
discover two types of global hidden positives, task-agnostic and task-specific
ones for each anchor based on the feature similarities defined by a fixed
pre-trained backbone and a segmentation head-in-training, respectively. A
gradual increase in the contribution of the latter induces the model to capture
task-specific semantic features. In addition, we introduce a gradient
propagation strategy to learn semantic consistency between adjacent patches,
under the inherent premise that nearby patches are highly likely to possess the
same semantics. Specifically, we add the loss propagating to local hidden
positives, semantically similar nearby patches, in proportion to the predefined
similarity scores. With these training schemes, our proposed method achieves
new state-of-the-art (SOTA) results on the COCO-Stuff, Cityscapes, and Potsdam-3
datasets. Our code is available at: https://github.com/hynnsk/HP.
Related papers
- Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation [79.05949524349005]
We propose AuxSegNet+, a weakly supervised auxiliary learning framework to explore the rich information from saliency maps.
We also propose a cross-task affinity learning mechanism to learn pixel-level affinities from the saliency and segmentation feature maps.
arXiv Detail & Related papers (2024-03-02T10:03:21Z)
- Semantic Connectivity-Driven Pseudo-labeling for Cross-domain Segmentation [89.41179071022121]
Self-training is a prevailing approach in cross-domain semantic segmentation.
We propose a novel approach called Semantic Connectivity-driven pseudo-labeling.
This approach formulates pseudo-labels at the connectivity level and thus can facilitate learning structured and low-noise semantics.
arXiv Detail & Related papers (2023-12-11T12:29:51Z)
- SmooSeg: Smoothness Prior for Unsupervised Semantic Segmentation [27.367986520072147]
Unsupervised semantic segmentation is a challenging task that segments images into semantic groups without manual annotation.
We propose a novel approach called SmooSeg that harnesses self-supervised learning methods to model the closeness relationships among observations as smoothness signals.
Our SmooSeg significantly outperforms STEGO in terms of pixel accuracy on three datasets.
arXiv Detail & Related papers (2023-10-27T03:29:25Z)
- Semantics Meets Temporal Correspondence: Self-supervised Object-centric Learning in Videos [63.94040814459116]
Self-supervised methods have shown remarkable progress in learning high-level semantics and low-level temporal correspondence.
We propose a novel semantic-aware masked slot attention on top of the fused semantic features and correspondence maps.
We adopt semantic- and instance-level temporal consistency as self-supervision to encourage temporally coherent object-centric representations.
arXiv Detail & Related papers (2023-08-19T09:12:13Z)
- Location-Aware Self-Supervised Transformers [74.76585889813207]
We propose to pretrain networks for semantic segmentation by predicting the relative location of image parts.
We control the difficulty of the task by masking a subset of the reference patch features visible to those of the query.
Our experiments show that this location-aware pretraining leads to representations that transfer competitively to several challenging semantic segmentation benchmarks.
arXiv Detail & Related papers (2022-12-05T16:24:29Z)
- Regional Semantic Contrast and Aggregation for Weakly Supervised Semantic Segmentation [25.231470587575238]
We propose regional semantic contrast and aggregation (RCA) for learning semantic segmentation.
RCA is equipped with a regional memory bank to store massive, diverse object patterns appearing in training data.
RCA earns a strong capability of fine-grained semantic understanding, and eventually establishes new state-of-the-art results on two popular benchmarks.
arXiv Detail & Related papers (2022-03-17T23:29:03Z)
- Leveraging Auxiliary Tasks with Affinity Learning for Weakly Supervised Semantic Segmentation [88.49669148290306]
We propose a novel weakly supervised multi-task framework called AuxSegNet to leverage saliency detection and multi-label image classification as auxiliary tasks.
Inspired by their similar structured semantics, we also propose to learn a cross-task global pixel-level affinity map from the saliency and segmentation representations.
The learned cross-task affinity can be used to refine saliency predictions and propagate CAM maps to provide improved pseudo labels for both tasks.
arXiv Detail & Related papers (2021-07-25T11:39:58Z)
- Domain Adaptive Semantic Segmentation with Self-Supervised Depth Estimation [84.34227665232281]
Domain adaptation for semantic segmentation aims to improve the model performance in the presence of a distribution shift between source and target domain.
We leverage the guidance from self-supervised depth estimation, which is available on both domains, to bridge the domain gap.
We demonstrate the effectiveness of our proposed approach on the benchmark tasks SYNTHIA-to-Cityscapes and GTA-to-Cityscapes.
arXiv Detail & Related papers (2021-04-28T07:47:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.