Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers
- URL: http://arxiv.org/abs/2203.02664v1
- Date: Sat, 5 Mar 2022 06:07:17 GMT
- Title: Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers
- Authors: Lixiang Ru and Yibing Zhan and Baosheng Yu and Bo Du
- Abstract summary: We introduce Transformers, which naturally integrate global information, to generate more integral initial pseudo labels for end-to-end WSSS.
Motivated by the inherent consistency between the self-attention in Transformers and the semantic affinity, we propose an Affinity from Attention (AFA) module.
We also devise a Pixel-Adaptive Refinement module that incorporates low-level image appearance information to refine the pseudo labels.
- Score: 44.757309147148035
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Weakly-supervised semantic segmentation (WSSS) with image-level labels is an
important and challenging task. Due to the high training efficiency, end-to-end
solutions for WSSS have received increasing attention from the community.
However, current methods are mainly based on convolutional neural networks and
fail to explore the global information properly, thus usually resulting in
incomplete object regions. In this paper, to address the aforementioned
problem, we introduce Transformers, which naturally integrate global
information, to generate more integral initial pseudo labels for end-to-end
WSSS. Motivated by the inherent consistency between the self-attention in
Transformers and the semantic affinity, we propose an Affinity from Attention
(AFA) module to learn semantic affinity from the multi-head self-attention
(MHSA) in Transformers. The learned affinity is then leveraged to refine the
initial pseudo labels for segmentation. In addition, to efficiently derive
reliable affinity labels for supervising AFA and ensure the local consistency
of pseudo labels, we devise a Pixel-Adaptive Refinement module that
incorporates low-level image appearance information to refine the pseudo
labels. We perform extensive experiments and our method achieves 66.0% and
38.9% mIoU on the PASCAL VOC 2012 and MS COCO 2014 datasets, respectively,
significantly outperforming recent end-to-end methods and several multi-stage
competitors. Code is available at https://github.com/rulixiang/afa.
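Since the abstract describes the mechanism only at a high level, a minimal sketch may help make it concrete. The following hypothetical PyTorch snippet (illustrative only, not the authors' implementation; see the repository above for that) symmetrizes multi-head self-attention scores into a pairwise affinity matrix and then uses the row-normalized affinity to diffuse initial per-token class scores, a random-walk-style refinement in the spirit of AFA. The tensor shapes, sigmoid squashing, and head-averaging are assumptions.
```python
# Hypothetical sketch of "affinity from attention" (not the authors' code).
import torch

def affinity_from_attention(attn: torch.Tensor) -> torch.Tensor:
    """attn: (B, heads, N, N) raw self-attention scores over N patch tokens.
    Returns a symmetric affinity matrix of shape (B, N, N) with values in (0, 1)."""
    # Self-attention is not symmetric, but semantic affinity should be,
    # so average the scores with their transpose before squashing.
    sym = (attn + attn.transpose(-1, -2)) / 2.0      # (B, heads, N, N)
    return torch.sigmoid(sym.mean(dim=1))            # fuse heads -> (B, N, N)

def propagate_pseudo_labels(affinity: torch.Tensor,
                            scores: torch.Tensor,
                            n_iters: int = 2) -> torch.Tensor:
    """affinity: (B, N, N); scores: (B, N, C) initial per-token class scores
    (e.g., from CAMs). Returns refined scores of the same shape."""
    # Row-normalize the affinity so it acts as a random-walk transition matrix.
    trans = affinity / affinity.sum(dim=-1, keepdim=True).clamp(min=1e-6)
    out = scores
    for _ in range(n_iters):
        out = torch.bmm(trans, out)                  # diffuse scores along affinity
    return out
```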
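The pixel-adaptive refinement idea can be sketched in the same hedged spirit: smooth the pseudo-label probabilities with kernel weights derived from low-level RGB similarity, so that labels become locally consistent with image appearance. The kernel size, Gaussian bandwidth sigma, and iteration count below are illustrative choices, not the authors' exact PAR design.
```python
# Hedged sketch of pixel-adaptive refinement driven by low-level appearance
# (illustrative parameters; not the authors' exact PAR module).
import torch
import torch.nn.functional as F

def pixel_adaptive_refine(prob: torch.Tensor,
                          image: torch.Tensor,
                          kernel: int = 3,
                          sigma: float = 0.1,
                          n_iters: int = 10) -> torch.Tensor:
    """prob: (B, C, H, W) class probabilities; image: (B, 3, H, W) in [0, 1]."""
    pad = kernel // 2
    # Unfold local RGB patches and measure color distance to each center pixel.
    patches = F.unfold(image, kernel, padding=pad)             # (B, 3*k*k, H*W)
    B, _, HW = patches.shape
    patches = patches.view(B, 3, kernel * kernel, HW)
    center = image.view(B, 3, 1, HW)
    w = torch.exp(-((patches - center) ** 2).sum(1) / (2 * sigma ** 2))
    w = w / w.sum(dim=1, keepdim=True)                         # (B, k*k, H*W)
    C, H, W = prob.shape[1:]
    out = prob
    for _ in range(n_iters):
        p = F.unfold(out, kernel, padding=pad).view(B, C, kernel * kernel, HW)
        out = (p * w.unsqueeze(1)).sum(dim=2).view(B, C, H, W)  # appearance-weighted average
    return out
```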
Related papers
- FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients [19.3885479917635]
Federated learning (FL) is a distributed learning paradigm that facilitates collaborative training of a shared global model across devices.
We propose FedAnchor, an innovative FSSL method that introduces a unique double-head structure, called anchor head, paired with the classification head trained exclusively on labeled anchor data on the server.
Our approach mitigates the confirmation bias and overfitting issues associated with pseudo-labeling techniques based on high-confidence model prediction samples.
arXiv Detail & Related papers (2024-02-15T18:48:21Z)
- Progressive Feature Self-reinforcement for Weakly Supervised Semantic Segmentation [55.69128107473125]
We propose a single-stage approach for Weakly Supervised Semantic Segmentation (WSSS) with image-level labels.
We adaptively partition the image content into deterministic regions (e.g., confident foreground and background) and uncertain regions (e.g., object boundaries and misclassified categories) for separate processing.
Building upon this, we introduce a complementary self-enhancement method that constrains the semantic consistency between these confident regions and an augmented image with the same class labels.
arXiv Detail & Related papers (2023-12-14T13:21:52Z)
- Semantic Connectivity-Driven Pseudo-labeling for Cross-domain Segmentation [89.41179071022121]
Self-training is a prevailing approach in cross-domain semantic segmentation.
We propose a novel approach called Semantic Connectivity-driven pseudo-labeling.
This approach formulates pseudo-labels at the connectivity level and thus can facilitate learning structured and low-noise semantics.
arXiv Detail & Related papers (2023-12-11T12:29:51Z)
- STRAP: Structured Object Affordance Segmentation with Point Supervision [20.56373848741831]
We study affordance segmentation with point supervision, a setting that inherits an unexplored dual affinity: spatial affinity and label affinity.
We devise a dense prediction network that enhances label relations by effectively densifying labels in a new domain.
In experiments, we benchmark our method on the challenging CAD120 dataset, showing significant performance gains over prior methods.
arXiv Detail & Related papers (2023-04-17T17:59:49Z)
- Self Correspondence Distillation for End-to-End Weakly-Supervised Semantic Segmentation [13.623713806739271]
We propose a novel Self Correspondence Distillation (SCD) method to refine pseudo-labels without introducing external supervision.
In addition, we design a Variation-aware Refine Module to enhance the local consistency of pseudo-labels.
Our method significantly outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2023-02-27T13:46:40Z)
- Continual Coarse-to-Fine Domain Adaptation in Semantic Segmentation [22.366638308792734]
Deep neural networks are typically trained in a single shot for a specific task and data distribution.
In real-world settings, however, both the task and the domain of application can change.
We introduce the novel task of coarse-to-fine learning of semantic segmentation architectures in the presence of domain shift.
arXiv Detail & Related papers (2022-01-18T13:31:19Z)
- GETAM: Gradient-weighted Element-wise Transformer Attention Map for Weakly-Supervised Semantic Segmentation [29.184608129848105]
A Class Activation Map (CAM) is usually generated to provide pixel-level pseudo labels.
Transformer based methods are highly effective at exploring global context with long range dependency modeling.
GETAM shows fine scale activation for all feature map elements, revealing different parts of the object across transformer layers.
arXiv Detail & Related papers (2021-12-06T08:02:32Z)
- Semi-Supervised Domain Adaptation with Prototypical Alignment and Consistency Learning [86.6929930921905]
This paper studies how much labeling a few target samples (landmarks) can further help address domain shift.
To explore the full potential of landmarks, we incorporate a prototypical alignment (PA) module that calculates a target prototype for each class from the landmarks.
Specifically, we severely perturb the labeled images, making PA non-trivial to achieve and thus promoting model generalizability.
arXiv Detail & Related papers (2021-04-19T08:46:08Z)
- Dual-Refinement: Joint Label and Feature Refinement for Unsupervised Domain Adaptive Person Re-Identification [51.98150752331922]
Unsupervised domain adaptive (UDA) person re-identification (re-ID) is a challenging task due to the absence of labels for the target-domain data.
We propose a novel approach, called Dual-Refinement, that jointly refines pseudo labels at the off-line clustering phase and features at the on-line training phase.
Our method outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-12-26T07:35:35Z)
- PseudoSeg: Designing Pseudo Labels for Semantic Segmentation [78.35515004654553]
We present a re-design of pseudo-labeling to generate structured pseudo labels for training with unlabeled or weakly-labeled data.
We demonstrate the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.
arXiv Detail & Related papers (2020-10-19T17:59:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.