APC: Adaptive Patch Contrast for Weakly Supervised Semantic Segmentation
- URL: http://arxiv.org/abs/2407.10649v1
- Date: Mon, 15 Jul 2024 12:10:05 GMT
- Title: APC: Adaptive Patch Contrast for Weakly Supervised Semantic Segmentation
- Authors: Wangyu Wu, Tianhong Dai, Zhenhong Chen, Xiaowei Huang, Fei Ma, Jimin Xiao
- Abstract summary: Weakly Supervised Semantic Segmentation (WSSS) using only image-level labels has gained significant attention due to its cost-effectiveness.
Recent methods based on Vision Transformers (ViT) have demonstrated superior capabilities in generating reliable pseudo-labels.
We introduce a novel ViT-based WSSS method named Adaptive Patch Contrast (APC) that significantly enhances patch embedding learning.
- Score: 22.808117374130198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weakly Supervised Semantic Segmentation (WSSS) using only image-level labels has gained significant attention due to its cost-effectiveness. The typical framework uses image-level labels as training data to generate pixel-level pseudo-labels, which are then refined. Recently, methods based on Vision Transformers (ViT) have demonstrated superior capabilities in generating reliable pseudo-labels, particularly in recognizing complete object regions, compared to CNN methods. However, current ViT-based approaches have limitations in their use of patch embeddings, which are prone to being dominated by certain abnormal patches, and many multi-stage methods are time-consuming to train and thus inefficient. Therefore, in this paper, we introduce a novel ViT-based WSSS method named Adaptive Patch Contrast (APC) that significantly enhances patch embedding learning for improved segmentation effectiveness. APC utilizes an Adaptive-K Pooling (AKP) layer to address the limitations of previous max pooling selection methods. Additionally, we propose Patch Contrastive Learning (PCL) to enhance patch embeddings, thereby further improving the final results. Furthermore, we improve upon the existing multi-stage training framework without CAM by transforming it into an end-to-end single-stage training approach, thereby enhancing training efficiency. The experimental results show that our approach is effective and efficient, outperforming other state-of-the-art WSSS methods on the PASCAL VOC 2012 and MS COCO 2014 datasets within a shorter training duration.
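The abstract names two components, Adaptive-K Pooling and Patch Contrastive Learning, without giving their formulas. The following is only a rough sketch of the general idea, not the authors' implementation. It assumes that per-patch class scores lie in [0, 1], that "adaptive K" means averaging every patch whose score reaches a fraction of the top score (so K varies per class and image), and that the contrastive term uses cosine similarity between patch embeddings. The names `adaptive_k_pool`, `patch_contrast_loss`, and the `ratio` parameter are hypothetical.

```python
import math


def max_pool(patch_scores):
    """Baseline: the image-level score of a class is its single best patch."""
    num_classes = len(patch_scores[0])
    return [max(p[c] for p in patch_scores) for c in range(num_classes)]


def adaptive_k_pool(patch_scores, ratio=0.5):
    """Hypothetical adaptive-K rule (an assumption, not the paper's formula).

    For each class, average all patches scoring at least ratio * top score.
    Assumes scores in [0, 1] and 0 < ratio <= 1, so the top patch is always
    selected. Returns the pooled scores and the K chosen for each class.
    """
    num_classes = len(patch_scores[0])
    pooled, ks = [], []
    for c in range(num_classes):
        column = [p[c] for p in patch_scores]
        threshold = ratio * max(column)
        selected = [s for s in column if s >= threshold]
        pooled.append(sum(selected) / len(selected))
        ks.append(len(selected))
    return pooled, ks


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def patch_contrast_loss(embeddings, labels):
    """Illustrative contrastive term over all patch pairs.

    Same-label pairs are penalised by (1 - cos); different-label pairs are
    penalised by max(cos, 0). The two means are added.
    """
    pos, neg = [], []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            sim = cosine(embeddings[i], embeddings[j])
            if labels[i] == labels[j]:
                pos.append(1.0 - sim)
            else:
                neg.append(max(sim, 0.0))
    pos_term = sum(pos) / len(pos) if pos else 0.0
    neg_term = sum(neg) / len(neg) if neg else 0.0
    return pos_term + neg_term


if __name__ == "__main__":
    # Four patches, two classes (made-up scores for illustration only).
    scores = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.7], [0.2, 0.1]]
    print(max_pool(scores))
    print(adaptive_k_pool(scores, ratio=0.5))
    print(patch_contrast_loss([[1, 0], [1, 0], [0, 1]], [0, 0, 1]))
```

In this sketch, pooling over several high-scoring patches makes the image-level score less sensitive to one outlier patch than max pooling, which is the kind of limitation the abstract attributes to max pooling selection. The threshold rule and the loss form are illustrative choices and may differ from what the paper uses.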
Related papers
- SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training [68.7896349660824]
We present an in-depth analysis of the progressive overfitting problem through the lens of sequential fine-tuning (Seq FT).
Considering that the overly fast representation learning and the biased classification layer constitute this particular problem, we introduce the advanced Slow Learner with Classifier Alignment (SLCA++) framework.
Our approach involves a Slow Learner to selectively reduce the learning rate of backbone parameters, and a Classifier Alignment to align the disjoint classification layers in a post-hoc fashion.
arXiv Detail & Related papers (2024-08-15T17:50:07Z)
- DAWN: Domain-Adaptive Weakly Supervised Nuclei Segmentation via Cross-Task Interactions [17.68742587885609]
Current weakly supervised nuclei segmentation approaches follow a two-stage pseudo-label generation and network training process.
This paper introduces a novel domain-adaptive weakly supervised nuclei segmentation framework using cross-task interaction strategies.
To validate the effectiveness of our proposed method, we conduct extensive comparative and ablation experiments on six datasets.
arXiv Detail & Related papers (2024-04-23T12:01:21Z)
- Top-K Pooling with Patch Contrastive Learning for Weakly-Supervised Semantic Segmentation [25.628382644404066]
We introduce a novel ViT-based WSSS method named top-K pooling with patch contrastive learning (TKP-PCL).
A patch contrastive error (PCE) is also proposed to enhance the patch embeddings to further improve the final results.
Our approach is very efficient and outperforms other state-of-the-art WSSS methods on the PASCAL VOC 2012 dataset.
arXiv Detail & Related papers (2023-10-15T13:19:59Z)
- Boosting Weakly-Supervised Image Segmentation via Representation, Transform, and Compensator [26.991314511807907]
Multi-stage training procedures have been widely used in existing WSIS approaches to obtain high-quality pseudo-masks as ground-truth.
We propose a novel single-stage WSIS method that utilizes a siamese network with contrastive learning to improve the quality of class activation maps (CAMs) and achieve a self-refinement process.
Our method significantly outperforms other state-of-the-art methods, achieving 67.2% and 68.76% mIoU on the PASCAL VOC 2012 dataset.
arXiv Detail & Related papers (2023-09-02T09:07:25Z)
- Contextual Squeeze-and-Excitation for Efficient Few-Shot Image Classification [57.36281142038042]
We present a new adaptive block called Contextual Squeeze-and-Excitation (CaSE) that adjusts a pretrained neural network on a new task to significantly improve performance.
We also present a new training protocol based on Coordinate-Descent called UpperCaSE that exploits meta-trained CaSE blocks and fine-tuning routines for efficient adaptation.
arXiv Detail & Related papers (2022-06-20T15:25:08Z)
- Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
- Adaptive Affinity Loss and Erroneous Pseudo-Label Refinement for Weakly Supervised Semantic Segmentation [48.294903659573585]
In this paper, we propose to embed affinity learning of multi-stage approaches in a single-stage model.
A deep neural network is used to deliver comprehensive semantic information in the training phase.
Experiments are conducted on the PASCAL VOC 2012 dataset to evaluate the effectiveness of our proposed approach.
arXiv Detail & Related papers (2021-08-03T07:48:33Z)
- A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation [74.8791451327354]
We propose a simple yet effective semi-supervised learning framework for semantic segmentation.
A set of simple design and training techniques can collectively improve the performance of semi-supervised semantic segmentation significantly.
Our method achieves state-of-the-art results in the semi-supervised settings on the Cityscapes and Pascal VOC datasets.
arXiv Detail & Related papers (2021-04-15T06:01:39Z)
- Selective Pseudo-Labeling with Reinforcement Learning for Semi-Supervised Domain Adaptation [116.48885692054724]
We propose a reinforcement learning based selective pseudo-labeling method for semi-supervised domain adaptation.
We develop a deep Q-learning model to select both accurate and representative pseudo-labeled instances.
Our proposed method is evaluated on several benchmark datasets for SSDA, and demonstrates superior performance to all the comparison methods.
arXiv Detail & Related papers (2020-12-07T03:37:38Z)