Related papers: Weakly Supervised Co-training with Swapping Assignments for Semantic Segmentation

Weakly Supervised Co-training with Swapping Assignments for Semantic Segmentation

URL: http://arxiv.org/abs/2402.17891v2
Date: Tue, 9 Jul 2024 18:54:17 GMT
Title: Weakly Supervised Co-training with Swapping Assignments for Semantic Segmentation
Authors: Xinyu Yang, Hossein Rahmani, Sue Black, Bryan M. Williams,
Abstract summary: Class activation maps (CAMs) are commonly employed in weakly supervised semantic segmentation (WSSS) to produce pseudo-labels. We propose an end-to-end WSSS model incorporating guided CAMs, wherein our segmentation model is trained while concurrently optimizing CAMs online. CoSA is the first single-stage approach to outperform all existing multi-stage methods including those with additional supervision.
Score: 21.345548821276097
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Class activation maps (CAMs) are commonly employed in weakly supervised semantic segmentation (WSSS) to produce pseudo-labels. Due to incomplete or excessive class activation, existing studies often resort to offline CAM refinement, introducing additional stages or proposing offline modules. This can cause optimization difficulties for single-stage methods and limit generalizability. In this study, we aim to reduce the observed CAM inconsistency and error to mitigate reliance on refinement processes. We propose an end-to-end WSSS model incorporating guided CAMs, wherein our segmentation model is trained while concurrently optimizing CAMs online. Our method, Co-training with Swapping Assignments (CoSA), leverages a dual-stream framework, where one sub-network learns from the swapped assignments generated by the other. We introduce three techniques: i) soft perplexity-based regularization to penalize uncertain regions; ii) a threshold-searching approach to dynamically revise the confidence threshold; and iii) contrastive separation to address the coexistence problem. CoSA demonstrates exceptional performance, achieving mIoU of 76.2\% and 51.0\% on VOC and COCO validation datasets, respectively, surpassing existing baselines by a substantial margin. Notably, CoSA is the first single-stage approach to outperform all existing multi-stage methods including those with additional supervision. Code is avilable at \url{https://github.com/youshyee/CoSA}.

Related papers

CKAA: Cross-subspace Knowledge Alignment and Aggregation for Robust Continual Learning [80.18781219542016]
Continual Learning (CL) empowers AI models to continuously learn from sequential task streams.<n>Recent parameter-efficient fine-tuning (PEFT)-based CL methods have garnered increasing attention due to their superior performance.<n>We propose Cross-subspace Knowledge Alignment and Aggregation (CKAA) to enhance robustness against misleading task-ids.
arXiv Detail & Related papers (2025-07-13T03:11:35Z)
Train with Perturbation, Infer after Merging: A Two-Stage Framework for Continual Learning [59.6658995479243]
We propose texttext-Perturb-and-Merge (P&M), a novel continual learning framework that integrates model merging into the CL paradigm to avoid forgetting.<n>Through theoretical analysis, we minimize the total loss increase across all tasks and derive an analytical solution for the optimal merging coefficient.<n>Our proposed approach achieves state-of-the-art performance on several continual learning benchmark datasets.
arXiv Detail & Related papers (2025-05-28T14:14:19Z)
Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach [7.012760526318993]
Weakly-Supervised Semantic (WSSS) offers a cost-efficient workaround to extensive labeling. Existing WSSS methods have difficulties in learning the boundaries of objects leading to poor segmentation results. We propose a novel and effective framework that addresses these issues by leveraging visual foundation models inside the bounding box.
arXiv Detail & Related papers (2024-05-10T16:42:25Z)
Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel SSC framework - Adrial Modality Modulation Network (AMMNet) AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition. Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z)
CSL: Class-Agnostic Structure-Constrained Learning for Segmentation Including the Unseen [62.72636247006293]
Class-Agnostic Structure-Constrained Learning is a plug-in framework that can integrate with existing methods. We propose soft assignment and mask split methodologies that enhance OOD object segmentation. Empirical evaluations demonstrate CSL's prowess in boosting the performance of existing algorithms spanning OOD segmentation, ZS3, and DA segmentation.
arXiv Detail & Related papers (2023-12-09T11:06:18Z)
Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation [34.97083511196799]
Semi-supervised semantic segmentation (SSS) has recently gained increasing research interest. Current methods often suffer from the confirmation bias from the pseudo-labelling process. We propose a new conflict-based cross-view consistency (CCVC) method based on a two-branch co-training framework.
arXiv Detail & Related papers (2023-03-02T14:02:16Z)
USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality. Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
Method for unsupervised meta-learning, CACTUs, is a clustering-based approach with pseudo-labeling. This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data. We prove that the core reason for this is lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains. We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
Bottom-Up Temporal Action Localization with Mutual Regularization [107.39785866001868]
State-of-the-art solutions for TAL involve evaluating the frame-level probabilities of three action-indicating phases. We introduce two regularization terms to mutually regularize the learning procedure. Experiments are performed on two popular TAL datasets, THUMOS14 and ActivityNet1.3.
arXiv Detail & Related papers (2020-02-18T03:59:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.