Weakly Supervised Semantic Segmentation via Alternative Self-Dual
Teaching
- URL: http://arxiv.org/abs/2112.09459v1
- Date: Fri, 17 Dec 2021 11:56:56 GMT
- Title: Weakly Supervised Semantic Segmentation via Alternative Self-Dual
Teaching
- Authors: Dingwen Zhang, Wenyuan Zeng, Guangyu Guo, Chaowei Fang, Lechao Cheng,
Junwei Han
- Abstract summary: This paper establishes a compact learning framework that embeds the classification and mask-refinement components into a unified deep model.
We propose a novel alternative self-dual teaching (ASDT) mechanism to encourage high-quality knowledge interaction.
- Score: 82.71578668091914
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current weakly supervised semantic segmentation (WSSS) frameworks
usually contain a separate mask-refinement model and a main semantic region
mining model. These approaches involve redundant feature extraction backbones
and biased learning objectives, making them computationally complex yet
sub-optimal for addressing the WSSS task. To solve this problem, this paper
establishes a compact learning framework that embeds the classification and
mask-refinement components into a unified deep model. With a shared feature
extraction backbone, our model facilitates knowledge sharing between the two
components while keeping the computational complexity low. To encourage
high-quality knowledge interaction, we propose a novel alternative self-dual
teaching (ASDT) mechanism. Unlike the conventional distillation strategy, the
knowledge of the two teacher branches in our model is alternately distilled to
the student branch via Pulse Width Modulation (PWM), which generates a PW
wave-like selection signal to guide the knowledge distillation process. In this
way, the student branch helps prevent the model from falling into local-minimum
solutions caused by the imperfect knowledge provided by either teacher branch.
Comprehensive experiments on the PASCAL VOC 2012 and COCO-Stuff 10K datasets
demonstrate the effectiveness of the proposed alternative self-dual teaching
mechanism as well as the new state-of-the-art performance of our approach.
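To make the PWM-driven alternation concrete, here is a minimal PyTorch-style sketch of the idea; it is not the authors' released implementation. The branch names (teacher_cls_logits for the classification-driven teacher, teacher_ref_logits for the mask-refinement teacher), the square-wave period and duty cycle, and the distillation temperature are illustrative assumptions.
```python
import torch
import torch.nn.functional as F


def pwm_signal(step: int, period: int = 100, duty: float = 0.5) -> float:
    """Square (PW-like) selection wave: 1.0 picks the classification teacher,
    0.0 picks the mask-refinement teacher."""
    return 1.0 if (step % period) < duty * period else 0.0


def asdt_distill_loss(student_logits: torch.Tensor,
                      teacher_cls_logits: torch.Tensor,
                      teacher_ref_logits: torch.Tensor,
                      step: int,
                      temperature: float = 2.0) -> torch.Tensor:
    """Alternately distill the student from one of the two teacher branches.

    All three logit tensors are assumed to share the same shape, e.g.
    (batch, classes, H, W) dense score maps from a shared backbone.
    """
    s = pwm_signal(step)
    # Teachers are detached so knowledge flows only teacher -> student.
    target = s * teacher_cls_logits.detach() + (1.0 - s) * teacher_ref_logits.detach()
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(target / temperature, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
```
Because the selection signal in this sketch is a hard 0/1 square wave, each training step distills from exactly one teacher branch, which is the mechanism the abstract describes for keeping the student from committing to either teacher's imperfect knowledge.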
Related papers
- DFMSD: Dual Feature Masking Stage-wise Knowledge Distillation for Object Detection [6.371066478190595]
A novel dual feature-masking heterogeneous distillation framework termed DFMSD is proposed for object detection.
A masking enhancement strategy is combined with stage-wise learning to improve feature-masking reconstruction.
Experiments for the object detection task demonstrate the promise of our approach.
arXiv Detail & Related papers (2024-07-18T04:19:14Z) - Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation [16.35460453348319]
This paper introduces a self-knowledge distillation (SKD) method that guides the frame-level alignment during the training time.
Overall, our approach is effective in improving both the resource efficiency as well as performance.
arXiv Detail & Related papers (2024-06-12T06:22:52Z) - Mutual Distillation Learning For Person Re-Identification [27.350415735863184]
We propose a novel approach, Mutual Distillation Learning for Person Re-identification (termed MDPR).
Our approach encompasses two branches: a hard content branch that extracts local features via a uniform horizontal partitioning strategy, and a soft content branch that dynamically distinguishes between foreground and background.
Our method achieves an impressive 88.7%/94.4% in mAP/Rank-1 on the DukeMTMC-reID dataset, surpassing the current state-of-the-art results.
arXiv Detail & Related papers (2024-01-12T07:49:02Z) - CORSD: Class-Oriented Relational Self Distillation [16.11986532440837]
Knowledge distillation is an effective model compression method, but it has some limitations.
We propose a novel training framework named Class-Oriented Relational Self Distillation (CORSD) to address these limitations.
arXiv Detail & Related papers (2023-04-28T16:00:31Z) - EmbedDistill: A Geometric Knowledge Distillation for Information
Retrieval [83.79667141681418]
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR).
We propose a novel distillation approach that leverages the relative geometry among queries and documents learned by the large teacher model.
We show that our approach successfully distills from both dual-encoder (DE) and cross-encoder (CE) teacher models to 1/10th size asymmetric students that can retain 95-97% of the teacher performance.
arXiv Detail & Related papers (2023-01-27T22:04:37Z) - USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text
Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Learning What Not to Segment: A New Perspective on Few-Shot Segmentation [63.910211095033596]
Recently few-shot segmentation (FSS) has been extensively developed.
This paper proposes a fresh and straightforward insight to alleviate the problem.
In light of the unique nature of the proposed approach, we also extend it to a more realistic but challenging setting.
arXiv Detail & Related papers (2022-03-15T03:08:27Z) - Self-Feature Regularization: Self-Feature Distillation Without Teacher
Models [0.0]
Self-Feature Regularization (SFR) is proposed, which uses features in the deep layers to supervise feature learning in the shallow layers.
We first use a generalization-l2 loss to match local features and a many-to-one approach to distill more intensively in the channel dimension.
arXiv Detail & Related papers (2021-03-12T15:29:00Z) - Knowledge Distillation Meets Self-Supervision [109.6400639148393]
Knowledge distillation involves extracting "dark knowledge" from a teacher network to guide the learning of a student network.
We show that the seemingly different self-supervision task can serve as a simple yet powerful solution.
By exploiting the similarity between those self-supervision signals as an auxiliary task, one can effectively transfer the hidden information from the teacher to the student (a rough sketch of this idea follows this list).
arXiv Detail & Related papers (2020-06-12T12:18:52Z)