Related papers: Cycle Self-Training for Semi-Supervised Object Detection with Distribution Consistency Reweighting

Cycle Self-Training for Semi-Supervised Object Detection with Distribution Consistency Reweighting

URL: http://arxiv.org/abs/2207.05334v1
Date: Tue, 12 Jul 2022 06:16:48 GMT
Title: Cycle Self-Training for Semi-Supervised Object Detection with Distribution Consistency Reweighting
Authors: Hao Liu, Bin Chen, Bo Wang, Chunpeng Wu, Feng Dai, Peng Wu
Abstract summary: We propose a Cycle Self-Training (CST) framework for semi-supervised object detection (SSOD) CST consists of two teachers T1 and T2, two students S1 and S2 and based on these networks, a cycle self-training mechanism is built. Experiments prove the superiority of CST by consistently improving the AP over the baseline and outperforming state-of-the-art methods by 2.1% absolute AP improvements with scarce labeled data.
Score: 16.63549313217797
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, many semi-supervised object detection (SSOD) methods adopt teacher-student framework and have achieved state-of-the-art results. However, the teacher network is tightly coupled with the student network since the teacher is an exponential moving average (EMA) of the student, which causes a performance bottleneck. To address the coupling problem, we propose a Cycle Self-Training (CST) framework for SSOD, which consists of two teachers T1 and T2, two students S1 and S2. Based on these networks, a cycle self-training mechanism is built, i.e., S1${\rightarrow}$T1${\rightarrow}$S2${\rightarrow}$T2${\rightarrow}$S1. For S${\rightarrow}$T, we also utilize the EMA weights of the students to update the teachers. For T${\rightarrow}$S, instead of providing supervision for its own student S1(S2) directly, the teacher T1(T2) generates pseudo-labels for the student S2(S1), which looses the coupling effect. Moreover, owing to the property of EMA, the teacher is most likely to accumulate the biases from the student and make the mistakes irreversible. To mitigate the problem, we also propose a distribution consistency reweighting strategy, where pseudo-labels are reweighted based on distribution consistency across the teachers T1 and T2. With the strategy, the two students S2 and S1 can be trained robustly with noisy pseudo labels to avoid confirmation biases. Extensive experiments prove the superiority of CST by consistently improving the AP over the baseline and outperforming state-of-the-art methods by 2.1% absolute AP improvements with scarce labeled data.

Related papers

Does Weak-to-strong Generalization Happen under Spurious Correlations? [17.02943058643617]
Key problem in weak-to-strong (W2S) generalization: when fine-tuning a strong pre-trained student with pseudolabels from a weaker teacher on a downstream task with spurious correlations, does W2S happen, and how to improve it upon failures?<n>We consider two sources of spurious correlations caused by group imbalance: (i) a weak teacher fine-tuned on group-imbalanced labeled data with a minority group of fraction $eta_ell$, and (ii) a group-imbalanced unlabeled set pseudolabeled by teacher with minority fraction $eta_u$.
arXiv Detail & Related papers (2025-09-28T17:57:49Z)
ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via $α$-$β$-Divergence [89.630486749083]
Knowledge Distillation (KD) transfers knowledge from a large teacher model to a smaller student model.<n>The core challenge in KD lies in balancing two mode-concentration effects.<n>We propose ABKD, a generic framework with $alpha$$beta$-divergence.
arXiv Detail & Related papers (2025-05-07T16:48:49Z)
Alice: Proactive Learning with Teacher's Demonstrations for Weak-to-Strong Generalization [69.96794098855938]
Weak-to-strong generalization (W2SG) offers a promising framework for supervising increasingly capable language models (LLMs)<n>Traditional W2SG methods rely on passive learning, where a weak teacher provides noisy demonstrations to train a strong student.<n>We introduce Alice, a framework that leverages complementary knowledge between teacher and student to enhance the learning process.
arXiv Detail & Related papers (2025-04-09T22:33:06Z)
Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios [3.818273633647809]
We propose a three-component framework leveraging three signal types. The first signal is the student's self-consistency (consistency of student multiple outputs), which is a proxy of the student's confidence. We show that our proposed two-stage framework brings a relative improvement of up to 20.79% compared to fine-tuning without any signals across datasets.
arXiv Detail & Related papers (2024-06-08T02:17:43Z)
Switching Temporary Teachers for Semi-Supervised Semantic Segmentation [45.20519672287495]
The teacher-student framework, prevalent in semi-supervised semantic segmentation, mainly employs the exponential moving average (EMA) to update a single teacher's weights based on the student's. This paper introduces Dual Teacher, a simple yet effective approach that employs dual temporary teachers aiming to alleviate the coupling problem for the student.
arXiv Detail & Related papers (2023-10-28T08:49:16Z)
Towards End-to-end Semi-supervised Learning for One-stage Object Detection [88.56917845580594]
This paper focuses on the semi-supervised learning for the advanced and popular one-stage detection network YOLOv5. We propose a novel teacher-student learning recipe called OneTeacher with two innovative designs, namely Multi-view Pseudo-label Refinement (MPR) and Decoupled Semi-supervised Optimization (DSO) In particular, MPR improves the quality of pseudo-labels via augmented-view refinement and global-view filtering, and DSO handles the joint optimization conflicts via structure tweaks and task-specific pseudo-labeling.
arXiv Detail & Related papers (2023-02-22T11:35:40Z)
Distantly-Supervised Named Entity Recognition with Adaptive Teacher Learning and Fine-grained Student Ensemble [56.705249154629264]
Self-training teacher-student frameworks are proposed to improve the robustness of NER models. In this paper, we propose an adaptive teacher learning comprised of two teacher-student networks. Fine-grained student ensemble updates each fragment of the teacher model with a temporal moving average of the corresponding fragment of the student, which enhances consistent predictions on each model fragment against noise.
arXiv Detail & Related papers (2022-12-13T12:14:09Z)
DTG-SSOD: Dense Teacher Guidance for Semi-Supervised Object Detection [42.75316070378037]
Mean-Teacher (MT) scheme is widely adopted in semi-supervised object detection (SSOD) In this paper, we propose the Inverse NMS Clustering (INC) and Rank Matching (RM) to instantiate the dense supervision.
arXiv Detail & Related papers (2022-07-12T13:54:54Z)
Label Matching Semi-Supervised Object Detection [85.99282969977541]
Semi-supervised object detection has made significant progress with the development of mean teacher driven self-training. Label mismatch problem is not yet fully explored in the previous works, leading to severe confirmation bias during self-training. We propose a simple yet effective LabelMatch framework from two different yet complementary perspectives.
arXiv Detail & Related papers (2022-06-14T05:59:41Z)
Adversarial Dual-Student with Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation [70.2166826794421]
We propose a differentiable geometric warping to conduct unsupervised data augmentation. We also propose a novel adversarial dual-student framework to improve the Mean-Teacher. Our solution significantly improves the performance and state-of-the-art results are achieved on both datasets.
arXiv Detail & Related papers (2022-03-05T17:36:17Z)
Graph Consistency based Mean-Teaching for Unsupervised Domain Adaptive Person Re-Identification [54.58165777717885]
This paper proposes a Graph Consistency based Mean-Teaching (GCMT) method with constructing the Graph Consistency Constraint (GCC) between teacher and student networks. Experiments on three datasets, i.e., Market-1501, DukeMTMCreID, and MSMT17, show that proposed GCMT outperforms state-of-the-art methods by clear margin.
arXiv Detail & Related papers (2021-05-11T04:09:49Z)
Unbiased Teacher for Semi-Supervised Object Detection [50.0087227400306]
We revisit the Semi-Supervised Object Detection (SS-OD) and identify the pseudo-labeling bias issue in SS-OD. We introduce Unbiased Teacher, a simple yet effective approach that jointly trains a student and a gradually progressing teacher in a mutually-beneficial manner.
arXiv Detail & Related papers (2021-02-18T17:02:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.