Temporal Self-Ensembling Teacher for Semi-Supervised Object Detection
- URL: http://arxiv.org/abs/2007.06144v3
- Date: Wed, 2 Sep 2020 09:26:25 GMT
- Title: Temporal Self-Ensembling Teacher for Semi-Supervised Object Detection
- Authors: Cong Chen, Shouyang Dong, Ye Tian, Kunlin Cao, Li Liu and Yuanhao Guo
- Abstract summary: This paper focuses on Semi-Supervised Object Detection (SSOD).
The teacher model serves a dual role as a teacher and a student.
The class imbalance issue in SSOD hinders an efficient knowledge transfer from teacher to student.
- Score: 9.64328205496046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper focuses on Semi-Supervised Object Detection (SSOD). Knowledge
Distillation (KD) has been widely used for semi-supervised image
classification. However, adapting these methods for SSOD has the following
obstacles. (1) The teacher model serves a dual role as a teacher and a student,
such that the teacher predictions on unlabeled images may be very close to
those of the student, which limits the upper bound of the student. (2) The class
imbalance issue in SSOD hinders an efficient knowledge transfer from teacher to
student. To address these problems, we propose a novel method Temporal
Self-Ensembling Teacher (TSE-T) for SSOD. Unlike previous KD-based
methods, we devise a temporally evolved teacher model. First, our teacher model
ensembles its temporal predictions for unlabeled images under stochastic
perturbations. Second, our teacher model ensembles its temporal model weights
with the student model weights by an exponential moving average (EMA) which
allows the teacher to gradually learn from the student. These self-ensembling
strategies increase data and model diversity, thus improving teacher
predictions on unlabeled images. Finally, we use focal loss to formulate the
consistency regularization term to handle the data imbalance problem, which is
a more efficient way to exploit the useful information in unlabeled images
than simple hard-thresholding, which preserves only confident
predictions. Evaluated on the widely used VOC and COCO benchmarks, the mAP of
our method has achieved 80.73% and 40.52% on the VOC2007 test set and the
COCO2014 minval5k set respectively, which outperforms a strong fully-supervised
detector by 2.37% and 1.49%. Furthermore, our method sets a new
state-of-the-art in SSOD on the VOC2007 test set, outperforming the baseline
SSOD method by 1.44%. The source code of this work is publicly available at
http://github.com/syangdong/tse-t.
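The two self-ensembling strategies and the focal-loss consistency term described in the abstract can be sketched compactly. The following is a minimal NumPy illustration, not the authors' implementation: the EMA decay `alpha`, the prediction-ensembling `decay`, and the focal-loss form (`gamma`, soft teacher targets) are illustrative assumptions chosen for clarity.

```python
import numpy as np

def ema_update(teacher_w, student_w, alpha=0.999):
    """Temporal ensembling of model weights: the teacher is an
    exponential moving average (EMA) of the student,
    teacher <- alpha * teacher + (1 - alpha) * student."""
    return {k: alpha * teacher_w[k] + (1 - alpha) * student_w[k]
            for k in teacher_w}

def ensemble_predictions(pred_history, decay=0.6):
    """Temporal ensembling of per-image teacher predictions gathered
    under stochastic perturbations; accumulate an EMA over epochs and
    apply bias correction so early steps are not under-weighted."""
    z = np.zeros_like(pred_history[0])
    for p in pred_history:
        z = decay * z + (1 - decay) * p
    z /= (1.0 - decay ** len(pred_history))  # bias correction
    return z

def focal_consistency(student_probs, teacher_probs, gamma=2.0, eps=1e-7):
    """Focal-weighted consistency loss between student predictions and
    soft teacher targets. The (1 - p)^gamma factor down-weights easy,
    confidently predicted (mostly background) boxes, so the rare
    foreground classes dominate the consistency term instead of being
    drowned out, which is the class-imbalance fix named in the paper."""
    ce = -teacher_probs * np.log(student_probs + eps)
    weight = (1.0 - student_probs) ** gamma
    return float((weight * ce).sum(axis=-1).mean())
```

Compared with hard-thresholding, which discards every teacher prediction below a confidence cutoff, the focal weighting keeps all soft predictions and merely re-scales their contribution, so low-confidence but informative boxes still provide a training signal.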
Related papers
- Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling [81.00825302340984]
We introduce Speculative Knowledge Distillation (SKD) to generate high-quality training data on-the-fly.
In SKD, the student proposes tokens, and the teacher replaces poorly ranked ones based on its own distribution.
We evaluate SKD on various text generation tasks, including translation, summarization, math, and instruction following.
arXiv Detail & Related papers (2024-10-15T06:51:25Z) - Domain-Adaptive 2D Human Pose Estimation via Dual Teachers in Extremely Low-Light Conditions [65.0109231252639]
Recent studies on low-light pose estimation require the use of paired well-lit and low-light images with ground truths for training.
Our primary novelty lies in leveraging two complementary-teacher networks to generate more reliable pseudo labels.
Our method achieves 6.8% (2.4 AP) improvement over the state-of-the-art (SOTA) method.
arXiv Detail & Related papers (2024-07-22T08:09:14Z) - Improve Knowledge Distillation via Label Revision and Data Selection [37.74822443555646]
This paper proposes to rectify the teacher's inaccurate predictions using the ground truth.
It also introduces a data selection technique to choose suitable training samples to be supervised by the teacher.
Experiment results demonstrate the effectiveness of our proposed method, and show that our method can be combined with other distillation approaches.
arXiv Detail & Related papers (2024-04-03T02:41:16Z) - Periodically Exchange Teacher-Student for Source-Free Object Detection [7.222926042027062]
Source-free object detection (SFOD) aims to adapt the source detector to unlabeled target domain data in the absence of source domain data.
Most SFOD methods follow the same self-training paradigm using the mean-teacher (MT) framework, where the student model is guided by only a single teacher model.
We propose the Periodically Exchange Teacher-Student (PETS) method, a simple yet novel approach that introduces a multiple-teacher framework consisting of a static teacher, a dynamic teacher, and a student model.
arXiv Detail & Related papers (2023-11-23T11:30:54Z) - Switching Temporary Teachers for Semi-Supervised Semantic Segmentation [45.20519672287495]
The teacher-student framework, prevalent in semi-supervised semantic segmentation, mainly employs the exponential moving average (EMA) to update a single teacher's weights based on the student's.
This paper introduces Dual Teacher, a simple yet effective approach that employs dual temporary teachers aiming to alleviate the coupling problem for the student.
arXiv Detail & Related papers (2023-10-28T08:49:16Z) - Source-Free Domain Adaptive Fundus Image Segmentation with Class-Balanced Mean Teacher [37.72463382440212]
This paper studies source-free domain adaptive fundus image segmentation.
It aims to adapt a pretrained fundus segmentation model to a target domain using unlabeled images.
arXiv Detail & Related papers (2023-07-14T09:26:19Z) - Semi-Supervised 2D Human Pose Estimation Driven by Position Inconsistency Pseudo Label Correction Module [74.80776648785897]
Previous methods ignored two problems, including: (i) when conducting interactive training between a large model and a lightweight model, the pseudo labels of the lightweight model are used to guide the large model.
We propose a semi-supervised 2D human pose estimation framework driven by a position inconsistency pseudo label correction module (SSPCM)
To further improve the performance of the student model, we use the semi-supervised Cut-Occlude based on pseudo keypoint perception to generate more hard and effective samples.
arXiv Detail & Related papers (2023-03-08T02:57:05Z) - Distantly-Supervised Named Entity Recognition with Adaptive Teacher Learning and Fine-grained Student Ensemble [56.705249154629264]
Self-training teacher-student frameworks are proposed to improve the robustness of NER models.
In this paper, we propose an adaptive teacher learning method comprised of two teacher-student networks.
Fine-grained student ensemble updates each fragment of the teacher model with a temporal moving average of the corresponding fragment of the student, which enhances consistent predictions on each model fragment against noise.
arXiv Detail & Related papers (2022-12-13T12:14:09Z) - Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation [70.92135839545314]
We propose the dynamic prior knowledge (DPK), which integrates part of teacher's features as the prior knowledge before the feature distillation.
Our DPK makes the performance of the student model positively correlated with that of the teacher model, which means that we can further boost the accuracy of students by applying larger teachers.
arXiv Detail & Related papers (2022-06-13T11:52:13Z) - Unbiased Teacher for Semi-Supervised Object Detection [50.0087227400306]
We revisit the Semi-Supervised Object Detection (SS-OD) and identify the pseudo-labeling bias issue in SS-OD.
We introduce Unbiased Teacher, a simple yet effective approach that jointly trains a student and a gradually progressing teacher in a mutually-beneficial manner.
arXiv Detail & Related papers (2021-02-18T17:02:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided (including all generated summaries) and is not responsible for any consequences.