SLaM: Student-Label Mixing for Distillation with Unlabeled Examples
- URL: http://arxiv.org/abs/2302.03806v2
- Date: Thu, 8 Jun 2023 18:00:24 GMT
- Title: SLaM: Student-Label Mixing for Distillation with Unlabeled Examples
- Authors: Vasilis Kontonis, Fotis Iliopoulos, Khoa Trinh, Cenk Baykal, Gaurav
Menghani, Erik Vee
- Abstract summary: We present a principled method for knowledge distillation with unlabeled examples that we call Student-Label Mixing (SLaM).
Evaluations on several standard benchmarks show that SLaM consistently improves over prior approaches.
We give an algorithm improving the best-known sample complexity for learning halfspaces with margin under random classification noise.
- Score: 15.825078347452024
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation with unlabeled examples is a powerful training
paradigm for generating compact and lightweight student models in applications
where the amount of labeled data is limited but one has access to a large pool
of unlabeled data. In this setting, a large teacher model generates ``soft''
pseudo-labels for the unlabeled dataset which are then used for training the
student model. Despite its success in a wide variety of applications, a
shortcoming of this approach is that the teacher's pseudo-labels are often
noisy, leading to impaired student performance. In this paper, we present a
principled method for knowledge distillation with unlabeled examples that we
call Student-Label Mixing (SLaM) and we show that it consistently improves over
prior approaches by evaluating it on several standard benchmarks. Finally, we
show that SLaM comes with theoretical guarantees; along the way we give an
algorithm improving the best-known sample complexity for learning halfspaces
with margin under random classification noise, and provide the first
convergence analysis for so-called ``forward loss-adjustment'' methods.
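A minimal sketch of the distillation-with-unlabeled-examples loop described above, with a hypothetical student-label mixing target (a convex combination of the teacher's soft pseudo-label and the student's own detached prediction, with coefficient `alpha`); this is an illustrative assumption, not necessarily the exact rule from the paper.

```python
import torch
import torch.nn.functional as F

def slam_style_loss(student_logits, teacher_probs, alpha=0.5):
    """Illustrative distillation loss on unlabeled examples.

    The soft target mixes the teacher's pseudo-label with the student's own
    (detached) prediction; `alpha` is a hypothetical mixing coefficient and
    this is not necessarily the rule used in the SLaM paper.
    """
    student_probs = F.softmax(student_logits, dim=-1).detach()
    mixed_target = alpha * teacher_probs + (1.0 - alpha) * student_probs
    log_probs = F.log_softmax(student_logits, dim=-1)
    return -(mixed_target * log_probs).sum(dim=-1).mean()

# toy usage: the teacher pseudo-labels an unlabeled batch, the student trains on it
unlabeled_x = torch.randn(8, 16)
teacher = torch.nn.Linear(16, 10)   # stand-in for a large teacher model
student = torch.nn.Linear(16, 10)   # stand-in for a compact student model
with torch.no_grad():
    teacher_probs = F.softmax(teacher(unlabeled_x), dim=-1)  # soft pseudo-labels
loss = slam_style_loss(student(unlabeled_x), teacher_probs, alpha=0.5)
loss.backward()
```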
Related papers
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike in single-label semi-supervised learning, one cannot simply select the most probable label as the pseudo-label in SSMLL, because an instance can carry multiple semantics.
We propose a dual-perspective method to generate high-quality pseudo-labels.
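To make the argmax limitation concrete, here is a toy comparison (not the paper's dual-perspective method) between single-label-style argmax pseudo-labeling and simple per-class thresholding on a multi-label instance; the threshold `tau` is a hypothetical placeholder.

```python
import torch

# toy multi-label probabilities for one unlabeled instance over 5 classes
probs = torch.tensor([0.92, 0.81, 0.10, 0.55, 0.03])

# single-label style: argmax keeps only one class, even though
# several labels are clearly relevant to this instance
argmax_pseudo = torch.zeros_like(probs)
argmax_pseudo[probs.argmax()] = 1.0            # -> only class 0

# multi-label style: threshold each class independently
# (`tau` is a hypothetical fixed threshold; SSMLL methods use more
#  careful, e.g. class- or instance-adaptive, rules)
tau = 0.5
threshold_pseudo = (probs > tau).float()       # -> classes 0, 1 and 3

print(argmax_pseudo, threshold_pseudo)
```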
arXiv Detail & Related papers (2024-07-26T09:33:53Z)
- Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
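As a hedged sketch of what "class-aware" pseudo-labeling can look like, the snippet below assigns each class its own number of positive pseudo-labels based on an assumed class-prior estimate; the quantile-style rule and `class_priors` are illustrative assumptions, not the exact CAP procedure.

```python
import torch

def class_aware_pseudo_labels(probs, class_priors):
    """Illustrative class-aware pseudo-labeling for multi-label data.

    probs:        (N, C) predicted probabilities on unlabeled examples
    class_priors: (C,)   assumed fraction of positives per class
    For each class c, the top class_priors[c] fraction of examples is marked
    positive; this is a sketch, not the CAP paper's exact procedure.
    """
    n, _ = probs.shape
    pseudo = torch.zeros_like(probs)
    for j, prior in enumerate(class_priors.tolist()):
        k = max(1, int(round(prior * n)))
        top_idx = probs[:, j].topk(k).indices
        pseudo[top_idx, j] = 1.0
    return pseudo

probs = torch.rand(100, 4)                             # toy unlabeled predictions
class_priors = torch.tensor([0.10, 0.30, 0.05, 0.50])  # assumed label frequencies
print(class_aware_pseudo_labels(probs, class_priors).mean(dim=0))
```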
arXiv Detail & Related papers (2023-05-04T12:52:18Z)
- SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
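A minimal sketch of the quantity-quality idea: instead of a hard confidence threshold that discards uncertain pseudo-labels, every unlabeled example receives a soft weight that decays with its confidence. The Gaussian-shaped weight and the fixed `mu`/`sigma` below are illustrative placeholders; SoftMatch derives such statistics from the model during training.

```python
import torch

def soft_sample_weights(confidence, mu, sigma):
    """Soft weights for pseudo-labeled examples (illustrative).

    Confidence at or above `mu` gets full weight; below `mu` the weight decays
    smoothly instead of being cut to zero by a hard threshold, so low-confidence
    examples are down-weighted rather than discarded.
    """
    w = torch.exp(-((confidence - mu) ** 2) / (2 * sigma ** 2))
    return torch.where(confidence >= mu, torch.ones_like(w), w)

confidence = torch.tensor([0.99, 0.80, 0.55, 0.30])
print(soft_sample_weights(confidence, mu=0.85, sigma=0.15))
```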
arXiv Detail & Related papers (2023-01-26T03:53:25Z)
- Weighted Distillation with Unlabeled Examples [15.825078347452024]
Distillation with unlabeled examples is a popular and powerful method for training deep neural networks in settings where the amount of labeled data is limited.
This paper proposes a principled approach for addressing the noise in the teacher's pseudo-labels, based on a ``debiasing'' reweighting of the student's loss function tailored to the distillation training paradigm.
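A minimal sketch of per-example loss reweighting in distillation with unlabeled examples; the confidence-based weights below are a placeholder, not the paper's debiasing weights, which are derived from an estimate of the teacher's noise.

```python
import torch
import torch.nn.functional as F

def weighted_distillation_loss(student_logits, teacher_probs, weights):
    """Per-example weighted cross-entropy against the teacher's pseudo-labels."""
    log_probs = F.log_softmax(student_logits, dim=-1)
    per_example = -(teacher_probs * log_probs).sum(dim=-1)   # shape (N,)
    return (weights * per_example).mean()

# toy batch of unlabeled examples pseudo-labeled by a teacher
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_probs = F.softmax(torch.randn(4, 10), dim=-1)

# placeholder weights: here simply the teacher's confidence,
# NOT the paper's debiasing scheme
weights = teacher_probs.max(dim=-1).values
weighted_distillation_loss(student_logits, teacher_probs, weights).backward()
```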
arXiv Detail & Related papers (2022-10-13T04:08:56Z)
- One Positive Label is Sufficient: Single-Positive Multi-Label Learning with Label Enhancement [71.9401831465908]
We investigate single-positive multi-label learning (SPMLL) where each example is annotated with only one relevant label.
A novel method, Single-positive MultI-label learning with Label Enhancement, is proposed.
Experiments on benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-06-01T14:26:30Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose Dash, a semi-supervised learning (SSL) approach that selects which unlabeled examples to train on using a dynamically adjusted threshold, making the selection of unlabeled data adaptive throughout training.
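A minimal sketch of dynamic thresholding: an unlabeled example is kept for training only if its current loss against its pseudo-label falls below a threshold that shrinks as training proceeds. The exponential schedule and constants below are illustrative assumptions, not Dash's exact rule.

```python
import torch

def dash_style_mask(per_example_loss, rho_init, gamma, step):
    """Select unlabeled examples whose loss is below a shrinking threshold.

    `rho_init`, `gamma` and the exponential decay are placeholders for the kind
    of schedule Dash uses; the paper specifies the actual threshold sequence.
    """
    rho_t = rho_init * (gamma ** step)      # threshold shrinks over time
    return per_example_loss <= rho_t        # boolean mask of retained examples

losses = torch.tensor([0.2, 1.5, 0.7, 3.0])
for step in range(3):
    mask = dash_style_mask(losses, rho_init=2.0, gamma=0.7, step=step)
    print(step, mask.tolist())   # only low-loss examples survive at later steps
```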
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
- GuidedMix-Net: Learning to Improve Pseudo Masks Using Labeled Images as Reference [153.354332374204]
We propose a novel method for semi-supervised semantic segmentation named GuidedMix-Net.
We first introduce a feature alignment objective between labeled and unlabeled data to capture potentially similar image pairs.
MITrans is shown to be a powerful knowledge module for further progressively refining the features of unlabeled data.
Along with supervised learning for labeled data, the prediction of unlabeled data is jointly learned with the generated pseudo masks.
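A hedged sketch of the mixing idea for semi-supervised segmentation: labeled and unlabeled images are interpolated, and the unlabeled branch is trained against pseudo masks jointly with the supervised loss. The arbitrary pairing and the coefficient `lam` below are illustrative; the feature-alignment objective and MITrans module are not reproduced here.

```python
import torch
import torch.nn.functional as F

# toy segmentation batch: (B, 3, H, W) images, (B, H, W) masks, 3 classes
labeled_x = torch.randn(2, 3, 32, 32)
labeled_y = torch.randint(0, 3, (2, 32, 32))
unlabeled_x = torch.randn(2, 3, 32, 32)

model = torch.nn.Conv2d(3, 3, kernel_size=1)    # stand-in segmentation network

# pseudo masks for the unlabeled images (no gradient through this pass)
with torch.no_grad():
    pseudo_y = model(unlabeled_x).argmax(dim=1)

# mix each labeled image with an (arbitrarily paired) unlabeled image;
# GuidedMix-Net instead pairs potentially similar images via feature alignment
lam = 0.7                                       # hypothetical mixing coefficient
mixed_x = lam * labeled_x + (1 - lam) * unlabeled_x

sup_loss = F.cross_entropy(model(labeled_x), labeled_y)      # labeled branch
unsup_loss = F.cross_entropy(model(mixed_x), pseudo_y)       # pseudo-mask branch
(sup_loss + unsup_loss).backward()
```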
arXiv Detail & Related papers (2021-06-29T02:48:45Z)
- Learning from Noisy Labels for Entity-Centric Information Extraction [17.50856935207308]
We propose a simple co-regularization framework for entity-centric information extraction, consisting of several neural models with identical structure.
These models are jointly optimized with the task-specific loss and are regularized to generate similar predictions.
In the end, we can take any of the trained models for inference.
arXiv Detail & Related papers (2021-04-17T22:49:12Z)
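A minimal sketch of co-regularization on noisy labels: two identically structured models are trained with their task losses plus an agreement term that penalizes divergence between their predictions; either model can then be used at inference. The symmetric-KL agreement term and its weight `lam` are illustrative choices, not necessarily the paper's exact regularizer.

```python
import torch
import torch.nn.functional as F

model_a = torch.nn.Linear(16, 5)                 # two models, identical structure
model_b = torch.nn.Linear(16, 5)

x = torch.randn(8, 16)
noisy_y = torch.randint(0, 5, (8,))              # labels may contain noise

logits_a, logits_b = model_a(x), model_b(x)

# task-specific loss for each model on the (noisy) labels
task_loss = F.cross_entropy(logits_a, noisy_y) + F.cross_entropy(logits_b, noisy_y)

# agreement regularizer: symmetric KL between the two predictive distributions
# (the exact form and weight `lam` are illustrative assumptions)
log_p_a = F.log_softmax(logits_a, dim=-1)
log_p_b = F.log_softmax(logits_b, dim=-1)
agree = (F.kl_div(log_p_a, log_p_b, log_target=True, reduction="batchmean")
         + F.kl_div(log_p_b, log_p_a, log_target=True, reduction="batchmean"))

lam = 1.0
(task_loss + lam * agree).backward()
# at inference time, either model_a or model_b can be used on its own
```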