Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech
Recognition
- URL: http://arxiv.org/abs/2308.06547v1
- Date: Sat, 12 Aug 2023 12:13:52 GMT
- Title: Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech
Recognition
- Authors: Han Zhu, Dongji Gao, Gaofeng Cheng, Daniel Povey, Pengyuan Zhang,
Yonghong Yan
- Abstract summary: When labeled data is insufficient, semi-supervised learning with the pseudo-labeling technique can significantly improve the performance of automatic speech recognition.
Taking noisy labels as ground-truth in the loss function results in suboptimal performance.
We propose a novel framework named alternative pseudo-labeling to tackle the issue of noisy pseudo-labels.
- Score: 49.42732949233184
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When labeled data is insufficient, semi-supervised learning with the
pseudo-labeling technique can significantly improve the performance of
automatic speech recognition. However, pseudo-labels are often noisy,
containing numerous incorrect tokens. Taking noisy labels as ground-truth in
the loss function results in suboptimal performance. Previous works attempted
to mitigate this issue by either filtering out the noisiest pseudo-labels or
improving the overall quality of pseudo-labels. While these methods are
effective to some extent, it is unrealistic to entirely eliminate incorrect
tokens in pseudo-labels. In this work, we propose a novel framework named
alternative pseudo-labeling to tackle the issue of noisy pseudo-labels from the
perspective of the training objective. The framework comprises several
components. Firstly, a generalized CTC loss function is introduced to handle
noisy pseudo-labels by accepting alternative tokens in the positions of
incorrect tokens. Applying this loss function in pseudo-labeling requires
detecting incorrect tokens in the predicted pseudo-labels. In this work, we
adopt a confidence-based error detection method that identifies the incorrect
tokens by comparing their confidence scores with a given threshold, thus
necessitating the confidence score to be discriminative. Hence, the second
proposed technique is the contrastive CTC loss function that widens the
confidence gap between the correctly and incorrectly predicted tokens, thereby
improving the error detection ability. Additionally, obtaining satisfactory
performance with confidence-based error detection typically requires extensive
threshold tuning. Instead, we propose an automatic thresholding method that
uses labeled data as a proxy for determining the threshold, thereby removing
the need for manual tuning.
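The generalized CTC loss is only described here at a high level. As a minimal sketch of the idea, the numpy CTC forward pass below treats label positions flagged as incorrect as wildcards that may emit any non-blank token; the function name `generalized_ctc_loss`, the wildcard semantics, and the simplified skip rule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

BLANK = 0  # index of the CTC blank symbol (assumed convention)

def generalized_ctc_loss(log_probs, labels, wildcard):
    """Negative CTC log-likelihood where labels[i] with
    wildcard[i] == True may be replaced by any non-blank token.

    log_probs: (T, V) per-frame log-softmax outputs
    labels:    (L,) pseudo-label token ids, L >= 1
    wildcard:  (L,) bool mask marking tokens flagged as incorrect
    Single-utterance sketch; real systems batch this in optimized
    log-space kernels.
    """
    T, V = log_probs.shape
    # Extended sequence: blank, y1, blank, y2, ..., blank.
    ext = [(BLANK, False)]
    for y, w in zip(labels, wildcard):
        ext.append((int(y), bool(w)))
        ext.append((BLANK, False))
    S = len(ext)

    def emit(t, s):
        tok, wild = ext[s]
        if wild:
            # Wildcard: marginalize over all non-blank tokens.
            p = log_probs[t].copy()
            p[BLANK] = -np.inf
            return np.logaddexp.reduce(p)
        return log_probs[t, tok]

    alpha = np.full((T, S), -np.inf)
    alpha[0, 0] = emit(0, 0)
    alpha[0, 1] = emit(0, 1)
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1, s]
            if s >= 1:
                a = np.logaddexp(a, alpha[t - 1, s - 1])
            # Skip transition only between distinct, non-blank,
            # non-wildcard labels (a simplifying assumption).
            if (s >= 2 and ext[s][0] != BLANK and not ext[s][1]
                    and not ext[s - 2][1] and ext[s][0] != ext[s - 2][0]):
                a = np.logaddexp(a, alpha[t - 1, s - 2])
            alpha[t, s] = a + emit(t, s)
    return -np.logaddexp(alpha[T - 1, S - 1], alpha[T - 1, S - 2])
```

Marginalizing over alternatives keeps the objective a proper likelihood; a faithful implementation would also refine the skip rule around wildcard positions.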
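The contrastive CTC loss is likewise characterized only by its goal of widening the confidence gap between correctly and incorrectly predicted tokens. A toy pairwise-margin penalty conveys that effect; the hinge form, margin value, and function name are assumptions rather than the paper's loss, and in training it would be computed on framework tensors so that it is differentiable.

```python
import numpy as np

def contrastive_confidence_penalty(conf_correct, conf_wrong, margin=0.2):
    """Mean hinge penalty over all (correct, wrong) token pairs,
    pushing correct-token confidences above wrong-token confidences
    by at least `margin`.

    conf_correct: (C,) confidences of tokens known to be correct
    conf_wrong:   (W,) confidences of tokens known to be wrong
    """
    conf_correct = np.asarray(conf_correct, dtype=float)
    conf_wrong = np.asarray(conf_wrong, dtype=float)
    # Penalize pairs where wrong - correct + margin > 0.
    diff = conf_wrong[None, :] - conf_correct[:, None] + margin
    return np.maximum(diff, 0.0).mean()
```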
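Finally, the confidence-based error detection and automatic thresholding steps can be sketched together: decode the labeled set, mark which hypothesis tokens are actually wrong (e.g., by edit-distance alignment against the references), and choose the threshold that best separates correct from incorrect tokens. The F1 selection criterion and helper names below are assumptions; the abstract states only that labeled data serves as a proxy for the threshold.

```python
import numpy as np

def tune_threshold(confidences, is_correct):
    """Pick a confidence threshold on labeled data used as a proxy,
    so no manual tuning is needed. Tokens whose confidence falls
    below the returned threshold are flagged as incorrect.

    confidences: (N,) per-token confidence scores from decoding the
                 labeled set
    is_correct:  (N,) bool, whether each token matched the reference
    """
    confidences = np.asarray(confidences, dtype=float)
    is_wrong = ~np.asarray(is_correct, dtype=bool)
    best_tau, best_f1 = 0.0, -1.0
    for tau in np.unique(confidences):
        flagged = confidences < tau        # predicted "incorrect"
        tp = np.sum(flagged & is_wrong)
        fp = np.sum(flagged & ~is_wrong)
        fn = np.sum(~flagged & is_wrong)
        f1 = 2 * tp / max(2 * tp + fp + fn, 1)
        if f1 > best_f1:
            best_tau, best_f1 = tau, f1
    return best_tau

def flag_incorrect_tokens(confidences, tau):
    """Apply the tuned threshold to pseudo-labels of unlabeled data;
    True marks tokens to treat as wildcards in the generalized CTC
    loss sketched above."""
    return np.asarray(confidences) < tau
```

The tuned threshold then transfers from the labeled proxy set to the pseudo-labels of the unlabeled pool.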
Related papers
- InterBiasing: Boost Unseen Word Recognition through Biasing Intermediate Predictions [5.50485371072671]
Our method improves the recognition accuracy of misrecognized target keywords by substituting intermediate CTC predictions with corrected labels.
Experiments conducted in Japanese demonstrated that our method successfully improved the F1 score for unknown words.
arXiv Detail & Related papers (2024-06-21T06:25:10Z)
- Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels.
We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z)
- Filter and evolve: progressive pseudo label refining for semi-supervised automatic speech recognition [5.735000563764309]
Low quality pseudo labels can misguide decision boundaries and degrade performance.
We propose a simple yet effective strategy to filter low quality pseudo labels.
Experiments on LibriSpeech show that these filtered samples enable the refined model to yield more correct predictions.
arXiv Detail & Related papers (2022-10-28T16:15:58Z)
- Pseudo-Label Noise Suppression Techniques for Semi-Supervised Semantic Segmentation [21.163070161951868]
Semi-supervised learning (SSL) can reduce the need for large labelled datasets by incorporating unlabelled data into the training.
Current SSL approaches use an initially supervised trained model to generate predictions for unlabelled images, called pseudo-labels.
We use three mechanisms to control pseudo-label noise and errors.
arXiv Detail & Related papers (2022-10-19T09:46:27Z)
- Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised Text Recognition [21.583569162994277]
One of the most popular SSL approaches is pseudo-labeling (PL).
PL methods are severely degraded by noise and are prone to over-fitting to noisy labels.
We propose a pseudo-label generation and an uncertainty-based data selection framework for text recognition.
arXiv Detail & Related papers (2022-08-31T02:21:02Z)
- Two Wrongs Don't Make a Right: Combating Confirmation Bias in Learning with Label Noise [6.303101074386922]
Robust Label Refurbishment (Robust LR) is a new hybrid method that integrates pseudo-labeling and confidence estimation techniques to refurbish noisy labels.
We show that our method successfully alleviates the damage of both label noise and confirmation bias.
For example, Robust LR achieves up to 4.5% absolute top-1 accuracy improvement over the previous best on the real-world noisy dataset WebVision.
arXiv Detail & Related papers (2021-12-06T12:10:17Z)
- S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
In the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space.
Our method significantly surpasses previous methods on both CIFAR10 and CIFAR100 with artificial noise and on real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z)
- Rethinking Pseudo Labels for Semi-Supervised Object Detection [84.697097472401]
We introduce certainty-aware pseudo labels tailored for object detection.
We dynamically adjust the thresholds used to generate pseudo labels and reweight loss functions for each category to alleviate the class imbalance problem.
Our approach improves supervised baselines by up to 10% AP using only 1-10% labeled data from COCO.
arXiv Detail & Related papers (2021-06-01T01:32:03Z)
- In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not have this constraint but performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z)
- Learning to Purify Noisy Labels via Meta Soft Label Corrector [49.92310583232323]
Recent deep neural networks (DNNs) can easily overfit to biased training data with noisy labels.
Label correction strategy is commonly used to alleviate this issue.
We propose a meta-learning model which could estimate soft labels through meta-gradient descent step.
arXiv Detail & Related papers (2020-08-03T03:25:17Z)