Semi-Supervised Speech Recognition via Graph-based Temporal
Classification
- URL: http://arxiv.org/abs/2010.15653v2
- Date: Tue, 16 Feb 2021 16:51:50 GMT
- Title: Semi-Supervised Speech Recognition via Graph-based Temporal
Classification
- Authors: Niko Moritz, Takaaki Hori, Jonathan Le Roux
- Abstract summary: Semi-supervised learning has demonstrated promising results in automatic speech recognition by self-training.
The effectiveness of this approach largely relies on the pseudo-label accuracy.
Alternative ASR hypotheses of an N-best list can provide more accurate labels for an unlabeled speech utterance.
- Score: 59.58318952000571
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-supervised learning has demonstrated promising results in automatic
speech recognition (ASR) by self-training using a seed ASR model with
pseudo-labels generated for unlabeled data. The effectiveness of this approach
largely relies on the pseudo-label accuracy, for which typically only the
1-best ASR hypothesis is used. However, alternative ASR hypotheses of an N-best
list can provide more accurate labels for an unlabeled speech utterance and
also reflect uncertainties of the seed ASR model. In this paper, we propose a
generalized form of the connectionist temporal classification (CTC) objective
that accepts a graph representation of the training labels. The newly proposed
graph-based temporal classification (GTC) objective is applied for
self-training with WFST-based supervision, which is generated from an N-best
list of pseudo-labels. In this setup, GTC is used to learn not only a temporal
alignment, similarly to CTC, but also a label alignment to obtain the optimal
pseudo-label sequence from the weighted graph. Results show that this approach
can effectively exploit an N-best list of pseudo-labels with associated scores,
considerably outperforming standard pseudo-labeling, with ASR results
approaching an oracle experiment in which the best hypotheses of the N-best
lists are selected manually.
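To make the objective concrete, here is a minimal PyTorch sketch of the degenerate special case in which the label graph is simply a weighted union of the N-best hypotheses; GTC then reduces to a log-sum-exp over per-hypothesis CTC log-likelihoods, weighted by the hypothesis scores. The full GTC objective instead runs the forward-backward recursion directly over a merged WFST so hypotheses can share arcs; all function and variable names below are illustrative assumptions, not taken from the authors' implementation.
```python
import torch
import torch.nn.functional as F

def nbest_gtc_loss(log_probs, nbest, log_weights, input_length, blank=0):
    """Degenerate GTC: label graph = weighted union of N-best hypotheses.

    log_probs:   (T, C) log-softmax outputs of the student model for one utterance.
    nbest:       list of N pseudo-label sequences (token id lists, no blanks).
    log_weights: (N,) log-domain hypothesis scores (e.g., softmax-normalized).
    Returns -log sum_n exp(log_weights[n] + log P_CTC(nbest[n] | x)).
    """
    per_hyp_logp = []
    for y in nbest:
        nll = F.ctc_loss(
            log_probs.unsqueeze(1),                      # (T, 1, C): batch of one
            torch.tensor([y], dtype=torch.long),         # (1, L) target sequence
            input_lengths=torch.tensor([input_length]),
            target_lengths=torch.tensor([len(y)]),
            blank=blank, reduction="none", zero_infinity=True,
        )
        per_hyp_logp.append(-nll.squeeze(0))             # log P_CTC(y | x)
    # Marginalize over the N-best list: paths through any hypothesis count.
    return -torch.logsumexp(torch.stack(per_hyp_logp) + log_weights, dim=0)

# Toy usage: 50 frames, 6-symbol vocabulary (0 = blank), a 3-best list.
torch.manual_seed(0)
logits = torch.randn(50, 6, requires_grad=True)
nbest = [[1, 2, 3], [1, 2, 2, 3], [1, 4, 3]]
log_w = torch.log_softmax(torch.tensor([-1.0, -2.5, -3.0]), dim=0)
loss = nbest_gtc_loss(F.log_softmax(logits, dim=-1), nbest, log_w, input_length=50)
loss.backward()  # gradients flow back into the student model
```
Softmax-normalizing the hypothesis scores in the log domain is one plausible choice for the graph weights; per the abstract, the paper's WFST supervision likewise attaches the scores associated with the seed model's N-best list.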
Related papers
- Prompt-based Pseudo-labeling Strategy for Sample-Efficient Semi-Supervised Extractive Summarization [12.582774521907227]
Semi-supervised learning (SSL) is a widely used technique in scenarios where labeled data is scarce and unlabeled data is abundant.
Standard SSL methods follow a teacher-student paradigm: first train a classification model, then use the classifier's confidence values to select pseudo-labels.
We propose a prompt-based pseudo-labeling strategy with LLMs that picks unlabeled examples with more accurate pseudo-labels.
arXiv Detail & Related papers (2023-11-16T04:29:41Z)
- Neighbour Consistency Guided Pseudo-Label Refinement for Unsupervised Person Re-Identification [80.98291772215154]
Unsupervised person re-identification (ReID) aims at learning discriminative identity features for person retrieval without any annotations.
Recent advances accomplish this task by leveraging clustering-based pseudo labels.
We propose a Neighbour Consistency guided Pseudo Label Refinement framework.
arXiv Detail & Related papers (2022-11-30T09:39:57Z)
- Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised Text Recognition [21.583569162994277]
One of the most popular SSL approaches is pseudo-labeling (PL).
PL methods are severely degraded by noise and are prone to over-fitting to noisy labels.
We propose a pseudo-label generation and an uncertainty-based data selection framework for text recognition.
arXiv Detail & Related papers (2022-08-31T02:21:02Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose Dash, a semi-supervised learning (SSL) approach that selects which unlabeled examples to train on.
Dash is adaptive in its unlabeled data selection: an example is used only when its training loss falls below a threshold that is adjusted dynamically as training progresses.
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
- A Unified Generative Adversarial Network Training via Self-Labeling and Self-Attention [38.31735499785227]
We propose a novel GAN training scheme that can handle any level of labeling in a unified manner.
Our scheme introduces a form of artificial labeling that can incorporate manually defined labels, when available.
We evaluate our approach on CIFAR-10, STL-10 and SVHN, and show that both self-labeling and self-attention consistently improve the quality of generated data.
arXiv Detail & Related papers (2021-06-18T04:40:26Z)
- Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning [80.05441565830726]
This paper addresses imbalanced semi-supervised learning, where heavily biased pseudo-labels can harm the model performance.
We propose a general pseudo-labeling framework to address this bias.
We term the novel pseudo-labeling framework for imbalanced SSL as Distribution-Aware Semantics-Oriented (DASO) Pseudo-label.
arXiv Detail & Related papers (2021-06-10T11:58:25Z)
- Cycle Self-Training for Domain Adaptation [85.14659717421533]
Cycle Self-Training (CST) is a principled self-training algorithm that enforces pseudo-labels to generalize across domains.
On a theoretical hard case, CST recovers the target ground truth while both invariant feature learning and vanilla self-training fail.
Empirical results indicate that CST significantly improves over prior state-of-the-art methods on standard UDA benchmarks.
arXiv Detail & Related papers (2021-03-05T10:04:25Z)
- In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that, unlike consistency-regularization methods, does not rely on domain-specific data augmentations, but it performs relatively poorly in its original formulation.
We argue that PL underperforms due to erroneous high-confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework, which improves pseudo-labeling accuracy by drastically reducing the amount of noise encountered in the training process (a generic sketch of this selection step follows the list).
arXiv Detail & Related papers (2021-01-15T23:29:57Z)
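A recurring ingredient in the pseudo-labeling papers above (Seq-UPS, Dash, UPS) is a selection step that keeps only pseudo-labels the model is both confident and certain about. Below is a generic sketch of such a filter using MC-dropout as the uncertainty proxy, in the spirit of UPS; the thresholds, the placeholder model, and all names are illustrative assumptions rather than any paper's exact recipe.
```python
import torch

@torch.no_grad()
def select_pseudo_labels(model, x, n_samples=10, conf_min=0.9, var_max=0.05):
    """Keep examples with high mean confidence and low MC-dropout variance.

    x: (B, D) batch of unlabeled inputs.
    Returns (kept indices, pseudo-labels for the whole batch).
    """
    model.train()  # keep dropout active so repeated passes are stochastic
    probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(n_samples)])
    mean_p = probs.mean(dim=0)                    # (B, C) averaged prediction
    conf, labels = mean_p.max(dim=-1)             # confidence and pseudo-label
    var = probs.var(dim=0).gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    keep = (conf >= conf_min) & (var <= var_max)  # filter noisy pseudo-labels
    return keep.nonzero(as_tuple=True)[0], labels

# Toy usage with a placeholder classifier (16-dim inputs, 5 classes).
model = torch.nn.Sequential(
    torch.nn.Linear(16, 32), torch.nn.ReLU(),
    torch.nn.Dropout(0.3), torch.nn.Linear(32, 5),
)
x = torch.randn(8, 16)
idx, y_hat = select_pseudo_labels(model, x)
# Self-train only on (x[idx], y_hat[idx]); the rest stays unlabeled this round.
```
Dash-style dynamic thresholding would instead relax or tighten `conf_min` (or a loss threshold) over the course of training rather than keeping it fixed.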
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.