Boosting Active Learning for Speech Recognition with Noisy
Pseudo-labeled Samples
- URL: http://arxiv.org/abs/2006.11021v2
- Date: Thu, 5 Nov 2020 14:41:47 GMT
- Title: Boosting Active Learning for Speech Recognition with Noisy
Pseudo-labeled Samples
- Authors: Jihwan Bang, Heesu Kim, YoungJoon Yoo, Jung-Woo Ha
- Abstract summary: We present a new training pipeline that boosts the conventional active learning approach.
We show that the proposed training pipeline improves the efficacy of active learning approaches.
- Score: 14.472052505918045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The cost of annotating transcriptions for large speech corpora is a
bottleneck to fully exploiting the capacity of deep neural network-based
automatic speech recognition models. In this paper, we present a new training
pipeline that boosts the conventional active learning approach to
label-efficient learning and thereby addresses this problem. Existing active
learning methods focus only on selecting a set of informative samples under a
labeling budget. Going one step further, we show that training efficiency can
be improved by also utilizing the unlabeled samples that exceed the labeling
budget, through a carefully configured unsupervised loss that effectively
complements the supervised loss. We propose a new unsupervised loss based on
consistency regularization, and we configure appropriate augmentation
techniques for utterances so that consistency regularization can be adopted in
the automatic speech recognition task. Through qualitative and quantitative
experiments on a real-world dataset and under real-usage scenarios, we show
that the proposed training pipeline boosts the efficacy of active learning
approaches and thus substantially reduces human labeling cost.
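The abstract names consistency regularization and utterance augmentation but does not spell out the exact loss or augmentation configuration. The sketch below is a minimal, hypothetical illustration of how such a pipeline is commonly assembled for a CTC-based recognizer: labeled utterances selected by active learning contribute a supervised CTC loss, while unlabeled utterances beyond the budget contribute a consistency term that pushes predictions on a SpecAugment-style masked view toward confident pseudo-labels obtained from the clean view. The helper names, the model interface, the confidence threshold, and the weight `lambda_u` are illustrative assumptions, not details from the paper.

```python
# Minimal sketch (not the authors' exact recipe) of combining a supervised CTC
# loss on labeled utterances with a consistency-regularization loss on the
# unlabeled utterances that exceed the labeling budget.
import torch
import torch.nn.functional as F
import torchaudio


def spec_augment(features: torch.Tensor) -> torch.Tensor:
    """Apply simple time/frequency masking to a (batch, time, freq) tensor.

    Mask sizes are illustrative assumptions, not the paper's configuration.
    """
    masker = torch.nn.Sequential(
        torchaudio.transforms.FrequencyMasking(freq_mask_param=15),
        torchaudio.transforms.TimeMasking(time_mask_param=35),
    )
    # torchaudio masking transforms expect (..., freq, time)
    return masker(features.transpose(1, 2)).transpose(1, 2)


def training_step(model, labeled_batch, unlabeled_feats,
                  lambda_u=0.5, conf_threshold=0.9):
    # `model` is assumed to map (batch, time, freq) features to
    # (batch, time, vocab) logits.
    feats, feat_lens, targets, target_lens = labeled_batch

    # Supervised CTC loss on the labeled subset selected by active learning.
    log_probs = model(feats).log_softmax(dim=-1)           # (N, T, V)
    sup_loss = F.ctc_loss(log_probs.transpose(0, 1), targets,
                          feat_lens, target_lens, blank=0)

    # Consistency regularization on unlabeled utterances: pseudo-label the
    # clean input, then train the model to reproduce those labels on an
    # augmented view of the same utterance.
    with torch.no_grad():
        clean_probs = model(unlabeled_feats).softmax(dim=-1)
        confidence = clean_probs.max(dim=-1).values.mean(dim=1)  # per utterance
        pseudo = clean_probs.argmax(dim=-1)                      # frame labels

    aug_log_probs = model(spec_augment(unlabeled_feats)).log_softmax(dim=-1)
    frame_loss = F.nll_loss(aug_log_probs.transpose(1, 2), pseudo,
                            reduction="none").mean(dim=1)

    # Keep only confident pseudo-labels to limit the effect of noisy targets.
    mask = (confidence > conf_threshold).float()
    unsup_loss = (frame_loss * mask).sum() / mask.sum().clamp(min=1.0)

    return sup_loss + lambda_u * unsup_loss
```

In practice, `lambda_u` trades off trust in the noisy pseudo-labels against the supervised signal and is typically tuned on a held-out set; the confidence mask is one common way to keep low-quality pseudo-labels from dominating early training.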
Related papers
- Contrastive Augmentation: An Unsupervised Learning Approach for Keyword Spotting in Speech Technology [4.080686348274667]
We introduce a novel approach combining unsupervised contrastive learning with a unique augmentation-based technique.
Our method allows the neural network to train on unlabeled data sets, potentially improving performance in downstream tasks.
We present a speech augmentation-based unsupervised learning method that utilizes the similarity between the bottleneck layer feature and the audio reconstructing information.
arXiv Detail & Related papers (2024-08-31T05:40:37Z) - Feature Alignment: Rethinking Efficient Active Learning via Proxy in the
Context of Pre-trained Models [5.2976735459795385]
Fine-tuning the pre-trained model with active learning holds promise for reducing annotation costs.
Recent research has proposed proxy-based active learning, which pre-computes features to reduce computational costs.
This approach often incurs a significant loss in active learning performance, which may even outweigh the computational cost savings.
arXiv Detail & Related papers (2024-03-02T06:01:34Z) - An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models [55.01592097059969]
Supervised finetuning on instruction datasets has played a crucial role in achieving the remarkable zero-shot generalization capabilities of large language models.
Active learning is effective in identifying useful subsets of samples to annotate from an unlabeled pool.
We propose using experimental design to circumvent the computational bottlenecks of active learning.
arXiv Detail & Related papers (2024-01-12T16:56:54Z) - SURF: Semi-supervised Reward Learning with Data Augmentation for
Feedback-efficient Preference-based Reinforcement Learning [168.89470249446023]
We present SURF, a semi-supervised reward learning framework that utilizes a large amount of unlabeled samples with data augmentation.
In order to leverage unlabeled samples for reward learning, we infer pseudo-labels of the unlabeled samples based on the confidence of the preference predictor.
Our experiments demonstrate that our approach significantly improves the feedback-efficiency of the preference-based method on a variety of locomotion and robotic manipulation tasks.
arXiv Detail & Related papers (2022-03-18T16:50:38Z) - Distantly-Supervised Named Entity Recognition with Noise-Robust Learning
and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data.
We propose a noise-robust learning scheme comprised of a new loss function and a noisy label removal step.
Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z) - Reducing Label Effort: Self-Supervised meets Active Learning [32.4747118398236]
Recent developments in self-training have achieved very impressive results rivaling supervised learning on some datasets.
Our experiments reveal that self-training is remarkably more efficient than active learning at reducing the labeling effort.
The performance gap between active learning trained with self-training and active learning trained from scratch diminishes as we approach the point where almost half of the dataset is labeled.
arXiv Detail & Related papers (2021-08-25T20:04:44Z) - Improved Speech Emotion Recognition using Transfer Learning and
Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z) - Towards Reducing Labeling Cost in Deep Object Detection [61.010693873330446]
We propose a unified framework for active learning that considers both the uncertainty and the robustness of the detector.
Our method is able to pseudo-label the very confident predictions, suppressing a potential distribution drift.
arXiv Detail & Related papers (2021-06-22T16:53:09Z) - Cost-effective Variational Active Entity Resolution [4.238343046459798]
We devise an entity resolution method that builds on the robustness conferred by deep autoencoders to reduce human-involvement costs.
Specifically, we reduce the cost of training deep entity resolution models by performing unsupervised representation learning.
Finally, we reduce the cost of labelling training data through an active learning approach that builds on the properties conferred by the use of deep autoencoders.
arXiv Detail & Related papers (2020-11-20T13:47:11Z) - Semi-supervised Batch Active Learning via Bilevel Optimization [89.37476066973336]
We formulate our approach as a data summarization problem via bilevel optimization.
We show that our method is highly effective in keyword detection tasks in the regime where only a few labeled samples are available.
arXiv Detail & Related papers (2020-10-19T16:53:24Z)