Joint Unsupervised and Supervised Training for Automatic Speech
Recognition via Bilevel Optimization
- URL: http://arxiv.org/abs/2401.06980v1
- Date: Sat, 13 Jan 2024 05:01:47 GMT
- Title: Joint Unsupervised and Supervised Training for Automatic Speech
Recognition via Bilevel Optimization
- Authors: A F M Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury,
Tianyi Chen
- Abstract summary: We present a novel bilevel optimization-based approach to training acoustic models for automatic speech recognition (ASR) tasks, which we term bi-level joint unsupervised and supervised training (BL-JUST).
BL-JUST employs a lower-level optimization with an unsupervised loss and an upper-level optimization with a supervised loss, leveraging recent advances in penalty-based bilevel optimization to solve this challenging ASR problem with affordable complexity and rigorous convergence guarantees.
- Score: 73.98386682604122
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we present a novel bilevel optimization-based approach
to training acoustic models for automatic speech recognition (ASR) tasks that we
term bi-level joint unsupervised and supervised training (BL-JUST). BL-JUST
employs a lower-level optimization with an unsupervised loss and an upper-level
optimization with a supervised loss, leveraging recent advances in penalty-based
bilevel optimization to solve this challenging ASR problem with affordable
complexity and rigorous convergence guarantees. To evaluate BL-JUST, we conducted
extensive experiments on the LibriSpeech and TED-LIUM v2 datasets. BL-JUST
outperforms the commonly used strategy of pre-training followed by fine-tuning.
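The abstract describes BL-JUST only at this high level, so the following is a minimal PyTorch-style sketch of penalty-based joint training in that spirit, not the authors' implementation: the lower level descends an unsupervised loss on unlabeled audio, and the upper level descends the supervised loss plus a penalty term `gamma * unsup_loss` that couples the two levels. The loss callables, the shared optimizer, and the paired loaders are all assumptions.

```python
import torch


def bljust_epoch(model, optimizer, unlabeled_loader, labeled_loader,
                 unsup_loss, sup_loss, gamma=1.0, lower_steps=1):
    """One epoch of alternating lower-/upper-level updates (illustrative).

    unsup_loss(model, x)  -> scalar loss on unlabeled audio (e.g. contrastive)
    sup_loss(model, x, y) -> scalar loss on labeled audio (e.g. CTC)
    gamma                 -> penalty coefficient coupling the two levels
    """
    model.train()
    for x_unlab, (x_lab, y_lab) in zip(unlabeled_loader, labeled_loader):
        # Lower level: gradient step(s) on the unsupervised objective.
        for _ in range(lower_steps):
            optimizer.zero_grad()
            unsup_loss(model, x_unlab).backward()
            optimizer.step()

        # Upper level: supervised loss plus the penalized unsupervised loss,
        # the penalty-based surrogate for lower-level optimality.
        optimizer.zero_grad()
        loss = sup_loss(model, x_lab, y_lab) + gamma * unsup_loss(model, x_unlab)
        loss.backward()
        optimizer.step()
```

In penalty methods the coefficient `gamma` is typically increased over training so that lower-level optimality is enforced ever more tightly; that schedule, not shown here, is where the convergence analysis does its work.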
Related papers
- BiSSL: Bilevel Optimization for Self-Supervised Pre-Training and Fine-Tuning [12.749627564482282]
BiSSL is a first-of-its-kind training framework that introduces bilevel optimization to enhance the alignment between the pretext pre-training and downstream fine-tuning stages in self-supervised learning.
We propose a training algorithm that alternates between optimizing the two objectives defined in BiSSL.
arXiv Detail & Related papers (2024-10-03T11:07:43Z)
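The BiSSL summary above says only that training alternates between the two objectives; the coupling, step counts, and optimizers are unspecified there, so this loop is a generic illustration with assumed placeholders:

```python
def alternating_training(pretext_step, downstream_step,
                         rounds=5, lower_steps=1000, upper_steps=100):
    # Each *_step callable performs one optimizer update on its objective;
    # all step counts here are illustrative assumptions.
    for _ in range(rounds):
        for _ in range(lower_steps):
            pretext_step()      # lower level: self-supervised pretext loss
        for _ in range(upper_steps):
            downstream_step()   # upper level: downstream fine-tuning loss
```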
- A Primal-Dual-Assisted Penalty Approach to Bilevel Optimization with Coupled Constraints [66.61399765513383]
We develop a BLOCC algorithm to tackle BiLevel Optimization problems with Coupled Constraints.
We demonstrate its effectiveness on two well-known real-world applications.
arXiv Detail & Related papers (2024-06-14T15:59:36Z)
- Asynchronous Distributed Bilevel Optimization [20.074079852690048]
We propose Asynchronous Distributed Bilevel (ADBO) algorithm to tackle bilevel optimization problems.
The complexity of ADBO to obtain the $\epsilon$-stationary point is upper bounded by $\mathcal{O}(\frac{1}{\epsilon^2})$.
arXiv Detail & Related papers (2022-12-20T07:44:48Z)
- Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training [102.14558233502514]
Masked prediction pre-training has seen remarkable progress in self-supervised learning (SSL) for speech recognition.
We propose two supervision-guided codebook generation approaches to improve automatic speech recognition (ASR) performance.
arXiv Detail & Related papers (2022-06-21T06:08:30Z)
- Contextual Squeeze-and-Excitation for Efficient Few-Shot Image Classification [57.36281142038042]
We present a new adaptive block called Contextual Squeeze-and-Excitation (CaSE) that adjusts a pretrained neural network on a new task to significantly improve performance.
We also present a new training protocol based on Coordinate-Descent called UpperCaSE that exploits meta-trained CaSE blocks and fine-tuning routines for efficient adaptation.
arXiv Detail & Related papers (2022-06-20T15:25:08Z)
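For context on the CaSE entry above: a standard squeeze-and-excitation block gates channels with a pooled descriptor, as sketched below. CaSE additionally conditions this gating on task context from the support set, which the abstract does not detail, so only the generic SE mechanism is shown.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Standard squeeze-and-excitation channel gating (generic sketch)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                    # squeeze: (N, C, 1, 1)
            nn.Flatten(),                               # (N, C)
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                               # excitation in [0, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.gate(x)                                # per-channel gates
        return x * s[:, :, None, None]                  # rescale feature map
```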
- Value-Function-based Sequential Minimization for Bi-level Optimization [52.39882976848064]
Gradient-based Bi-Level Optimization (BLO) methods have been widely applied to handle modern learning tasks.
However, almost no gradient-based methods can solve BLO in challenging scenarios such as BLO with functional constraints or pessimistic BLO.
We provide Bi-level Value-Function-based Sequential Minimization (BVFSM) to address the above issues.
arXiv Detail & Related papers (2021-10-11T03:13:39Z)
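The value-function idea named in the BVFSM entry above is standard in the bilevel literature and can be stated compactly; the paper's exact variant (e.g. its handling of pessimistic BLO) may differ:

```latex
% Bilevel problem (left) and its value-function reformulation (right):
\min_{x}\, F\bigl(x, y^{*}(x)\bigr)\ \ \text{s.t.}\ \ y^{*}(x) \in \operatorname*{arg\,min}_{y} f(x, y)
\;\Longleftrightarrow\;
\min_{x,\,y}\, F(x, y)\ \ \text{s.t.}\ \ f(x, y) \le \varphi(x),
\quad \varphi(x) := \min_{y} f(x, y).
```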
- Semi-Supervised Object Detection with Adaptive Class-Rebalancing Self-Training [5.874575666947381]
This study delves into semi-supervised object detection to improve detector performance with additional unlabeled data.
We propose a novel two-stage filtering algorithm to generate accurate pseudo-labels.
Our method achieves satisfactory improvements on MS-COCO and VOC benchmarks.
arXiv Detail & Related papers (2021-07-11T12:14:42Z)
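The two-stage filter named above is not described in the summary; the snippet below is only a generic illustration of confidence-based pseudo-label filtering with per-class thresholds (the class-rebalancing flavor: rarer classes get lower cutoffs), under the assumption of softmax scores per detection:

```python
import torch


def filter_pseudo_labels(scores: torch.Tensor, thresholds: torch.Tensor):
    """scores: (N, C) softmax outputs; thresholds: (C,) per-class cutoffs.

    Returns the kept pseudo-labels and the indices of the detections kept.
    Lowering a rare class's threshold keeps more of its pseudo-labels,
    rebalancing the training distribution.
    """
    conf, labels = scores.max(dim=1)          # top-1 confidence and class
    keep = conf >= thresholds[labels]         # per-class confidence test
    return labels[keep], keep.nonzero(as_tuple=True)[0]
```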
- A Generic Descent Aggregation Framework for Gradient-based Bi-level Optimization [41.894281911990554]
We develop a novel Bi-level Descent Aggregation (BDA) framework for bi-level learning tasks.
BDA aggregates hierarchical objectives of both upper level and lower level.
We propose a new proof recipe to improve the convergence results of conventional gradient-based bi-level methods.
arXiv Detail & Related papers (2021-02-16T06:58:12Z)
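As a rough picture of the aggregation named in the BDA entry above: combine descent directions from both levels before updating. The convex weight `alpha` and the use of plain gradients are simplifying assumptions; the paper derives its aggregation scheme differently.

```python
import torch


def aggregated_step(params, upper_loss, lower_loss, lr=1e-2, alpha=0.5):
    # Aggregate upper- and lower-level descent directions (illustrative).
    g_upper = torch.autograd.grad(upper_loss, params, retain_graph=True)
    g_lower = torch.autograd.grad(lower_loss, params)
    with torch.no_grad():
        for p, gu, gl in zip(params, g_upper, g_lower):
            p -= lr * (alpha * gu + (1.0 - alpha) * gl)
```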
- Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function [106.69643619725652]
We develop a training strategy that allows even a simple BiLSTM model, when trained with cross-entropy loss, to achieve competitive results.
We report state-of-the-art results for text classification task on several benchmark datasets.
arXiv Detail & Related papers (2020-09-08T21:55:22Z)
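The last entry's point is that a plain BiLSTM trained with cross-entropy can be competitive; a minimal such classifier is sketched below. Dimensions and max-pooling are illustrative choices, and the paper's mixed objective function is not shown.

```python
import torch
import torch.nn as nn


class BiLSTMClassifier(nn.Module):
    """Minimal bidirectional-LSTM text classifier (generic sketch)."""

    def __init__(self, vocab_size=30000, embed_dim=300,
                 hidden_dim=512, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        h, _ = self.lstm(self.embed(tokens))   # (N, T, 2 * hidden_dim)
        return self.out(h.max(dim=1).values)   # max-pool over time steps

# Trained with ordinary cross-entropy:
#   loss = nn.CrossEntropyLoss()(model(token_ids), labels)
```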