DARTS-ASR: Differentiable Architecture Search for Multilingual Speech
Recognition and Adaptation
- URL: http://arxiv.org/abs/2005.07029v2
- Date: Sun, 26 Jul 2020 02:40:53 GMT
- Title: DARTS-ASR: Differentiable Architecture Search for Multilingual Speech
Recognition and Adaptation
- Authors: Yi-Chen Chen, Jui-Yang Hsu, Cheng-Kuang Lee, Hung-yi Lee
- Abstract summary: In this paper, we propose an ASR approach with efficient gradient-based architecture search, DARTS-ASR.
To examine the generalizability of DARTS-ASR, we apply our approach not only to monolingual ASR in many languages but also to a multilingual ASR setting.
- Score: 64.44349061520671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In previous works, only the parameter weights of ASR models are optimized under a
fixed-topology architecture. However, the design of successful model
architectures has always relied on human experience and intuition, and many
hyperparameters related to the model architecture need to be tuned manually.
Therefore, in this paper, we propose an ASR approach with efficient
gradient-based architecture search, DARTS-ASR. To examine the
generalizability of DARTS-ASR, we apply our approach not only to monolingual
ASR in many languages but also to a multilingual ASR setting. Following
previous works, we conducted experiments on a multilingual dataset, IARPA
BABEL. The experimental results show that our approach outperformed the baseline
fixed-topology architecture with 10.2% and 10.0% relative reductions in character
error rate under the monolingual and multilingual ASR settings, respectively.
Furthermore, we analyze the architectures searched by DARTS-ASR.
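At its core, DARTS-style gradient-based search relaxes the discrete choice among candidate operations on each edge of the searched cell into a softmax-weighted mixture, so the architecture parameters can be optimized by gradient descent jointly with the network weights; after search, the strongest operation on each edge is retained. The following PyTorch sketch illustrates only that mixed operation; the candidate-operation set, tensor layout, and names (MixedOp, candidate_ops, alpha) are illustrative assumptions, not the exact configuration used in DARTS-ASR.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def candidate_ops(channels):
    # Candidate operations for one edge of the searched cell.
    # The exact candidate set in DARTS-ASR may differ; this set is illustrative.
    return nn.ModuleList([
        nn.Identity(),                                # skip connection
        nn.Conv2d(channels, channels, 3, padding=1),  # 3x3 convolution
        nn.Conv2d(channels, channels, 5, padding=2),  # 5x5 convolution
        nn.MaxPool2d(3, stride=1, padding=1),         # 3x3 max pooling
    ])

class MixedOp(nn.Module):
    """Continuous relaxation of a categorical operation choice (DARTS-style)."""

    def __init__(self, channels):
        super().__init__()
        self.ops = candidate_ops(channels)
        # One architecture parameter (alpha) per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        # Softmax over alphas gives differentiable mixing weights, so the
        # alphas can be learned by gradient descent alongside the weights.
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# Usage: treat the mixed op like any layer; after search, keep the operation
# with the largest alpha on each edge to obtain the discrete architecture.
x = torch.randn(2, 16, 40, 100)   # (batch, channels, freq, time) acoustic features
mixed = MixedOp(channels=16)
print(mixed(x).shape)
```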
Related papers
- RARe: Retrieval Augmented Retrieval with In-Context Examples [40.963703726988946]
We introduce a simple approach to enable retrievers to use in-context examples.
RARe fine-tunes a pre-trained model with in-context examples whose query is semantically similar to the target query.
We find RARe exhibits stronger out-of-domain generalization compared to models using queries without in-context examples.
arXiv Detail & Related papers (2024-10-26T05:46:20Z)
- Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities [35.15674061731237]
This paper explores large-scale multilingual ASR models on 70 languages.
We show that our multilingual ASR generalizes well to an unseen dataset and domain, achieving 9.5% and 7.5% WER on Multilingual LibriSpeech (MLS) with zero-shot inference and fine-tuning, respectively.
arXiv Detail & Related papers (2022-11-10T18:43:42Z)
- ZARTS: On Zero-order Optimization for Neural Architecture Search [94.41017048659664]
Differentiable architecture search (DARTS) has been a popular one-shot paradigm for NAS due to its high efficiency.
This work turns to zero-order optimization and proposes a novel NAS scheme, called ZARTS, that searches without enforcing the gradient approximation adopted in DARTS.
In particular, results on 12 benchmarks verify the outstanding robustness of ZARTS in settings where the performance of DARTS collapses due to its known instability issue.
arXiv Detail & Related papers (2021-10-10T09:35:15Z)
- Rethinking Architecture Selection in Differentiable NAS [74.61723678821049]
Differentiable Neural Architecture Search is one of the most popular NAS methods for its search efficiency and simplicity.
We propose an alternative perturbation-based architecture selection that directly measures each operation's influence on the supernet.
We find that several failure modes of DARTS can be greatly alleviated with the proposed selection method.
arXiv Detail & Related papers (2021-08-10T00:53:39Z)
- AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models [46.69439585453071]
We adopt one-shot Neural Architecture Search (NAS) to automatically search architecture hyper-parameters.
Specifically, we design one-shot learning techniques and a search space that provide an adaptive and efficient way to develop tiny PLMs.
We name our method AutoTinyBERT and evaluate its effectiveness on the GLUE and SQuAD benchmarks.
arXiv Detail & Related papers (2021-07-29T00:47:30Z)
- AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z)
- BET: A Backtranslation Approach for Easy Data Augmentation in Transformer-based Paraphrase Identification Context [0.0]
We call this approach BET, with which we analyze backtranslation data augmentation on transformer-based architectures.
Our findings suggest that BET improves paraphrase identification performance on the Microsoft Research Paraphrase Corpus by more than 3% in both accuracy and F1 score.
arXiv Detail & Related papers (2020-09-25T22:06:06Z)
- AutoRC: Improving BERT Based Relation Classification Models via Architecture Search [50.349407334562045]
BERT-based relation classification (RC) models have achieved significant improvements over traditional deep learning models.
However, no consensus has been reached on the optimal architecture.
We design a comprehensive search space for BERT-based RC models and employ a neural architecture search (NAS) method to automatically discover the design choices.
arXiv Detail & Related papers (2020-09-22T16:55:49Z)
- Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models [56.268862325167575]
This paper presents an empirical study of conversational question reformulation (CQR) with sequence-to-sequence architectures and pretrained language models (PLMs).
We leverage PLMs to address the strong token-to-token independence assumption made in the common objective, maximum likelihood estimation, for the CQR task.
We evaluate fine-tuned PLMs on the recently introduced CANARD dataset as an in-domain task and validate the models using data from the TREC 2019 CAsT Track as an out-of-domain task.
arXiv Detail & Related papers (2020-04-04T11:07:54Z)