Building High-accuracy Multilingual ASR with Gated Language Experts and
Curriculum Training
- URL: http://arxiv.org/abs/2303.00786v2
- Date: Fri, 7 Jul 2023 23:48:04 GMT
- Title: Building High-accuracy Multilingual ASR with Gated Language Experts and
Curriculum Training
- Authors: Eric Sun, Jinyu Li, Yuxuan Hu, Yimeng Zhu, Long Zhou, Jian Xue,
Peidong Wang, Linquan Liu, Shujie Liu, Edward Lin, Yifan Gong
- Abstract summary: We propose gated language experts and curriculum training to enhance multilingual transformer transducer models.
Our method incorporates a gating mechanism and LID loss, enabling transformer experts to learn language-specific information.
- Score: 45.48362355283723
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose gated language experts and curriculum training to enhance
multilingual transformer transducer models without requiring language
identification (LID) input from users during inference. Our method incorporates
a gating mechanism and LID loss, enabling transformer experts to learn
language-specific information. By combining gated transformer experts with
shared transformer layers, we construct multilingual transformer blocks and
utilize linear experts to effectively regularize the joint network. The
curriculum training scheme leverages LID to guide the gated experts in
improving their respective language performance. Experimental results on a
bilingual task involving English and Spanish demonstrate significant
improvements, with average relative word error reductions of 12.5% and 7.3%
compared to the baseline bilingual and monolingual models, respectively.
Notably, our method achieves performance comparable to the upper-bound model
trained and inferred with oracle LID. Extending our approach to trilingual,
quadrilingual, and pentalingual models reveals similar advantages to those
observed in the bilingual models, highlighting its ease of extension to
multiple languages.
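As a rough illustration of the gated-expert idea described in the abstract, the following is a minimal PyTorch sketch: a shared transformer layer feeds per-language expert layers, a learned gate mixes their outputs, and the gate logits double as frame-level LID predictions for an auxiliary LID loss. All class, function, and parameter names here (GatedLanguageExpertBlock, lid_loss, oracle_lang, etc.) are illustrative assumptions; the paper's exact block layout, the linear experts in the joint network, and the precise curriculum schedule are not reproduced.

```python
# Illustrative sketch only, not the paper's implementation: one multilingual
# block with a shared transformer layer, per-language expert layers, and a
# learned gate whose logits also serve as frame-level LID predictions.
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedLanguageExpertBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int, n_langs: int):
        super().__init__()
        # Shared layer used by all languages.
        self.shared = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        # One expert transformer layer per language.
        self.experts = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_langs)
        )
        # Gating network producing per-frame language logits.
        self.gate = nn.Linear(d_model, n_langs)

    def forward(self, x: torch.Tensor, oracle_lang: Optional[torch.Tensor] = None):
        # x: (batch, time, d_model); oracle_lang: (batch,) language IDs or None.
        h = self.shared(x)
        lid_logits = self.gate(h)  # (B, T, n_langs)
        if oracle_lang is not None:
            # Curriculum-style stage (an assumption for illustration): one-hot
            # gates from ground-truth language IDs, so each expert sees mostly
            # its own language early in training.
            weights = F.one_hot(oracle_lang.long(), num_classes=len(self.experts)).float()
            weights = weights.unsqueeze(1).expand(-1, h.size(1), -1)
        else:
            # Inference / later training: learned soft gates, no user LID needed.
            weights = lid_logits.softmax(dim=-1)
        expert_out = torch.stack([e(h) for e in self.experts], dim=-1)  # (B, T, D, L)
        mixed = (expert_out * weights.unsqueeze(2)).sum(dim=-1)         # (B, T, D)
        return mixed, lid_logits


def lid_loss(lid_logits: torch.Tensor, lang_ids: torch.Tensor) -> torch.Tensor:
    """Auxiliary LID loss: frame-level cross-entropy against the
    utterance-level language label, added to the transducer loss."""
    B, T, L = lid_logits.shape
    targets = lang_ids.long().unsqueeze(1).expand(B, T)
    return F.cross_entropy(lid_logits.reshape(B * T, L), targets.reshape(B * T))
```

Because the gate is learned from the acoustics, no user-supplied LID is required at inference; the optional oracle_lang path only illustrates how a curriculum stage might use ground-truth LID to steer each expert toward its language.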
Related papers
- 1+1>2: Can Large Language Models Serve as Cross-Lingual Knowledge Aggregators? [46.43162333819418]
Large Language Models (LLMs) have garnered significant attention due to their remarkable ability to process information across various languages.
Despite their capabilities, they exhibit inconsistencies in handling identical queries in different languages, presenting challenges for further advancement.
This paper introduces a method to enhance the multilingual performance of LLMs by aggregating knowledge from diverse languages.
arXiv Detail & Related papers (2024-06-20T20:32:53Z)
- Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions? [42.37657013017192]
We show that instruction-tuning on parallel instead of monolingual corpora improves cross-lingual instruction-following capabilities by up to 9.9%.
We also conduct a human annotation study to understand the alignment between human-based and GPT-4-based evaluation within multilingual chat scenarios.
arXiv Detail & Related papers (2024-02-21T11:07:07Z)
- Improving In-context Learning of Multilingual Generative Language Models with Cross-lingual Alignment [42.624862172666624]
We propose a simple yet effective cross-lingual alignment framework exploiting pairs of translation sentences.
It aligns the internal sentence representations across different languages via multilingual contrastive learning.
Experimental results show that, even with less than 0.1‰ of the pre-training tokens, our alignment framework significantly boosts the cross-lingual abilities of generative language models.
arXiv Detail & Related papers (2023-11-14T11:24:08Z)
- Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions [68.01449013641532]
Large-scale Pretrained Language Models (LLMs) have shown strong abilities in multilingual translations.
We present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7B, to perform multilingual translation.
arXiv Detail & Related papers (2023-05-24T12:00:24Z)
- Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information [21.250763472985824]
Self-supervised multilingual speech representation learning has shown success in improving the performance of multilingual automatic speech recognition.
However, similar to supervised learning, multilingual pre-training may also suffer from language interference.
We introduce several techniques for improving self-supervised multilingual pre-training by leveraging auxiliary language information.
arXiv Detail & Related papers (2022-12-07T06:18:59Z)
- High-resource Language-specific Training for Multilingual Neural Machine Translation [109.31892935605192]
We propose a multilingual translation model with high-resource language-specific training (HLT-MT) to alleviate negative interference.
Specifically, we first train the multilingual model only with the high-resource pairs and select the language-specific modules at the top of the decoder.
HLT-MT is further trained on all available corpora to transfer knowledge from high-resource languages to low-resource languages.
arXiv Detail & Related papers (2022-07-11T14:33:13Z)
- Exploring Teacher-Student Learning Approach for Multi-lingual Speech-to-Intent Classification [73.5497360800395]
We develop an end-to-end system that supports multiple languages.
We exploit knowledge from a pre-trained multi-lingual natural language processing model.
arXiv Detail & Related papers (2021-09-28T04:43:11Z)
- Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z)
- Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
arXiv Detail & Related papers (2020-10-27T13:12:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.