A Configurable Multilingual Model is All You Need to Recognize All
Languages
- URL: http://arxiv.org/abs/2107.05876v1
- Date: Tue, 13 Jul 2021 06:52:41 GMT
- Title: A Configurable Multilingual Model is All You Need to Recognize All
Languages
- Authors: Long Zhou, Jinyu Li, Eric Sun, Shujie Liu
- Abstract summary: We propose a novel configurable multilingual model (CMM) which is trained only once but can be configured as different models based on users' choices.
CMM improves on the universal multilingual model by 26.0%, 16.9%, and 10.4% relative word error reduction when the user selects 1, 2, or 3 languages.
- Score: 52.274446882747455
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multilingual automatic speech recognition (ASR) models have shown great
promise in recent years because of the simplified model training and deployment
process. Conventional methods either train a universal multilingual model
without using any language information or use a one-hot language ID (LID)
vector to guide recognition of the target language. In practice, users can be
prompted to pre-select the languages they speak. A multilingual model without
LID cannot make good use of the language information set by the user, while a
multilingual model with LID can handle only one pre-selected language. In this
paper, we propose a novel configurable
multilingual model (CMM) which is trained only once but can be configured as
different models based on users' choices by extracting language-specific
modules together with a universal model from the trained CMM. Particularly, a
single CMM can be deployed to any user scenario where the users can pre-select
any combination of languages. Trained with 75K hours of transcribed anonymized
Microsoft multilingual data and evaluated with 10-language test sets, the
proposed CMM improves on the universal multilingual model by 26.0%, 16.9%,
and 10.4% relative word error reduction when the user selects 1, 2, or 3
languages, respectively. CMM also performs significantly better on
code-switching test sets.
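To make the configuration mechanism concrete, the sketch below pairs a shared universal transform with a bank of language-specific modules, of which only the user-selected ones are active. This is a minimal PyTorch illustration under assumed layer shapes and a simple weighted-sum combination, not the paper's exact architecture; the per-language weights stand in for whatever language posterior a deployed system would supply.

```python
import torch
import torch.nn as nn

class ConfigurableMultilingualLayer(nn.Module):
    """One layer of a CMM-style model: a universal transform shared by all
    languages plus a bank of language-specific modules. At deployment time,
    only the modules for the user-selected languages need to be extracted."""

    def __init__(self, dim: int, num_languages: int):
        super().__init__()
        self.universal = nn.Linear(dim, dim)  # always active, shared path
        self.lang_modules = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(num_languages)]
        )

    def forward(self, x, selected, weights):
        out = self.universal(x)
        # Add only the pre-selected language paths, weighted by an assumed
        # per-utterance language posterior (hypothetical, for illustration).
        for w, lang_id in zip(weights, selected):
            out = out + w * self.lang_modules[lang_id](x)
        return torch.relu(out)

# Example: the user pre-selects languages 2 and 5 out of 10.
layer = ConfigurableMultilingualLayer(dim=256, num_languages=10)
x = torch.randn(4, 256)
weights = torch.softmax(torch.tensor([0.8, 0.3]), dim=0)
print(layer(x, selected=[2, 5], weights=weights).shape)  # torch.Size([4, 256])
```

Since the unselected modules never contribute, they can simply be dropped when extracting a deployed model, which is what lets one trained CMM serve any combination of pre-selected languages.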
Related papers
- Streaming Bilingual End-to-End ASR model using Attention over Multiple
Softmax [6.386371634323785]
We propose a novel bilingual end-to-end (E2E) modeling approach, where a single neural model can recognize both languages.
The proposed model has shared encoder and prediction networks, with language-specific joint networks that are combined via a self-attention mechanism.
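A minimal sketch of this combination, assuming a small learned scoring layer as the attention over the two language-specific joint networks (the layer sizes and scoring function are illustrative assumptions, not the paper's exact design):

```python
import torch
import torch.nn as nn

class AttentiveBilingualJoint(nn.Module):
    """Two language-specific joint networks over shared features, combined by
    attention weights predicted from the same features (hedged sketch)."""

    def __init__(self, feat_dim: int, vocab_size: int):
        super().__init__()
        self.joint_l1 = nn.Linear(feat_dim, vocab_size)  # language-1 branch
        self.joint_l2 = nn.Linear(feat_dim, vocab_size)  # language-2 branch
        self.score = nn.Linear(feat_dim, 2)              # attention over branches

    def forward(self, h):
        # h: shared encoder/prediction features, shape (batch, feat_dim)
        logits = torch.stack([self.joint_l1(h), self.joint_l2(h)], dim=1)  # (B, 2, V)
        attn = torch.softmax(self.score(h), dim=-1).unsqueeze(-1)          # (B, 2, 1)
        return (attn * logits).sum(dim=1)                                  # (B, V)

joint = AttentiveBilingualJoint(feat_dim=320, vocab_size=500)
print(joint(torch.randn(8, 320)).shape)  # torch.Size([8, 500])
```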
arXiv Detail & Related papers (2024-01-22T01:44:42Z)
- Adapting the adapters for code-switching in multilingual ASR [10.316724084739892]
Large pre-trained multilingual speech models have shown potential in scaling Automatic Speech Recognition to many low-resource languages.
Some of these models employ language adapters in their formulation, which helps to improve monolingual performance.
This formulation restricts the usability of these models on code-switched speech, where two languages are mixed together in the same utterance.
We propose ways to effectively fine-tune such models on code-switched speech, by assimilating information from both language adapters at each language adaptation point in the network.
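One way to picture assimilating information from both language adapters is a residual block that mixes two bottleneck adapters with a learned gate. The sketch below is an assumed illustration with hypothetical adapter shapes, not the paper's exact fusion rule:

```python
import torch
import torch.nn as nn

class FusedAdapters(nn.Module):
    """Mix two monolingual bottleneck adapters at one adaptation point so
    that code-switched speech can draw on both languages (hedged sketch)."""

    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        def make_adapter():
            return nn.Sequential(nn.Linear(dim, bottleneck), nn.ReLU(),
                                 nn.Linear(bottleneck, dim))
        self.adapter_a = make_adapter()  # e.g. pre-trained for language A
        self.adapter_b = make_adapter()  # e.g. pre-trained for language B
        self.gate = nn.Parameter(torch.zeros(1))  # learned mixing weight

    def forward(self, x):
        g = torch.sigmoid(self.gate)
        # Residual connection plus a convex combination of the two branches.
        return x + g * self.adapter_a(x) + (1 - g) * self.adapter_b(x)

block = FusedAdapters(dim=768)
print(block(torch.randn(2, 10, 768)).shape)  # torch.Size([2, 10, 768])
```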
arXiv Detail & Related papers (2023-10-11T12:15:24Z)
- PolyLM: An Open Source Polyglot Large Language Model [57.64420154135178]
We present PolyLM, a multilingual large language model (LLM) trained on 640 billion (B) tokens, available in two model sizes: 1.7B and 13B.
To enhance its multilingual capabilities, we 1) integrate bilingual data into training data; and 2) adopt a curriculum learning strategy that increases the proportion of non-English data from 30% in the first stage to 60% in the final stage during pre-training.
Further, we propose a multilingual self-instruct method which automatically generates 132.7K diverse multilingual instructions for model fine-tuning.
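The data curriculum can be sketched as a schedule on the non-English sampling proportion. The abstract states only the 30% and 60% endpoints, so the linear ramp below is an assumption:

```python
import random

def non_english_ratio(progress: float, start: float = 0.30, end: float = 0.60) -> float:
    """Proportion of non-English data at a given point of pre-training,
    rising from `start` to `end`; the linear ramp is assumed."""
    progress = min(max(progress, 0.0), 1.0)
    return start + (end - start) * progress

def sample_language_bucket(progress: float) -> str:
    """Pick the data bucket for the next example according to the schedule."""
    return "non-english" if random.random() < non_english_ratio(progress) else "english"

for p in (0.0, 0.5, 1.0):
    print(p, round(non_english_ratio(p), 2))  # 0.3, 0.45, 0.6
```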
arXiv Detail & Related papers (2023-07-12T09:00:37Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Cross-Lingual Text Classification with Multilingual Distillation and
Zero-Shot-Aware Training [21.934439663979663]
We propose a multi-branch multilingual language model (MBLM) built on multilingual pre-trained language models (MPLMs).
The method is based on transferring knowledge from high-performance monolingual models with a teacher-student framework.
Results on two cross-lingual classification tasks show that, with only the task's supervised data used, our method improves both the supervised and zero-shot performance of MPLMs.
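The teacher-student transfer can be illustrated with the standard knowledge-distillation objective below; the temperature and mixing weight are assumed values, and the paper's exact loss may differ:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 2.0, alpha: float = 0.5):
    """Soft targets from a high-performance monolingual teacher mixed with
    the usual supervised cross-entropy (hedged sketch of the framework)."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

loss = distillation_loss(torch.randn(16, 3), torch.randn(16, 3),
                         torch.randint(0, 3, (16,)))
print(loss.item())
```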
arXiv Detail & Related papers (2022-02-28T09:51:32Z)
- Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
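One data-efficient adaptation idea in this vein is to grow the embedding table for tokens of an unseen script while leaving the pretrained weights in place. The mean initialization and the gradient mask below are common heuristics assumed for illustration, not the paper's specific method:

```python
import torch
import torch.nn as nn

def extend_for_new_script(embedding: nn.Embedding, num_new_tokens: int) -> nn.Embedding:
    """Return an embedding table with extra rows for new-script tokens.
    Old rows are copied and shielded from gradient updates; new rows are
    initialized to the mean of the pretrained embeddings (assumed heuristic)."""
    old_vocab, dim = embedding.weight.shape
    extended = nn.Embedding(old_vocab + num_new_tokens, dim)
    with torch.no_grad():
        extended.weight[:old_vocab] = embedding.weight
        extended.weight[old_vocab:] = embedding.weight.mean(dim=0)
    # Mask gradients so that only the new-script rows are trained.
    mask = torch.zeros(old_vocab + num_new_tokens, 1)
    mask[old_vocab:] = 1.0
    extended.weight.register_hook(lambda grad: grad * mask)
    return extended

emb = extend_for_new_script(nn.Embedding(1000, 64), num_new_tokens=120)
print(emb.weight.shape)  # torch.Size([1120, 64])
```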
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- GLUECoS: An Evaluation Benchmark for Code-Switched NLP [17.066725832825423]
We present an evaluation benchmark, GLUECoS, for code-switched languages.
We present results on several NLP tasks in English-Hindi and English-Spanish.
We fine-tune multilingual models on artificially generated code-switched data.
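As an assumed illustration of artificially generated code-switched data, the sketch below swaps words through a hypothetical bilingual lexicon; the benchmark's actual generation procedure is more involved:

```python
import random

def synth_code_switch(tokens, lexicon, p: float = 0.3):
    """Replace each word with its lexicon translation with probability p.
    A naive substitution scheme, assumed purely for illustration."""
    return [lexicon[w] if w in lexicon and random.random() < p else w
            for w in tokens]

# Hypothetical English-Spanish lexicon entries.
lexicon = {"house": "casa", "green": "verde"}
print(synth_code_switch("the green house is big".split(), lexicon, p=0.9))
```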
arXiv Detail & Related papers (2020-04-26T13:28:34Z)
- Learning to Scale Multilingual Representations for Vision-Language Tasks [51.27839182889422]
The effectiveness of the proposed scalable multilingual aligned language representation (SMALR) is demonstrated with ten diverse languages, over twice the number supported in vision-language tasks to date.
We evaluate on multilingual image-sentence retrieval and outperform prior work by 3-4% while using less than one-fifth of the training parameters of other word embedding methods.
arXiv Detail & Related papers (2020-04-09T01:03:44Z)