Multilingual Byte2Speech Text-To-Speech Models Are Few-shot Spoken
Language Learners
- URL: http://arxiv.org/abs/2103.03541v1
- Date: Fri, 5 Mar 2021 08:41:45 GMT
- Title: Multilingual Byte2Speech Text-To-Speech Models Are Few-shot Spoken
Language Learners
- Authors: Mutian He, Jingzhou Yang, Lei He
- Abstract summary: We present a multilingual end-to-end Text-To-Speech framework that maps byte inputs to spectrograms, thus allowing arbitrary input scripts.
The framework demonstrates the ability to adapt to various new languages under extreme low-resource scenarios.
We propose a novel method to extract language-specific sub-networks for a better understanding of the mechanism of multilingual models.
- Score: 11.190877290770047
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a multilingual end-to-end Text-To-Speech framework that maps byte
inputs to spectrograms, thus allowing arbitrary input scripts. Besides strong
results on 40+ languages, the framework demonstrates the ability to adapt to
various new languages under extreme low-resource and even few-shot scenarios of
merely 40 seconds of transcribed recordings, without the need for a lexicon, extra
corpora, auxiliary models, or particular linguistic expertise, while retaining
satisfactory intelligibility and naturalness that match rich-resource models.
Exhaustive comparative studies are performed to reveal the potential of the
framework for low-resource applications and the impact of various factors
contributing to adaptation. Furthermore, we propose a novel method to extract
language-specific sub-networks for a better understanding of the mechanism of
multilingual models.
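To illustrate the byte-input idea above, the following is a minimal sketch (in Python, not the paper's code) of how text in any script can be mapped onto a fixed 256-symbol vocabulary of UTF-8 byte IDs; a byte-to-spectrogram model would embed such IDs and predict spectrogram frames from them. The function names here are hypothetical and only demonstrate the script-agnostic input encoding, not the framework's architecture or adaptation procedure.

    # Minimal sketch: UTF-8 byte IDs as the input representation for a
    # byte-level TTS front end. Any script maps into the same 256-symbol
    # vocabulary, so no per-language lexicon or grapheme inventory is needed.

    def text_to_byte_ids(text: str) -> list[int]:
        """Encode text as UTF-8 byte IDs in the range 0-255."""
        return list(text.encode("utf-8"))

    def byte_ids_to_text(ids: list[int]) -> str:
        """Invert the encoding, e.g. for inspection or debugging."""
        return bytes(ids).decode("utf-8")

    if __name__ == "__main__":
        for sample in ["hello", "こんにちは", "नमस्ते"]:
            ids = text_to_byte_ids(sample)
            # A downstream byte-to-spectrogram model would consume `ids`
            # directly; here we only show the shared vocabulary at work.
            print(sample, "->", len(ids), "byte IDs:", ids[:8])

Note that texts in different scripts differ only in which byte IDs they use and how many bytes each character occupies; the model's input vocabulary never changes, which is what allows adaptation to unseen languages and scripts without modifying the front end.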
Related papers
- Zero-shot Sentiment Analysis in Low-Resource Languages Using a Multilingual Sentiment Lexicon [78.12363425794214]
We focus on zero-shot sentiment analysis tasks across 34 languages, including 6 high/medium-resource languages, 25 low-resource languages, and 3 code-switching datasets.
We demonstrate that pretraining using multilingual lexicons, without using any sentence-level sentiment data, achieves superior zero-shot performance compared to models fine-tuned on English sentiment datasets.
arXiv Detail & Related papers (2024-02-03T10:41:05Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Tokenization Impacts Multilingual Language Modeling: Assessing Vocabulary Allocation and Overlap Across Languages [3.716965622352967]
We propose new criteria to evaluate the quality of lexical representation and vocabulary overlap observed in sub-word tokenizers.
Our findings show that vocabulary overlap across languages can actually be detrimental to certain downstream tasks.
arXiv Detail & Related papers (2023-05-26T18:06:49Z)
- Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity [81.51206991542242]
Cross-lingual transfer offers a compelling way to help bridge the digital divide between high- and low-resource languages.
Current cross-lingual algorithms have shown success in text-based tasks and speech-related tasks over some low-resource languages.
We propose a language similarity approach that can efficiently identify acoustic cross-lingual transfer pairs across hundreds of languages.
arXiv Detail & Related papers (2021-11-02T01:55:17Z)
- Exploring Teacher-Student Learning Approach for Multi-lingual Speech-to-Intent Classification [73.5497360800395]
We develop an end-to-end system that supports multiple languages.
We exploit knowledge from a pre-trained multi-lingual natural language processing model.
arXiv Detail & Related papers (2021-09-28T04:43:11Z)
- Specializing Multilingual Language Models: An Empirical Study [50.7526245872855]
Contextualized word representations from pretrained multilingual language models have become the de facto standard for addressing natural language tasks.
For languages rarely or never seen by these models, directly using such models often results in suboptimal representation or use of data.
arXiv Detail & Related papers (2021-06-16T18:13:55Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)