Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis
- URL: http://arxiv.org/abs/2110.04482v1
- Date: Sat, 9 Oct 2021 07:00:38 GMT
- Title: Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis
- Authors: Mu Yang, Shaojin Ding, Tianlong Chen, Tong Wang, Zhangyang Wang
- Abstract summary: This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system.
It does not require pooling data from all languages at once, and thus alleviates the storage and computation burden.
- Score: 87.75833205560406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents a lifelong learning approach to train a multilingual
Text-To-Speech (TTS) system, in which each language is treated as an individual task
and learned sequentially and continually. The approach does not require pooling data
from all languages at once, and thus alleviates the storage and computation burden.
One of the challenges of lifelong learning methods is "catastrophic forgetting": in
the TTS scenario, it means that model performance quickly degrades on previous
languages when the model is adapted to a new language. We approach this problem
via a data-replay-based lifelong learning method. We formulate the replay
process as a supervised learning problem, and propose a simple yet effective
dual-sampler framework to tackle the heavily language-imbalanced training
samples. Through objective and subjective evaluations, we show that this
supervised learning formulation outperforms other gradient-based and
regularization-based lifelong learning methods, achieving a 43% reduction in
Mel-Cepstral Distortion (MCD) compared to a fine-tuning baseline.
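The replay and dual-sampler ideas above can be made concrete with a small sketch. The class below is an illustrative assumption of how such batch construction could look, not the paper's exact design: one sampler draws from the current language's corpus, while a second sampler draws from a replay buffer holding a subset of utterances from previously learned languages, picking a language uniformly first so that the heavily imbalanced sample counts do not let the new language dominate each batch.

    import random
    from typing import Any, Dict, List

    class DualSamplerReplay:
        """Toy dual-sampler for replay-based lifelong TTS training (illustrative)."""

        def __init__(self, replay_buffer: Dict[str, List[Any]], replay_ratio: float = 0.5):
            self.replay_buffer = replay_buffer  # {language: [utterances]}
            self.replay_ratio = replay_ratio    # fraction of each batch drawn from replay

        def sample_batch(self, current_data: List[Any], batch_size: int) -> List[Any]:
            n_replay = int(batch_size * self.replay_ratio) if self.replay_buffer else 0
            n_current = batch_size - n_replay

            # Sampler 1: utterances of the language currently being learned.
            batch = random.sample(current_data, k=min(n_current, len(current_data)))

            # Sampler 2: replay of earlier languages; choose a language uniformly
            # first, then an utterance, so small replay sets are not drowned out.
            for _ in range(n_replay):
                lang = random.choice(list(self.replay_buffer))
                batch.append(random.choice(self.replay_buffer[lang]))

            random.shuffle(batch)
            return batch

For reference, the Mel-Cepstral Distortion figure quoted above is conventionally computed per frame as (10 / ln 10) * sqrt(2 * sum_d (c_d - c_hat_d)^2) over the mel-cepstral coefficients (excluding the energy term c0) and averaged over time-aligned frames; a minimal helper, assuming the frames are already aligned:

    import math
    from typing import List

    def mel_cepstral_distortion(ref_frames: List[List[float]], syn_frames: List[List[float]]) -> float:
        """Average MCD in dB over time-aligned frames (c0 excluded from each frame)."""
        scale = (10.0 / math.log(10.0)) * math.sqrt(2.0)
        per_frame = [
            scale * math.sqrt(sum((r - s) ** 2 for r, s in zip(ref, syn)))
            for ref, syn in zip(ref_frames, syn_frames)
        ]
        return sum(per_frame) / len(per_frame)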
Related papers
- No Train but Gain: Language Arithmetic for training-free Language Adapters enhancement [59.37775534633868]
We introduce a novel method called language arithmetic, which enables training-free post-processing.
The effectiveness of the proposed solution is demonstrated on three downstream tasks in a MAD-X-based set of cross-lingual schemes.
arXiv Detail & Related papers (2024-04-24T08:52:40Z) - Scalable Language Model with Generalized Continual Learning [58.700439919096155]
The Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z) - Overcoming Catastrophic Forgetting in Massively Multilingual Continual
Learning [34.034825754625935]
We study catastrophic forgetting, as well as methods to minimize this, in a massively multilingual continual learning framework involving up to 51 languages.
We present LR ADJUST, a learning rate scheduling method that is simple, yet effective in preserving new information without strongly overwriting past knowledge.
arXiv Detail & Related papers (2023-05-25T17:06:34Z) - Improving Temporal Generalization of Pre-trained Language Models with
Lexical Semantic Change [28.106524698188675]
Recent research has revealed that neural language models at scale suffer from poor temporal generalization capability.
We propose a simple yet effective lexical-level masking strategy to post-train a converged language model.
arXiv Detail & Related papers (2022-10-31T08:12:41Z) - Few-Shot Cross-Lingual TTS Using Transferable Phoneme Embedding [55.989376102986654]
This paper studies a transferable phoneme embedding framework that aims to deal with the cross-lingual text-to-speech problem under the few-shot setting.
We propose a framework that consists of a phoneme-based TTS model and a codebook module to project phonemes from different languages into a learned latent space.
arXiv Detail & Related papers (2022-06-27T11:24:40Z) - Learning Natural Language Generation from Scratch [25.984828046001013]
This paper introduces TRUncated ReinForcement Learning for Language (TrufLL), an original approach to training conditional language models from scratch using only reinforcement learning (RL).
arXiv Detail & Related papers (2021-09-20T08:46:51Z) - Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model achieves improvements of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 points over state-of-the-art results.
arXiv Detail & Related papers (2020-10-18T00:21:53Z) - Meta-Learning with Sparse Experience Replay for Lifelong Language
Learning [26.296412053816233]
We propose a novel approach to lifelong learning of language tasks based on meta-learning with sparse experience replay.
We show that under the realistic setting of performing a single pass on a stream of tasks, our method obtains state-of-the-art results on lifelong text classification and relation extraction.
arXiv Detail & Related papers (2020-09-10T14:36:38Z) - Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models
via Continual Learning [74.25168207651376]
Fine-tuning pre-trained language models to downstream cross-lingual tasks has shown promising results.
We leverage continual learning to preserve the cross-lingual ability of the pre-trained model when fine-tuning it on downstream tasks.
Our methods achieve better performance than other fine-tuning baselines on the zero-shot cross-lingual part-of-speech tagging and named entity recognition tasks.
arXiv Detail & Related papers (2020-04-29T14:07:18Z)