Scheduled Multi-task Learning for Neural Chat Translation
- URL: http://arxiv.org/abs/2205.03766v2
- Date: Tue, 10 May 2022 09:49:54 GMT
- Title: Scheduled Multi-task Learning for Neural Chat Translation
- Authors: Yunlong Liang, Fandong Meng, Jinan Xu, Yufeng Chen and Jie Zhou
- Abstract summary: We propose a scheduled multi-task learning framework for Neural Chat Translation (NCT).
Specifically, we devise a three-stage training framework to incorporate the large-scale in-domain chat translation data into training.
Extensive experiments in four language directions verify the effectiveness and superiority of the proposed approach.
- Score: 66.81525961469494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Chat Translation (NCT) aims to translate conversational text into
different languages. Existing methods mainly focus on modeling the bilingual
dialogue characteristics (e.g., coherence) to improve chat translation via
multi-task learning on small-scale chat translation data. Although NCT
models have achieved impressive success, their performance is still far from
satisfactory due to insufficient chat translation data and simple joint
training strategies. To
address the above issues, we propose a scheduled multi-task learning framework
for NCT. Specifically, we devise a three-stage training framework to
incorporate the large-scale in-domain chat translation data into training by
adding a second pre-training stage between the original pre-training and
fine-tuning stages. Further, we investigate where and how to schedule the
dialogue-related auxiliary tasks in multiple training stages to effectively
enhance the main chat translation task. Extensive experiments in four language
directions (English↔Chinese and English↔German) verify the effectiveness and
superiority of the proposed approach. Additionally, we have made the
large-scale in-domain paired bilingual dialogue dataset publicly available to
the research community.
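To make the training recipe concrete, the sketch below is a minimal, hypothetical Python illustration of such a scheduled three-stage setup: a general pre-training stage, a second pre-training stage on large-scale in-domain chat translation data, and a fine-tuning stage in which dialogue-related auxiliary losses are switched on. The run_stage helper, stage names, auxiliary task names, and loss weights are assumptions for illustration, not the authors' implementation.

# Hypothetical sketch of a scheduled three-stage multi-task training loop
# (not the authors' released code). Stage names, auxiliary task names, and
# loss weights below are illustrative assumptions only.

from typing import Callable, Dict, Iterable

LossFn = Callable[[object, object], float]  # (model, batch) -> scalar loss


def run_stage(model: object,
              batches: Iterable[object],
              losses: Dict[str, LossFn],
              weights: Dict[str, float],
              update: Callable[[float], None]) -> None:
    """Run one training stage: combine the main chat translation loss with
    whichever dialogue-related auxiliary losses are scheduled (weight > 0)."""
    for batch in batches:
        total = 0.0
        for name, loss_fn in losses.items():
            w = weights.get(name, 0.0)
            if w > 0.0:
                total += w * loss_fn(model, batch)
        update(total)  # optimizer step, abstracted away in this sketch


# Illustrative schedule: which tasks are active in each of the three stages.
SCHEDULE = {
    "stage1_general_pretraining": {"translation": 1.0},
    "stage2_in_domain_pretraining": {"translation": 1.0,
                                     "utterance_discrimination": 0.5},
    "stage3_chat_finetuning": {"translation": 1.0,
                               "coherence": 0.3,
                               "response_generation": 0.3},
}

The point of the sketch is only the control flow: each stage reuses the same loop but activates a different subset of auxiliary losses, which is where the "where and how to schedule" question from the abstract lives.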
Related papers
- ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation [79.66359274050885]
We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models.
Our approach has demonstrated effectiveness in end-to-end speech-to-text translation tasks.
arXiv Detail & Related papers (2023-05-24T07:42:15Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel and large-scale multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- A Multi-task Multi-stage Transitional Training Framework for Neural Chat Translation [84.59697583372888]
Neural chat translation (NCT) aims to translate a cross-lingual chat between speakers of different languages.
Existing context-aware NMT models cannot achieve satisfactory performances due to limited resources of annotated bilingual dialogues.
We propose a multi-task multi-stage transitional (MMT) training framework, where an NCT model is trained using the bilingual chat translation dataset and additional monolingual dialogues.
arXiv Detail & Related papers (2023-01-27T14:41:16Z)
- Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation [80.16548523140025]
We extend the vanilla pretrain-finetune pipeline with an extra code-switching restore task to bridge the gap between the pre-training and fine-tuning stages.
Our approach could narrow the cross-lingual sentence representation distance and improve low-frequency word translation with trivial computational cost.
arXiv Detail & Related papers (2022-04-16T16:08:38Z)
- Cross-lingual Intermediate Fine-tuning improves Dialogue State Tracking [84.50302759362698]
We enhance the transfer learning process by intermediate fine-tuning of pretrained multilingual models.
We use parallel and conversational movie subtitles datasets to design cross-lingual intermediate tasks.
We achieve impressive improvements (> 20% on goal accuracy) on the parallel MultiWoZ dataset and Multilingual WoZ dataset.
arXiv Detail & Related papers (2021-09-28T11:22:38Z)
- An Empirical Study of Cross-Lingual Transferability in Generative Dialogue State Tracker [33.2309643963072]
We study the transferability of a cross-lingual generative dialogue state tracking system using a multilingual pre-trained seq2seq model.
We also find low cross-lingual transferability of our approaches and provide investigation and discussion.
arXiv Detail & Related papers (2021-01-27T12:45:55Z)
- Multi-task Learning for Multilingual Neural Machine Translation [32.81785430242313]
We propose a multi-task learning framework that jointly trains the model with the translation task on bitext data and two denoising tasks on the monolingual data.
We show that the proposed approach can effectively improve the translation quality for both high-resource and low-resource languages.
arXiv Detail & Related papers (2020-10-06T06:54:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.