Modeling Bilingual Conversational Characteristics for Neural Chat Translation
- URL: http://arxiv.org/abs/2107.11164v1
- Date: Fri, 23 Jul 2021 12:23:34 GMT
- Title: Modeling Bilingual Conversational Characteristics for Neural Chat Translation
- Authors: Yunlong Liang, Fandong Meng, Yufeng Chen, Jinan Xu and Jie Zhou
- Abstract summary: We aim to promote the translation quality of conversational text by modeling the above properties.
We evaluate our approach on the benchmark dataset BConTrasT (English-German) and a self-collected bilingual dialogue corpus, named BMELD (English-Chinese).
Our approach notably boosts the performance over strong baselines by a large margin and significantly surpasses some state-of-the-art context-aware NMT models in terms of BLEU and TER.
- Score: 24.94474722693084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural chat translation aims to translate bilingual conversational text,
which has a broad application in international exchanges and cooperation.
Despite the impressive performance of sentence-level and context-aware Neural
Machine Translation (NMT), there still remain challenges to translate bilingual
conversational text due to its inherent characteristics such as role
preference, dialogue coherence, and translation consistency. In this paper, we
aim to promote the translation quality of conversational text by modeling the
above properties. Specifically, we design three latent variational modules to
learn the distributions of bilingual conversational characteristics. Through
sampling from these learned distributions, the latent variables, tailored for
role preference, dialogue coherence, and translation consistency, are
incorporated into the NMT model for better translation. We evaluate our
approach on the benchmark dataset BConTrasT (English-German) and a
self-collected bilingual dialogue corpus, named BMELD (English-Chinese).
Extensive experiments show that our approach notably boosts the performance
over strong baselines by a large margin and significantly surpasses some
state-of-the-art context-aware NMT models in terms of BLEU and TER.
Additionally, we make the BMELD dataset publicly available for the research
community.
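The abstract describes sampling latent variables from learned distributions (one per conversational characteristic) and feeding them into the NMT model. The following is a minimal, framework-free sketch of that sampling step using the standard reparameterization trick; the characteristic names, dimensions, and posterior parameters are illustrative assumptions, not values from the paper.

```python
import math
import random

def sample_latent(mu, log_var, rng=random.Random(0)):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).

    Writing the sample this way keeps it differentiable with respect to
    mu and log_var in a real variational module."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

# Hypothetical posteriors (mu, log_var) for the three characteristics the
# paper models: role preference, dialogue coherence, translation consistency.
posteriors = {
    "role_preference":         ([0.1, -0.2], [-1.0, -1.0]),
    "dialogue_coherence":      ([0.0,  0.3], [-2.0, -2.0]),
    "translation_consistency": ([0.5,  0.0], [-1.5, -1.5]),
}

# Sample one latent vector per characteristic and concatenate them, e.g. to
# be fused into the decoder state of an NMT model.
latents = [z for mu, lv in posteriors.values()
           for z in sample_latent(mu, lv)]
assert len(latents) == 6
```

In a trained model the posterior parameters would be produced by encoder networks over the dialogue context rather than fixed constants as here.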
Related papers
- Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues [14.043741721036543]
This paper explores how context-awareness can improve the performance of current Neural Machine Translation (NMT) models for English-Japanese business dialogue translation.
We propose novel context tokens encoding extra-sentential information, such as speaker turn and scene type.
We find that models leverage both preceding sentences and extra-sentential context (with CXMI increasing with context size) and we provide a more focused analysis on honorifics translation.
arXiv Detail & Related papers (2023-11-20T18:06:03Z)
- MTCue: Learning Zero-Shot Control of Extra-Textual Attributes by Leveraging Unstructured Context in Neural Machine Translation [3.703767478524629]
This work introduces MTCue, a novel neural machine translation (NMT) framework that interprets all context (including discrete variables) as text.
MTCue learns an abstract representation of context, enabling transferability across different data settings.
MTCue significantly outperforms a "tagging" baseline at translating English text.
arXiv Detail & Related papers (2023-05-25T10:06:08Z)
- Is Translation Helpful? An Empirical Analysis of Cross-Lingual Transfer in Low-Resource Dialog Generation [21.973937517854935]
Cross-lingual transfer is important for developing high-quality chatbots in multiple languages.
In this work, we investigate whether it is helpful to utilize machine translation (MT) at all in this task.
Experiments show that leveraging English dialog corpora can indeed improve the naturalness, relevance and cross-domain transferability in Chinese.
arXiv Detail & Related papers (2023-05-21T15:07:04Z)
- Discourse Centric Evaluation of Machine Translation with a Densely Annotated Parallel Corpus [82.07304301996562]
This paper presents a new dataset with rich discourse annotations, built upon the large-scale parallel corpus BWB introduced in Jiang et al.
We investigate the similarities and differences between the discourse structures of source and target languages.
We discover that MT outputs differ fundamentally from human translations in terms of their latent discourse structures.
arXiv Detail & Related papers (2023-05-18T17:36:41Z)
- A Multi-task Multi-stage Transitional Training Framework for Neural Chat Translation [84.59697583372888]
Neural chat translation (NCT) aims to translate a cross-lingual chat between speakers of different languages.
Existing context-aware NMT models cannot achieve satisfactory performances due to limited resources of annotated bilingual dialogues.
We propose a multi-task multi-stage transitional (MMT) training framework, in which an NCT model is trained on the bilingual chat translation dataset together with additional monolingual dialogues.
arXiv Detail & Related papers (2023-01-27T14:41:16Z)
- Controlling Extra-Textual Attributes about Dialogue Participants: A Case Study of English-to-Polish Neural Machine Translation [4.348327991071386]
Machine translation models need to opt for a certain interpretation of textual context when translating from English to Polish.
We propose a case study where a wide range of approaches for controlling attributes in translation is employed.
The best model achieves an improvement of +5.81 chrF++/+6.03 BLEU, with other models achieving competitive performance.
arXiv Detail & Related papers (2022-05-10T08:45:39Z)
- Scheduled Multi-task Learning for Neural Chat Translation [66.81525961469494]
We propose a scheduled multi-task learning framework for Neural Chat Translation (NCT).
Specifically, we devise a three-stage training framework to incorporate the large-scale in-domain chat translation data into training.
Extensive experiments in four language directions verify the effectiveness and superiority of the proposed approach.
arXiv Detail & Related papers (2022-05-08T02:57:28Z)
- Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation [49.916963624249355]
A UNMT model is trained on pseudo parallel data with a translated source, but translates natural source sentences at inference time.
This source discrepancy between training and inference hinders the translation performance of UNMT models.
We propose an online self-training approach, which uses pseudo parallel data (natural source, translated target) to mimic the inference scenario.
arXiv Detail & Related papers (2022-03-16T04:50:27Z)
- Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation [70.81596088969378]
The Cross-lingual Outline-based Dialogue dataset (termed COD) enables cross-lingual natural language understanding.
COD enables dialogue state tracking, and end-to-end dialogue modelling and evaluation in 4 diverse languages.
arXiv Detail & Related papers (2022-01-31T18:11:21Z)
- Towards Making the Most of Dialogue Characteristics for Neural Chat Translation [39.995680617671184]
We propose to promote chat translation by introducing the modeling of dialogue characteristics into the NCT model.
We optimize the NCT model through the training objectives of all these tasks.
Comprehensive experiments on four language directions verify the effectiveness and superiority of the proposed approach.
arXiv Detail & Related papers (2021-09-02T02:04:00Z)
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.