MTCue: Learning Zero-Shot Control of Extra-Textual Attributes by
Leveraging Unstructured Context in Neural Machine Translation
- URL: http://arxiv.org/abs/2305.15904v1
- Date: Thu, 25 May 2023 10:06:08 GMT
- Title: MTCue: Learning Zero-Shot Control of Extra-Textual Attributes by
Leveraging Unstructured Context in Neural Machine Translation
- Authors: Sebastian Vincent and Robert Flynn and Carolina Scarton
- Abstract summary: This work introduces MTCue, a novel neural machine translation (NMT) framework that interprets all context (including discrete variables) as text.
MTCue learns an abstract representation of context, enabling transferability across different data settings.
MTCue significantly outperforms a "tagging" baseline at translating English text.
- Score: 3.703767478524629
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient utilisation of both intra- and extra-textual context remains one of
the critical gaps between machine and human translation. Existing research has
primarily focused on providing individual, well-defined types of context in
translation, such as the surrounding text or discrete external variables like
the speaker's gender. This work introduces MTCue, a novel neural machine
translation (NMT) framework that interprets all context (including discrete
variables) as text. MTCue learns an abstract representation of context,
enabling transferability across different data settings and leveraging similar
attributes in low-resource scenarios. With a focus on a dialogue domain with
access to document and metadata context, we extensively evaluate MTCue in four
language pairs in both translation directions. Our framework demonstrates
significant improvements in translation quality over a parameter-matched
non-contextual baseline, as measured by BLEU (+0.88) and Comet (+1.58).
Moreover, MTCue significantly outperforms a "tagging" baseline at translating
English text. Analysis reveals that the context encoder of MTCue learns a
representation space that organises context based on specific attributes, such
as formality, enabling effective zero-shot control. Pre-training on context
embeddings also improves MTCue's few-shot performance compared to the "tagging"
baseline. Finally, an ablation study conducted on model components and
contextual variables further supports the robustness of MTCue for context-based
NMT.
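The abstract's core idea, interpreting all context, including discrete variables, as text, can be illustrated with a minimal sketch. This is a hypothetical illustration of the verbalisation step only (function and attribute names below are illustrative, not from the MTCue codebase): discrete metadata is rendered as short text "cues" and merged with the ordinary document context before being passed to a shared context encoder.

```python
def verbalise_context(metadata: dict, preceding_sents: list) -> list:
    """Turn discrete metadata and document context into plain-text cues.

    Because every attribute becomes text, a single context encoder can
    embed them all, enabling transfer across data settings.
    """
    cues = []
    for key, value in metadata.items():
        # e.g. {"speaker_gender": "female"} -> "speaker gender: female"
        cues.append(f"{key.replace('_', ' ')}: {value}")
    cues.extend(preceding_sents)  # intra-textual context is already text
    return cues


cues = verbalise_context(
    {"speaker_gender": "female", "formality": "informal"},
    ["Hi! Long time no see."],
)
print(cues)
```

In this view, zero-shot control falls out naturally: an unseen attribute value is just another string the context encoder can embed near similar training-time cues.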
Related papers
- A Case Study on Context-Aware Neural Machine Translation with Multi-Task Learning [49.62044186504516]
In document-level neural machine translation (DocNMT), multi-encoder approaches are common in encoding context and source sentences.
Recent studies have shown that the context encoder generates noise and makes the model robust to the choice of context.
This paper further investigates this observation by explicitly modelling context encoding through multi-task learning (MTL) to make the model sensitive to the choice of context.
arXiv Detail & Related papers (2024-07-03T12:50:49Z)
- Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues [14.043741721036543]
This paper explores how context-awareness can improve the performance of the current Neural Machine Translation (NMT) models for English-Japanese business dialogues translation.
We propose novel context tokens encoding extra-sentential information, such as speaker turn and scene type.
We find that models leverage both preceding sentences and extra-sentential context (with CXMI increasing with context size) and we provide a more focused analysis on honorifics translation.
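CXMI here is conditional cross-mutual information (Fernandes et al., 2021), which measures how much a model's predictions improve when context is supplied. In its usual formulation, for a context-agnostic model $q_A$ and a context-aware model $q_C$:

```latex
\mathrm{CXMI}(C \rightarrow Y \mid X) = H_{q_A}(Y \mid X) - H_{q_C}(Y \mid X, C)
```

Larger values indicate that the model relies more heavily on the context $C$ when generating the translation $Y$ of source $X$.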
arXiv Detail & Related papers (2023-11-20T18:06:03Z)
- Improving Long Context Document-Level Machine Translation [51.359400776242786]
Document-level context for neural machine translation (NMT) is crucial to improve translation consistency and cohesion.
Many works have been published on the topic of document-level NMT, but most restrict the system to just local context.
We propose a constrained attention variant that focuses the attention on the most relevant parts of the sequence, while simultaneously reducing the memory consumption.
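The constrained-attention idea can be sketched generically: restrict each query to its k highest-scoring context positions and renormalise. This is a minimal, hypothetical sketch of the general top-k masking technique, not the paper's exact variant.

```python
import math


def topk_constrained_attention(scores, k):
    """For one query, keep only the k highest-scoring context positions
    and softmax over them; all other positions receive weight 0.

    Focusing attention on the most relevant positions reduces noise from
    long documents and shrinks the effective attention footprint.
    """
    keep = set(sorted(range(len(scores)), key=lambda i: scores[i])[-k:])
    exps = [math.exp(s) if i in keep else 0.0 for i, s in enumerate(scores)]
    z = sum(exps)
    return [e / z for e in exps]


# Only the two strongest positions receive non-zero weight.
weights = topk_constrained_attention([1.0, 2.0, 3.0, 4.0], k=2)
```

Masked positions are dropped before normalisation, so the surviving weights still sum to one.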
arXiv Detail & Related papers (2023-06-08T13:28:48Z)
- Discourse Centric Evaluation of Machine Translation with a Densely Annotated Parallel Corpus [82.07304301996562]
This paper presents a new dataset with rich discourse annotations, built upon the large-scale parallel corpus BWB introduced in Jiang et al.
We investigate the similarities and differences between the discourse structures of source and target languages.
We discover that MT outputs differ fundamentally from human translations in terms of their latent discourse structures.
arXiv Detail & Related papers (2023-05-18T17:36:41Z)
- Reference-less Analysis of Context Specificity in Translation with Personalised Language Models [3.527589066359829]
This work investigates to what extent rich character and film annotations can be leveraged to personalise language models (LMs).
We build LMs which leverage rich contextual information to reduce perplexity by up to 6.5% compared to a non-contextual model.
Our results suggest that the degree to which professional translations in our domain are context-specific can be preserved to a better extent by a contextual machine translation model.
arXiv Detail & Related papers (2023-03-29T12:19:23Z)
- HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
The irrelevant or trivial words may bring some noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
arXiv Detail & Related papers (2023-01-17T12:07:13Z)
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that could cover adequate variants of literal expression under the same meaning.
arXiv Detail & Related papers (2022-04-14T08:16:28Z)
- SMDT: Selective Memory-Augmented Neural Document Translation [53.4627288890316]
We propose a Selective Memory-augmented Neural Document Translation model to deal with documents containing a large hypothesis space of context.
We retrieve similar bilingual sentence pairs from the training corpus to augment global context.
We extend the two-stream attention model with selective mechanism to capture local context and diverse global contexts.
arXiv Detail & Related papers (2022-01-05T14:23:30Z)
- Contrastive Learning for Context-aware Neural Machine Translation Using Coreference Information [14.671424999873812]
We propose CorefCL, a novel data augmentation and contrastive learning scheme based on coreference between the source and contextual sentences.
By corrupting automatically detected coreference mentions in the contextual sentence, CorefCL can train the model to be sensitive to coreference inconsistency.
In experiments, our method consistently improved the BLEU scores of compared models on English-German and English-Korean tasks.
arXiv Detail & Related papers (2021-09-13T05:18:47Z)
- Modeling Bilingual Conversational Characteristics for Neural Chat Translation [24.94474722693084]
We aim to improve the translation quality of conversational text by modeling these bilingual conversational characteristics.
We evaluate our approach on the benchmark dataset BConTrasT (English-German) and a self-collected bilingual dialogue corpus named BMELD (English-Chinese).
Our approach notably boosts the performance over strong baselines by a large margin and significantly surpasses some state-of-the-art context-aware NMT models in terms of BLEU and TER.
arXiv Detail & Related papers (2021-07-23T12:23:34Z)
- Simultaneous Machine Translation with Visual Context [42.88121241096681]
Simultaneous machine translation (SiMT) aims to translate a continuous input text stream into another language with the lowest latency and highest quality possible.
We analyse the impact of different multimodal approaches and visual features on state-of-the-art SiMT frameworks.
arXiv Detail & Related papers (2020-09-15T18:19:11Z)