Improving Word Sense Disambiguation in Neural Machine Translation with
Salient Document Context
- URL: http://arxiv.org/abs/2311.15507v1
- Date: Mon, 27 Nov 2023 03:05:48 GMT
- Title: Improving Word Sense Disambiguation in Neural Machine Translation with
Salient Document Context
- Authors: Elijah Rippeth, Marine Carpuat, Kevin Duh, Matt Post
- Abstract summary: Lexical ambiguity is a challenging and pervasive problem in machine translation (MT).
We introduce a simple and scalable approach to resolve translation ambiguity by incorporating a small amount of extra-sentential context in neural MT.
Our method translates ambiguous source words better than strong sentence-level baselines and comparable document-level baselines.
- Score: 30.461643690171258
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Lexical ambiguity is a challenging and pervasive problem in machine
translation (MT). We introduce a simple and scalable approach to resolve
translation ambiguity by incorporating a small amount of extra-sentential
context in neural MT. Our approach requires no sense annotation and no change
to standard model architectures. Since actual document context is not available
for the vast majority of MT training data, we collect related sentences for
each input to construct pseudo-documents. Salient words from pseudo-documents
are then encoded as a prefix to each source sentence to condition the
generation of the translation. To evaluate, we release DocMuCoW, a challenge
set for translation disambiguation based on the English-German MuCoW
(Raganato et al., 2020) augmented with document IDs. Extensive
experiments show that our method translates ambiguous source words better than
strong sentence-level baselines and comparable document-level baselines while
reducing training costs.
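The pipeline the abstract describes (collect related sentences into a pseudo-document, extract salient words, prepend them as a prefix to condition translation) can be sketched roughly as follows. This is an illustrative reading, not the paper's implementation: the tokenization, stopword list, frequency-based saliency scoring, and `<sep>` separator token are all assumptions.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it", "for", "was"}

def salient_words(pseudo_document, k=3):
    """Rank content words in the pseudo-document by frequency and
    return the top-k as disambiguating context."""
    tokens = re.findall(r"[a-z]+", pseudo_document.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [w for w, _ in counts.most_common(k)]

def prefix_source(sentence, pseudo_document, sep="<sep>", k=3):
    """Prepend salient words as a prefix so a standard encoder-decoder
    model can condition on them without architectural changes."""
    prefix = " ".join(salient_words(pseudo_document, k))
    return f"{prefix} {sep} {sentence}"
```

For an ambiguous source word like "bank", a pseudo-document about rivers would yield a prefix such as `river water <sep> The bank was steep.`, nudging the model toward the riverbank sense.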
Related papers
- Recovering document annotations for sentence-level bitext [18.862295675088056]
We reconstruct document-level information for three datasets in German, French, Spanish, Italian, Polish, and Portuguese.
We introduce a document-level filtering technique as an alternative to traditional bitext filtering.
Lastly, we train models on these longer contexts and demonstrate improvements in document-level translation without degrading sentence-level translation.
arXiv Detail & Related papers (2024-06-06T08:58:14Z) - An Analysis of BPE Vocabulary Trimming in Neural Machine Translation [56.383793805299234]
Vocabulary trimming is a postprocessing step that replaces rare subwords with their component subwords.
We show that vocabulary trimming fails to improve performance and is even prone to incurring heavy degradation.
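The trimming operation described above, where a rare subword falls back to the pieces that merged to form it, can be sketched minimally as follows. The merge-table representation and all names here are illustrative assumptions, not the paper's code:

```python
def decompose(subword, merge_parents, kept):
    """Recursively split a subword into components that survive trimming.

    merge_parents maps a merged subword to the (left, right) pair that
    formed it in the BPE merge table; `kept` is the trimmed vocabulary.
    Subwords with no recorded merge are treated as atomic.
    """
    if subword in kept or subword not in merge_parents:
        return [subword]
    left, right = merge_parents[subword]
    return decompose(left, merge_parents, kept) + decompose(right, merge_parents, kept)

# Toy merge table: "lowest" was built from "low" + "est", "low" from "lo" + "w".
merges = {"lo": ("l", "o"), "low": ("lo", "w"), "lowest": ("low", "est")}
kept = {"l", "o", "w", "lo", "est"}  # "low" and "lowest" were trimmed as rare
```

Here `decompose("lowest", merges, kept)` yields `["lo", "w", "est"]`: the trimmed tokens unwind through the merge table until only kept pieces remain.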
arXiv Detail & Related papers (2024-03-30T15:29:49Z) - Improving Long Context Document-Level Machine Translation [51.359400776242786]
Document-level context for neural machine translation (NMT) is crucial to improve translation consistency and cohesion.
Many works have been published on the topic of document-level NMT, but most restrict the system to just local context.
We propose a constrained attention variant that focuses the attention on the most relevant parts of the sequence, while simultaneously reducing the memory consumption.
arXiv Detail & Related papers (2023-06-08T13:28:48Z) - On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question of how best to utilize a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z) - Escaping the sentence-level paradigm in machine translation [9.676755606927435]
Much work on document-context machine translation exists but, for various reasons, has been unable to take hold.
In contrast to work on specialized architectures, we show that the standard Transformer architecture is sufficient.
We propose generative variants of existing contrastive metrics that are better able to discriminate among document systems.
arXiv Detail & Related papers (2023-04-25T16:09:02Z) - HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
The irrelevant or trivial words may bring some noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
arXiv Detail & Related papers (2023-01-17T12:07:13Z) - Rethink about the Word-level Quality Estimation for Machine Translation
from Human Judgement [57.72846454929923]
We create a benchmark dataset, HJQE, in which expert translators directly annotate poorly translated words.
We propose two tag-correcting strategies, namely a tag refinement strategy and a tree-based annotation strategy, to make the TER-based artificial QE corpus closer to HJQE.
The results show our proposed dataset is more consistent with human judgement and also confirm the effectiveness of the proposed tag correcting strategies.
arXiv Detail & Related papers (2022-09-13T02:37:12Z) - Word-level Human Interpretable Scoring Mechanism for Novel Text
Detection Using Tsetlin Machines [16.457778420360537]
We propose a Tsetlin machine architecture for scoring individual words according to their contribution to novelty.
Our approach encodes a description of the novel documents using the linguistic patterns captured by TM clauses.
We then adopt this description to measure how much a word contributes to making documents novel.
arXiv Detail & Related papers (2021-05-10T23:41:14Z) - Document Graph for Neural Machine Translation [42.13593962963306]
We show that a document can be represented as a graph that connects relevant contexts regardless of their distances.
Experiments on various NMT benchmarks, including IWSLT English-French, Chinese-English, WMT English-German and Opensubtitle English-Russian, demonstrate that using document graphs can significantly improve the translation quality.
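The core idea above, connecting relevant contexts in a document regardless of their distance, can be illustrated with a minimal sketch. The lexical-overlap criterion and stopword list are assumptions for illustration; the paper's actual graph construction may differ:

```python
from itertools import combinations

STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is"}

def build_document_graph(sentences, min_shared=1):
    """Add an edge between any two sentences that share at least
    `min_shared` content words, no matter how far apart they are."""
    word_sets = [set(s.lower().split()) - STOPWORDS for s in sentences]
    edges = set()
    for i, j in combinations(range(len(sentences)), 2):
        if len(word_sets[i] & word_sets[j]) >= min_shared:
            edges.add((i, j))
    return edges
```

With such a graph, sentence 0 ("The river bank flooded") links directly to a distant sentence mentioning "river", giving the translation model long-range disambiguating context.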
arXiv Detail & Related papers (2020-12-07T06:48:59Z) - Learning to Select Bi-Aspect Information for Document-Scale Text Content
Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset with the same writing style of the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.