Do Context-Aware Translation Models Pay the Right Attention?
- URL: http://arxiv.org/abs/2105.06977v1
- Date: Fri, 14 May 2021 17:32:24 GMT
- Title: Do Context-Aware Translation Models Pay the Right Attention?
- Authors: Kayo Yin, Patrick Fernandes, Danish Pruthi, Aditi Chaudhary, André F. T. Martins, Graham Neubig
- Abstract summary: Context-aware machine translation models are designed to leverage contextual information, but often fail to do so.
In this paper, we ask several questions: What contexts do human translators use to resolve ambiguous words?
We introduce SCAT (Supporting Context for Ambiguous Translations), a new English-French dataset comprising supporting context words for 14K translations.
Using SCAT, we perform an in-depth analysis of the context used to disambiguate, examining positional and lexical characteristics of the supporting words.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Context-aware machine translation models are designed to leverage contextual
information, but often fail to do so. As a result, they inaccurately
disambiguate pronouns and polysemous words that require context for resolution.
In this paper, we ask several questions: What contexts do human translators use
to resolve ambiguous words? Are models paying large amounts of attention to the
same context? What if we explicitly train them to do so? To answer these
questions, we introduce SCAT (Supporting Context for Ambiguous Translations), a
new English-French dataset comprising supporting context words for 14K
translations that professional translators found useful for pronoun
disambiguation. Using SCAT, we perform an in-depth analysis of the context used
to disambiguate, examining positional and lexical characteristics of the
supporting words. Furthermore, we measure the degree of alignment between the
model's attention scores and the supporting context from SCAT, and apply a
guided attention strategy to encourage agreement between the two.
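To make the guided attention strategy concrete, one way to implement it is as an auxiliary loss that penalizes divergence between a head's attention distribution and the translator-marked supporting words from SCAT. The sketch below is a minimal, hypothetical PyTorch rendering; the tensor names, the uniform target over marked words, and the `lambda_attn` weight are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def attention_agreement_loss(attn, support_mask, eps=1e-9):
    """Penalize disagreement between the model's attention over source/context
    tokens (attn: batch x src_len, rows summing to 1) and human-annotated
    supporting words (support_mask: batch x src_len, 1 where a translator
    marked the word as useful). The target is uniform over marked words."""
    target = support_mask / (support_mask.sum(dim=-1, keepdim=True) + eps)
    # KL(target || attn), averaged over the batch
    return F.kl_div((attn + eps).log(), target, reduction="batchmean")

# Hypothetical combined objective:
#   loss = nll_loss + lambda_attn * attention_agreement_loss(attn, support_mask)
```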
Related papers
- That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context? (2023-10-23)
We study semantic ambiguities that exist in the source (English in this work) itself.
We focus on idioms that are open to both literal and figurative interpretations.
We find that current MT models consistently translate English idioms literally, even when the context suggests a figurative interpretation.
- Towards Effective Disambiguation for Machine Translation with Large Language Models (2023-09-20)
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
- Improving Long Context Document-Level Machine Translation (2023-06-08)
Document-level context for neural machine translation (NMT) is crucial to improve translation consistency and cohesion.
Many works have been published on the topic of document-level NMT, but most restrict the system to just local context.
We propose a constrained attention variant that focuses the attention on the most relevant parts of the sequence, while simultaneously reducing the memory consumption.
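As an illustration of what a constrained attention variant can look like, here is a hedged sketch that keeps only the top-k highest-scoring keys per query and renormalizes. It shows the selection logic only; real memory savings require not materializing the full score matrix (e.g., blockwise or sparse kernels). All names are hypothetical, not the paper's implementation.

```python
import torch

def topk_constrained_attention(q, k, v, top_k=32):
    """Attend only to the top_k most relevant keys per query.
    q: (..., q_len, d), k and v: (..., k_len, d)."""
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    # Threshold at the top_k-th largest score per query, mask the rest
    kth = scores.topk(min(top_k, scores.shape[-1]), dim=-1).values[..., -1:]
    scores = scores.masked_fill(scores < kth, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```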
- PiC: A Phrase-in-Context Dataset for Phrase Understanding and Semantic Search (2022-07-19)
We propose PiC, a dataset of 28K noun phrases accompanied by their contextual Wikipedia pages.
We find that training on our dataset improves ranking models' accuracy and remarkably pushes Question Answering (QA) models to near-human accuracy.
- When Does Translation Require Context? A Data-driven, Multilingual Exploration (2021-09-15)
Proper handling of discourse significantly contributes to the quality of machine translation (MT).
Recent works in context-aware MT attempt to target a small set of discourse phenomena during evaluation.
We develop the Multilingual Discourse-Aware benchmark, a series of taggers that identify and evaluate model performance on discourse phenomena.
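To give a flavor of what such a tagger might do, below is a toy, purely illustrative example that flags English tokens whose French translation is typically context-dependent; the benchmark's actual taggers are more sophisticated and multilingual, and the word list here is my assumption.

```python
# Illustrative only: the word list and heuristic are not the benchmark's.
AMBIGUOUS_EN_TOKENS = {"it", "they", "them", "this", "that", "one"}

def tag_context_needed(tokens):
    """Return indices of tokens that plausibly need discourse context to
    translate into French (e.g., 'it' -> 'il'/'elle' depends on antecedent)."""
    return [i for i, tok in enumerate(tokens) if tok.lower() in AMBIGUOUS_EN_TOKENS]
```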
- Measuring and Increasing Context Usage in Context-Aware Machine Translation (2021-05-07)
We introduce a new metric, conditional cross-mutual information, to quantify the usage of context by machine translation models.
We then introduce a new, simple training method, context-aware word dropout, to increase the usage of context by context-aware models.
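Both ideas are compact to sketch. Conditional cross-mutual information can be read as the drop in cross-entropy a model achieves on the reference once context is provided, and context-aware word dropout masks words in the current sentence so the model must consult context to recover them. The following is an assumed rendering, not the paper's code.

```python
import random

def cxmi(logprob_with_ctx, logprob_without_ctx):
    """CXMI in spirit: H(Y|X) - H(Y|X, C), estimated as the difference in
    total reference log-probability between a context-aware and a
    context-agnostic model (positive => context helped)."""
    return logprob_with_ctx - logprob_without_ctx

def context_aware_word_dropout(sentence_tokens, p=0.1, mask="<mask>"):
    """Randomly mask words in the current source sentence during training,
    pushing the model to rely on surrounding context instead."""
    return [mask if random.random() < p else t for t in sentence_tokens]
```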
- Picking BERT's Brain: Probing for Linguistic Dependencies in Contextualized Embeddings Using Representational Similarity Analysis (2020-11-24)
We investigate the degree to which a verb embedding encodes the verb's subject, a pronoun embedding encodes the pronoun's antecedent, and a full-sentence representation encodes the sentence's head word.
In all cases, we show that BERT's contextualized embeddings reflect the linguistic dependency being studied, and that BERT encodes these dependencies to a greater degree than it encodes less linguistically-salient controls.
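Representational similarity analysis itself is short to express: build pairwise similarity matrices for two views of the same stimuli, then correlate them. A minimal sketch follows; the cosine similarity and input shapes are my assumptions, not necessarily the paper's setup.

```python
import numpy as np
from scipy.stats import spearmanr

def rsa(reps_a, reps_b):
    """Spearman-correlate the upper triangles of the pairwise cosine
    similarity matrices of two representations of the same n stimuli,
    e.g., pronoun embeddings (reps_a) vs. antecedent embeddings (reps_b).
    reps_a, reps_b: arrays of shape (n, d_a) and (n, d_b)."""
    def sim(x):
        x = x / np.linalg.norm(x, axis=1, keepdims=True)
        return x @ x.T
    iu = np.triu_indices(reps_a.shape[0], k=1)
    return spearmanr(sim(reps_a)[iu], sim(reps_b)[iu]).correlation
```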