That was the last straw, we need more: Are Translation Systems Sensitive
to Disambiguating Context?
- URL: http://arxiv.org/abs/2310.14610v1
- Date: Mon, 23 Oct 2023 06:38:49 GMT
- Title: That was the last straw, we need more: Are Translation Systems Sensitive
to Disambiguating Context?
- Authors: Jaechan Lee, Alisa Liu, Orevaoghene Ahia, Hila Gonen, Noah A. Smith
- Abstract summary: We study semantic ambiguities that exist in the source (English in this work) itself.
We focus on idioms that are open to both literal and figurative interpretations.
We find that current MT models consistently translate English idioms literally, even when the context suggests a figurative interpretation.
- Score: 64.38544995251642
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The translation of ambiguous text presents a challenge for translation
systems, as it requires using the surrounding context to disambiguate the
intended meaning as much as possible. While prior work has studied ambiguities
that result from different grammatical features of the source and target
language, we study semantic ambiguities that exist in the source (English in
this work) itself. In particular, we focus on idioms that are open to both
literal and figurative interpretations (e.g., goose egg), and collect TIDE, a
dataset of 512 pairs of English sentences containing idioms with disambiguating
context such that one is literal (it laid a goose egg) and another is
figurative (they scored a goose egg, as in a score of zero). In experiments, we
compare MT-specific models and language models for (i) their preference when
given an ambiguous subsentence, (ii) their sensitivity to disambiguating
context, and (iii) the performance disparity between figurative and literal
source sentences. We find that current MT models consistently translate English
idioms literally, even when the context suggests a figurative interpretation.
On the other hand, LMs are far more context-aware, although there remain
disparities across target languages. Our findings underline the potential of
LMs as a strong backbone for context-aware translation.
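The preference probe described in the abstract (checking whether a model renders an idiom literally or figuratively) can be sketched minimally as follows. This is an illustrative assumption, not the paper's actual evaluation code: the marker lists and the substring check stand in for whatever target-language matching TIDE's evaluation really uses.

```python
from collections import Counter

def preference(translations, literal_markers, figurative_markers):
    """Tally whether each translation reads literally or figuratively.

    Classification is by simple substring matching against hypothetical
    marker lists; a real evaluation would use a more robust check.
    """
    counts = Counter()
    for t in translations:
        t_low = t.lower()
        if any(m in t_low for m in literal_markers):
            counts["literal"] += 1
        elif any(m in t_low for m in figurative_markers):
            counts["figurative"] += 1
        else:
            counts["unknown"] += 1
    return counts

# Toy Spanish outputs for the "goose egg" example from the abstract:
outs = [
    "el ganso puso un huevo de ganso",  # literal rendering
    "anotaron cero puntos",             # figurative rendering (score of zero)
]
prefs = preference(outs, ["huevo de ganso"], ["cero"])
```

Run over a model's outputs for the 512 TIDE sentence pairs, a tally like this would expose the literal-translation bias the authors report.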
Related papers
- Crossing the Threshold: Idiomatic Machine Translation through Retrieval
Augmentation and Loss Weighting [66.02718577386426]
We provide a simple characterization of idiomatic translation and related issues.
We conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations.
To improve translation of natural idioms, we introduce two straightforward yet effective techniques.
arXiv Detail & Related papers (2023-10-10T23:47:25Z)
- Can Transformer be Too Compositional? Analysing Idiom Processing in
Neural Machine Translation [55.52888815590317]
Unlike literal expressions, idioms' meanings do not directly follow from their parts.
NMT models are often unable to translate idioms accurately and over-generate compositional, literal translations.
We investigate whether the non-compositionality of idioms is reflected in the mechanics of the dominant NMT model, Transformer.
arXiv Detail & Related papers (2022-05-30T17:59:32Z)
- Semantically Informed Slang Interpretation [2.9097456604613745]
We propose a semantically informed slang interpretation (SSI) framework that considers jointly the contextual and semantic appropriateness of a candidate interpretation for a query slang.
We show how the same framework can be applied to enhancing machine translation of slang from English to other languages.
arXiv Detail & Related papers (2022-05-02T01:51:49Z)
- It's not Rocket Science: Interpreting Figurative Language in Narratives [48.84507467131819]
We study the interpretation of two types of non-compositional figurative language (idioms and similes).
Our experiments show that models based solely on pre-trained language models perform substantially worse than humans on these tasks.
We additionally propose knowledge-enhanced models, adopting human strategies for interpreting figurative language.
arXiv Detail & Related papers (2021-08-31T21:46:35Z)
- Do Context-Aware Translation Models Pay the Right Attention? [61.25804242929533]
Context-aware machine translation models are designed to leverage contextual information, but often fail to do so.
In this paper, we ask, among other questions: what contexts do human translators use to resolve ambiguous words?
We introduce SCAT (Supporting Context for Ambiguous Translations), a new English-French dataset comprising supporting context words for 14K translations.
Using SCAT, we perform an in-depth analysis of the context used to disambiguate, examining positional and lexical characteristics of the supporting words.
arXiv Detail & Related papers (2021-05-14T17:32:24Z)
- Speakers Fill Lexical Semantic Gaps with Context [65.08205006886591]
We operationalise the lexical ambiguity of a word as the entropy of meanings it can take.
We find significant correlations between our estimate of ambiguity and the number of synonyms a word has in WordNet.
This suggests that, in the presence of ambiguity, speakers compensate by making contexts more informative.
arXiv Detail & Related papers (2020-10-05T17:19:10Z)
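The entropy operationalisation of lexical ambiguity used in the paper above can be written out directly. The meaning distribution below is a made-up toy, not data from the paper:

```python
import math

def ambiguity(meaning_probs):
    """Shannon entropy (in bits) of a word's meaning distribution:
    H = -sum(p * log2(p)). Higher entropy means a more ambiguous word."""
    return -sum(p * math.log2(p) for p in meaning_probs if p > 0)

# A word with one dominant sense is less ambiguous than an evenly split one.
low = ambiguity([0.9, 0.1])   # ≈ 0.469 bits
high = ambiguity([0.5, 0.5])  # = 1.0 bit
```

Under this measure, the finding that ambiguity correlates with WordNet synonym counts amounts to high-entropy words having more listed senses, with context supplying the missing information.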
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.