Related papers: Automatic Evaluation and Analysis of Idioms in Neural Machine Translation

URL: http://arxiv.org/abs/2210.04545v1
Date: Mon, 10 Oct 2022 10:30:09 GMT
Title: Automatic Evaluation and Analysis of Idioms in Neural Machine Translation
Authors: Christos Baziotis, Prashant Mathur, Eva Hasler
Abstract summary: We present a novel metric for measuring the frequency of literal translation errors without human involvement. We explore the role of monolingual pretraining and find that it yields substantial targeted improvements. We find that the randomly initialized models are more local or "myopic" as they are relatively unaffected by variations of the idiom context.
Score: 12.227312923011986
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A major open problem in neural machine translation (NMT) is the translation of idiomatic expressions, such as "under the weather". The meaning of these expressions is not composed by the meaning of their constituent words, and NMT models tend to translate them literally (i.e., word-by-word), which leads to confusing and nonsensical translations. Research on idioms in NMT is limited and obstructed by the absence of automatic methods for quantifying these errors. In this work, first, we propose a novel metric for automatically measuring the frequency of literal translation errors without human involvement. Equipped with this metric, we present controlled translation experiments with models trained in different conditions (with/without the test-set idioms) and across a wide range of (global and targeted) metrics and test sets. We explore the role of monolingual pretraining and find that it yields substantial targeted improvements, even without observing any translation examples of the test-set idioms. In our analysis, we probe the role of idiom context. We find that the randomly initialized models are more local or "myopic" as they are relatively unaffected by variations of the idiom context, unlike the pretrained ones.

Related papers

Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting [66.02718577386426]
We provide a simple characterization of idiomatic translation and related issues. We conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations. To improve translation of natural idioms, we introduce two straightforward yet effective techniques.
arXiv Detail & Related papers (2023-10-10T23:47:25Z)
Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences". Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
Do GPTs Produce Less Literal Translations? [20.095646048167612]
Large Language Models (LLMs) have emerged as general-purpose language models capable of addressing many natural language generation or understanding tasks. We find that translations out of English (E-X) from GPTs tend to be less literal, while exhibiting similar or better scores on Machine Translation quality metrics.
arXiv Detail & Related papers (2023-05-26T10:38:31Z)
Extrinsic Evaluation of Machine Translation Metrics [78.75776477562087]
It is unclear if automatic metrics are reliable at distinguishing good translations from bad translations at the sentence level. We evaluate the segment-level performance of the most widely used MT metrics (chrF, COMET, BERTScore, etc.) on three downstream cross-lingual tasks. Our experiments demonstrate that all metrics exhibit negligible correlation with the extrinsic evaluation of the downstream outcomes.
arXiv Detail & Related papers (2022-12-20T14:39:58Z)
Can Transformer be Too Compositional? Analysing Idiom Processing in Neural Machine Translation [55.52888815590317]
Unlike literal expressions, idioms' meanings do not directly follow from their parts. NMT models are often unable to translate idioms accurately and over-generate compositional, literal translations. We investigate whether the non-compositionality of idioms is reflected in the mechanics of the dominant NMT model, Transformer.
arXiv Detail & Related papers (2022-05-30T17:59:32Z)
When Does Translation Require Context? A Data-driven, Multilingual Exploration [71.43817945875433]
Proper handling of discourse significantly contributes to the quality of machine translation (MT). Recent works in context-aware MT attempt to target a small set of discourse phenomena during evaluation. We develop the Multilingual Discourse-Aware benchmark, a series of taggers that identify and evaluate model performance on discourse phenomena.
arXiv Detail & Related papers (2021-09-15T17:29:30Z)
Investigating Failures of Automatic Translation in the Case of Unambiguous Gender [13.58884863186619]
Transformer-based models are the modern workhorses for neural machine translation (NMT). We observe a systemic and rudimentary class of errors made by transformer-based models with regard to translating from a language that doesn't mark gender on nouns into others that do. We release an evaluation scheme and dataset for measuring the ability of transformer-based NMT models to translate gender correctly.
arXiv Detail & Related papers (2021-04-16T00:57:36Z)
It's not a Non-Issue: Negation as a Source of Error in Machine Translation [33.991817055535854]
We investigate whether translating negation is an issue for modern machine translation systems, using 17 translation directions as a test bed. We find that the presence of negation can indeed significantly impact downstream quality, in some cases resulting in quality reductions of more than 60%.
arXiv Detail & Related papers (2020-10-12T03:34:44Z)
It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information [90.35685796083563]
Cross-mutual information (XMI) is an asymmetric information-theoretic metric of machine translation difficulty. XMI exploits the probabilistic nature of most neural machine translation models. We present the first systematic and controlled study of cross-lingual translation difficulties using modern neural translation systems.
arXiv Detail & Related papers (2020-05-05T17:38:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.