Can Transformer be Too Compositional? Analysing Idiom Processing in
Neural Machine Translation
- URL: http://arxiv.org/abs/2205.15301v1
- Date: Mon, 30 May 2022 17:59:32 GMT
- Title: Can Transformer be Too Compositional? Analysing Idiom Processing in
Neural Machine Translation
- Authors: Verna Dankers, Christopher G. Lucas, Ivan Titov
- Abstract summary: Unlike literal expressions, idioms' meanings do not directly follow from their parts.
NMT models are often unable to translate idioms accurately and over-generate compositional, literal translations.
We investigate whether the non-compositionality of idioms is reflected in the mechanics of the dominant NMT model, Transformer.
- Score: 55.52888815590317
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Unlike literal expressions, idioms' meanings do not directly follow from
their parts, posing a challenge for neural machine translation (NMT). NMT
models are often unable to translate idioms accurately and over-generate
compositional, literal translations. In this work, we investigate whether the
non-compositionality of idioms is reflected in the mechanics of the dominant
NMT model, Transformer, by analysing the hidden states and attention patterns
for models with English as source language and one of seven European languages
as target language. When Transformer emits a non-literal translation - i.e.
identifies the expression as idiomatic - the encoder processes idioms more
strongly as single lexical units compared to literal expressions. This
manifests in idioms' parts being grouped through attention and in reduced
interaction between idioms and their context. In the decoder's cross-attention,
figurative inputs result in reduced attention on source-side tokens. These
results suggest that Transformer's tendency to process idioms as compositional
expressions contributes to literal translations of idioms.
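To make the analysis concrete, below is a minimal sketch (not the authors' released code) of one measurement the abstract describes: the share of encoder self-attention that an idiom's tokens place on each other rather than on the surrounding context. The public Helsinki-NLP/opus-mt-en-de checkpoint, the example sentence, and the naive subword span finder are illustrative assumptions; the paper trains its own English-to-X Transformer models.

```python
import torch
from transformers import MarianMTModel, MarianTokenizer

# Illustrative public checkpoint; the paper uses its own trained models.
model_name = "Helsinki-NLP/opus-mt-en-de"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

sentence = "He kicked the bucket last night."
idiom = "kicked the bucket"

enc = tokenizer(sentence, return_tensors="pt")
idiom_ids = tokenizer(idiom, add_special_tokens=False)["input_ids"]

# Naive subsequence search for the idiom's subword positions in the sentence.
ids = enc["input_ids"][0].tolist()
start = next(i for i in range(len(ids)) if ids[i:i + len(idiom_ids)] == idiom_ids)
span = list(range(start, start + len(idiom_ids)))

with torch.no_grad():
    out = model.model.encoder(**enc, output_attentions=True)

# Share of the idiom tokens' attention mass that stays inside the idiom span,
# averaged over heads: a higher share means the idiom is grouped more strongly
# as a single lexical unit and interacts less with its context.
for layer, attn in enumerate(out.attentions):   # shape: (batch, heads, query, key)
    a = attn[0].mean(dim=0)                     # average over attention heads
    within = a[span][:, span].sum()
    total = a[span].sum()
    print(f"layer {layer}: within-idiom attention share = {(within / total).item():.3f}")
```

Comparing this share between idiomatic and literal uses of the same phrase, and between figurative and literal contexts, is one way to quantify the "grouping" behaviour the abstract reports.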
Related papers
- That was the last straw, we need more: Are Translation Systems Sensitive
to Disambiguating Context? [64.38544995251642]
We study semantic ambiguities that exist in the source (English in this work) itself.
We focus on idioms that are open to both literal and figurative interpretations.
We find that current MT models consistently translate English idioms literally, even when the context suggests a figurative interpretation.
arXiv Detail & Related papers (2023-10-23T06:38:49Z)
- Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting [66.02718577386426]
We provide a simple characterization of idiomatic translation and related issues.
We conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations.
To improve translation of natural idioms, we introduce two straightforward yet effective techniques.
arXiv Detail & Related papers (2023-10-10T23:47:25Z)
- Automatic Evaluation and Analysis of Idioms in Neural Machine Translation [12.227312923011986]
We present a novel metric for measuring the frequency of literal translation errors without human involvement.
We explore the role of monolingual pretraining and find that it yields substantial targeted improvements.
We find that randomly initialized models are more local or "myopic", as they are relatively unaffected by variations of the context.
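As a rough illustration of measuring literal translation errors without human involvement, here is a minimal sketch; it is an assumption about how such a metric could work (flagging outputs that contain a blacklisted word-for-word rendering of the source idiom), not the paper's exact formulation, and the blacklist entries are invented for the example.

```python
from typing import Dict, List

def literal_error_rate(sources: List[str],
                       outputs: List[str],
                       blacklists: Dict[str, List[str]]) -> float:
    """Share of idiom-containing source sentences whose translation
    contains a blacklisted literal rendering of the idiom."""
    errors, total = 0, 0
    for src, out in zip(sources, outputs):
        for idiom, literal_variants in blacklists.items():
            if idiom in src.lower():
                total += 1
                if any(lit in out.lower() for lit in literal_variants):
                    errors += 1
    return errors / total if total else 0.0

# Illustrative English->German usage: "Eimer" is the literal word for "bucket".
blacklists = {"kicked the bucket": ["eimer"]}
rate = literal_error_rate(["He kicked the bucket."],
                          ["Er hat den Eimer getreten."],
                          blacklists)
print(f"literal translation error rate: {rate:.2f}")  # 1.00
```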
arXiv Detail & Related papers (2022-10-10T10:30:09Z)
- Modeling Target-Side Morphology in Neural Machine Translation: A Comparison of Strategies [72.56158036639707]
Morphologically rich languages pose difficulties to machine translation.
A large amount of differently inflected word surface forms entails a larger vocabulary.
Some inflected forms of infrequent terms typically do not appear in the training corpus.
Linguistic agreement requires the system to correctly match the grammatical categories between inflected word forms in the output sentence.
arXiv Detail & Related papers (2022-03-25T10:13:20Z)
- Semantics-aware Attention Improves Neural Machine Translation [35.32217580058933]
We propose two novel parameter-free methods for injecting semantic information into Transformers.
One such method operates on the encoder, through a Scene-Aware Self-Attention (SASA) head.
Another on the decoder, through a Scene-Aware Cross-Attention (SACrA) head.
arXiv Detail & Related papers (2021-10-13T17:58:22Z)
- Verb Knowledge Injection for Multilingual Event Processing [50.27826310460763]
We investigate whether injecting explicit information on verbs' semantic-syntactic behaviour improves the performance of LM-pretrained Transformers.
We first demonstrate that injecting verb knowledge leads to performance gains in English event extraction.
We then explore the utility of verb adapters for event extraction in other languages.
arXiv Detail & Related papers (2020-12-31T03:24:34Z)
- Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency.
We propose a novel reordering method to explicitly model this reordering information for the Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-04-08T05:28:46Z)
- Urdu-English Machine Transliteration using Neural Networks [0.0]
We present a transliteration technique based on Expectation Maximization (EM) that is unsupervised and language independent.
The system learns patterns and out-of-vocabulary words from a parallel corpus, so there is no need to train it on a transliteration corpus explicitly.
arXiv Detail & Related papers (2020-01-12T17:30:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.