Lost in Translationese? Reducing Translation Effect Using Abstract Meaning Representation
- URL: http://arxiv.org/abs/2304.11501v2
- Date: Mon, 29 Jan 2024 23:09:49 GMT
- Title: Lost in Translationese? Reducing Translation Effect Using Abstract Meaning Representation
- Authors: Shira Wein, Nathan Schneider
- Abstract summary: We argue that Abstract Meaning Representation (AMR) can be used as an interlingua to reduce the amount of translationese in translated texts.
By parsing English translations into an AMR and then generating text from that AMR, the result more closely resembles originally English text.
This work makes strides towards reducing translationese in text and highlights the utility of AMR as an interlingua.
- Score: 11.358350306918027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Translated texts bear several hallmarks distinct from texts originating in
the language. Though individual translated texts are often fluent and preserve
meaning, at a large scale, translated texts have statistical tendencies which
distinguish them from text originally written in the language
("translationese") and can affect model performance. We frame the novel task of
translationese reduction and hypothesize that Abstract Meaning Representation
(AMR), a graph-based semantic representation which abstracts away from the
surface form, can be used as an interlingua to reduce the amount of
translationese in translated texts. By parsing English translations into an AMR
and then generating text from that AMR, the result more closely resembles
originally English text across three quantitative macro-level measures, without
severely compromising fluency or adequacy. We compare our AMR-based approach
against three other techniques based on machine translation or paraphrase
generation. This work makes strides towards reducing translationese in text and
highlights the utility of AMR as an interlingua.
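A minimal sketch of this parse-generate round trip, using the open-source amrlib library (an illustrative assumption; the paper does not prescribe this toolkit, and the pretrained models loaded below are amrlib defaults rather than the authors' exact setup):

```python
# Translationese reduction via an AMR round trip: parse a translated
# English sentence into an AMR graph, then regenerate English from the
# graph so that a new surface form is chosen. Assumes amrlib and its
# default pretrained models are installed (illustrative choices).
import amrlib

stog = amrlib.load_stog_model()  # sentence-to-graph: AMR parser
gtos = amrlib.load_gtos_model()  # graph-to-sentence: AMR generator

translated = ["He gave to the committee his full support."]
graphs = stog.parse_sents(translated)   # PENMAN-format AMR strings
regenerated, _ = gtos.generate(graphs)  # English regenerated from the graphs
print(regenerated[0])
```

Because the AMR graph abstracts away from word order and function words, the generator is free to re-choose a surface realization, which is what pulls the output back toward originally English text.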
Related papers
- (Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts [52.18246881218829]
We introduce a novel multi-agent framework based on large language models (LLMs) for literary translation, implemented as a company called TransAgents.
To evaluate the effectiveness of our system, we propose two innovative evaluation strategies: Monolingual Human Preference (MHP) and Bilingual LLM Preference (BLP).
arXiv Detail & Related papers (2024-05-20T05:55:08Z)
- Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting [66.02718577386426]
We provide a simple characterization of idiomatic translation and related issues.
We conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations.
To improve translation of natural idioms, we introduce two straightforward yet effective techniques.
arXiv Detail & Related papers (2023-10-10T23:47:25Z)
- Lost In Translation: Generating Adversarial Examples Robust to Round-Trip Translation [66.33340583035374]
We present a comprehensive study on the robustness of current text adversarial attacks to round-trip translation.
We demonstrate that 6 state-of-the-art text-based adversarial attacks do not maintain their efficacy after round-trip translation.
We introduce an intervention-based solution to this problem, by integrating Machine Translation into the process of adversarial example generation.
arXiv Detail & Related papers (2023-07-24T04:29:43Z)
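A hedged sketch of the round-trip translation operation central to the entry above, using MarianMT checkpoints from Hugging Face (the model names are real public checkpoints, but the toy adversarial input is invented and the victim classifier from the attack pipeline is omitted):

```python
# Round-trip translation (en -> de -> en) with MarianMT. Attacks whose
# character-level perturbations are "normalized away" by the MT system
# tend to lose their efficacy after this round trip.
from transformers import MarianMTModel, MarianTokenizer

def load(name):
    return MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name)

tok_fwd, mt_fwd = load("Helsinki-NLP/opus-mt-en-de")
tok_bwd, mt_bwd = load("Helsinki-NLP/opus-mt-de-en")

def translate(texts, tok, model):
    batch = tok(texts, return_tensors="pt", padding=True)
    return tok.batch_decode(model.generate(**batch), skip_special_tokens=True)

def round_trip(texts):
    return translate(translate(texts, tok_fwd, mt_fwd), tok_bwd, mt_bwd)

adv = ["The flim was abslutely terrrible, do not wtach it."]  # invented example
print(round_trip(adv))  # compare the victim model's prediction before/after
```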
- Do GPTs Produce Less Literal Translations? [20.095646048167612]
Large Language Models (LLMs) have emerged as general-purpose language models capable of addressing many natural language generation or understanding tasks.
We find that translations out of English (E-X) from GPTs tend to be less literal, while exhibiting similar or better scores on Machine Translation quality metrics.
arXiv Detail & Related papers (2023-05-26T10:38:31Z)
- Neural Machine Translation with Contrastive Translation Memories [71.86990102704311]
Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios.
We propose a new retrieval-augmented NMT to model contrastively retrieved translation memories that are holistically similar to the source sentence.
In the training phase, a Multi-TM contrastive learning objective is introduced to learn the salient features of each TM with respect to the target sentence.
arXiv Detail & Related papers (2022-12-06T17:10:17Z)
- Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation [70.58243648754507]
We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR).
Compared with the original textual input, AMR is a structured semantic representation that presents the core concepts and relations in a sentence explicitly and unambiguously.
Experimental results show that retrofitting multilingual sentence embeddings with AMR yields new state-of-the-art performance on both semantic similarity and transfer tasks.
arXiv Detail & Related papers (2022-10-18T11:37:36Z)
- Towards Debiasing Translation Artifacts [15.991970288297443]
We propose a novel approach to reducing translationese by extending an established bias-removal technique.
We use the Iterative Null-space Projection (INLP) algorithm and show, by measuring classification accuracy before and after debiasing, that translationese is reduced at both the sentence and word levels.
To the best of our knowledge, this is the first study to debias translationese as represented in latent embedding space.
arXiv Detail & Related papers (2022-05-16T21:46:51Z)
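A rough, independent re-implementation sketch of the INLP step used in the entry above (not the authors' released code; the embeddings and labels below are synthetic placeholders):

```python
# Iterative Null-space Projection (INLP): repeatedly fit a linear probe
# that predicts translated (1) vs. original (0) from embeddings, then
# project the embeddings onto the probe's null space so that direction
# is no longer linearly recoverable.
import numpy as np
from sklearn.linear_model import LogisticRegression

def inlp(X, y, n_iters=10):
    X = X.copy()
    d = X.shape[1]
    P = np.eye(d)  # accumulated projection (apply as X_orig @ P.T)
    for _ in range(n_iters):
        probe = LogisticRegression(max_iter=1000).fit(X, y)
        w = probe.coef_ / np.linalg.norm(probe.coef_)  # (1, d) unit direction
        P_i = np.eye(d) - w.T @ w                      # null-space projection
        P = P_i @ P
        X = X @ P_i  # P_i is symmetric, so this removes direction w
    return X, P

# Synthetic placeholders: 200 sentence embeddings of dimension 50.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = rng.integers(0, 2, size=200)  # 1 = translated, 0 = original
X_debiased, P = inlp(X, y)
```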
- Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact on existing cross-lingual models.
In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them.
We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
arXiv Detail & Related papers (2020-04-09T17:54:30Z)