As Little as Possible, as Much as Necessary: Detecting Over- and
Undertranslations with Contrastive Conditioning
- URL: http://arxiv.org/abs/2203.01927v1
- Date: Thu, 3 Mar 2022 18:59:02 GMT
- Authors: Jannis Vamvas and Rico Sennrich
- Abstract summary: We propose a method for detecting superfluous and untranslated words in neural machine translation.
We compare the likelihood of a full sequence under a translation model to the likelihood of its parts, given the corresponding source or target sequence.
This makes it possible to pinpoint superfluous words in the translation and untranslated words in the source, even in the absence of a reference translation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Omission and addition of content are typical issues in neural machine
translation. We propose a method for detecting such phenomena with
off-the-shelf translation models. Using contrastive conditioning, we compare
the likelihood of a full sequence under a translation model to the likelihood
of its parts, given the corresponding source or target sequence. This makes it
possible to pinpoint superfluous words in the translation and untranslated words
in the source, even in the absence of a reference translation. The accuracy of our
method is comparable to a supervised method that requires a custom quality
estimation model.
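For illustration, the following is a minimal sketch of how contrastive conditioning can be run with an off-the-shelf translation model from Hugging Face Transformers. The model name, the whitespace-level word deletions, and the use of raw (unnormalized) log-likelihood differences are assumptions made for the example, not details taken from the paper.

```python
# Minimal sketch of contrastive conditioning (assumptions noted above).
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "Helsinki-NLP/opus-mt-en-de"  # assumed example model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME).eval()

@torch.no_grad()
def log_likelihood(source: str, target: str) -> float:
    """Total log-probability of `target` given `source` under the model."""
    inputs = tokenizer(source, return_tensors="pt")
    labels = tokenizer(text_target=target, return_tensors="pt").input_ids
    loss = model(**inputs, labels=labels).loss  # mean token-level cross-entropy
    return -loss.item() * labels.shape[1]       # scale back to a sum

def overtranslation_scores(source: str, translation: str):
    """A translation word is a candidate addition if deleting it makes the
    remaining translation *more* likely given the source (positive score)."""
    words = translation.split()
    full = log_likelihood(source, translation)
    return [(w, log_likelihood(source, " ".join(words[:i] + words[i + 1:])) - full)
            for i, w in enumerate(words)]

def undertranslation_scores(source: str, translation: str):
    """A source word is a candidate omission if deleting it leaves the
    translation's likelihood essentially unchanged or higher."""
    words = source.split()
    full = log_likelihood(source, translation)
    return [(w, log_likelihood(" ".join(words[:i] + words[i + 1:]), translation) - full)
            for i, w in enumerate(words)]
```

In practice one would compare sequences of different lengths more carefully (e.g., with length normalization) and delete linguistically meaningful units rather than whitespace-separated tokens, but the core idea is the same: score each part by how the conditional likelihood changes when it is removed.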
Related papers
- Translating away Translationese without Parallel Data
Translated texts exhibit systematic linguistic differences compared to original texts in the same language.
In this paper, we explore a novel approach to reducing translationese in translated texts: translation-based style transfer.
We show how we can eliminate the need for parallel validation data by combining the self-supervised loss with an unsupervised loss.
arXiv Detail & Related papers (2023-10-28T22:11:25Z)
- Towards Fine-Grained Information: Identifying the Type and Location of Translation Errors
Existing approaches cannot consider error position and type simultaneously.
We build an FG-TED model to predict both addition and omission errors.
Experiments show that our model can identify error type and position concurrently and achieves state-of-the-art results.
arXiv Detail & Related papers (2023-02-17T16:20:33Z)
- HanoiT: Enhancing Context-aware Translation via Selective Context
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
Irrelevant or trivial words may introduce noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
arXiv Detail & Related papers (2023-01-17T12:07:13Z)
- Principled Paraphrase Generation with Parallel Corpora
We formalize the implicit similarity function induced by round-trip machine translation (sketched after this entry).
We show that it is susceptible to non-paraphrase pairs sharing a single ambiguous translation.
We design an alternative similarity metric that mitigates this issue.
arXiv Detail & Related papers (2022-05-24T17:22:42Z)
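One plausible formalization of the similarity implicitly used by round-trip translation (an interpretation for illustration, not a formula quoted from that paper's abstract here) marginalizes over pivot translations z; a single ambiguous pivot shared by two non-paraphrases then inflates their score:

```latex
% Sketch: implicit round-trip similarity, marginalizing over pivots z.
% If one ambiguous z* has high P(z*|x) and high P(y|z*), then sim(x, y)
% is large even when x and y are not paraphrases.
\[
  \mathrm{sim}(x, y) \;=\; \sum_{z} P(z \mid x)\, P(y \mid z)
\]
```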
- Modelling Latent Translations for Cross-Lingual Transfer
We propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model; a generic sketch of this idea follows the entry.
We evaluate our novel latent translation-based model on a series of multilingual NLU tasks.
We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average.
arXiv Detail & Related papers (2021-07-23T17:11:27Z)
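One common way to fuse a translate-then-classify pipeline into a single model, as the entry above describes, is to treat the translation as a latent variable and approximate the marginal by sampling. The sketch below illustrates that general idea with off-the-shelf components; the model names and the sampling-based approximation are assumptions, not the paper's actual architecture (which is trained end-to-end).

```python
# Sketch: classification with the translation as a latent variable,
# p(y|x) ~= (1/K) * sum_t p(y|t), with translations t sampled from p(t|x).
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

MT_NAME = "Helsinki-NLP/opus-mt-de-en"        # assumed example model
mt_tok = AutoTokenizer.from_pretrained(MT_NAME)
mt = AutoModelForSeq2SeqLM.from_pretrained(MT_NAME).eval()
clf = pipeline("text-classification")         # default English classifier

@torch.no_grad()
def predict(source_text: str, k: int = 8) -> str:
    inputs = mt_tok(source_text, return_tensors="pt")
    out = mt.generate(**inputs, do_sample=True,
                      num_return_sequences=k, max_new_tokens=64)
    translations = mt_tok.batch_decode(out, skip_special_tokens=True)
    scores: dict[str, float] = {}
    for r in clf(translations):               # top label + prob per sample
        scores[r["label"]] = scores.get(r["label"], 0.0) + r["score"] / k
    return max(scores, key=scores.get)
```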
- End-to-End Lexically Constrained Machine Translation for Morphologically Rich Languages
We investigate mechanisms to allow neural machine translation to infer the correct word inflection given lemmatized constraints.
Our experiments on the English-Czech language pair show that this approach improves the translation of constrained terms in both automatic and manual evaluation.
arXiv Detail & Related papers (2021-06-23T13:40:13Z)
- Lexically Cohesive Neural Machine Translation with Copy Mechanism
We add a copy mechanism to a context-aware neural machine translation model to allow copying words from previous outputs (a generic formulation is sketched after this entry).
We conduct experiments on Japanese-to-English translation using an evaluation dataset for discourse translation.
arXiv Detail & Related papers (2020-10-11T08:39:02Z)
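A standard way to realize such a copy mechanism is the pointer-generator formulation, in which the output distribution mixes vocabulary generation with copying tokens from the context via attention. This generic formulation is shown below for illustration; the paper's exact variant may differ.

```latex
% Generic pointer-generator style copy mechanism: with gate p_gen and
% attention weights a_i over context tokens w_i, the probability of
% emitting word w mixes generation and copying.
\[
  P(w) \;=\; p_{\mathrm{gen}}\, P_{\mathrm{vocab}}(w)
  \;+\; \bigl(1 - p_{\mathrm{gen}}\bigr) \sum_{i \,:\, w_i = w} a_i
\]
```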
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training a neural machine translation (NMT) model to predict both the target translation and the surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)