Learning to Detect Unacceptable Machine Translations for Downstream Tasks
- URL: http://arxiv.org/abs/2005.03925v1
- Date: Fri, 8 May 2020 09:37:19 GMT
- Title: Learning to Detect Unacceptable Machine Translations for Downstream Tasks
- Authors: Meng Zhang, Xin Jiang, Yang Liu, Qun Liu
- Abstract summary: We put machine translation in a cross-lingual pipeline and introduce downstream tasks to define task-specific acceptability of machine translations.
This allows us to leverage parallel data to automatically generate acceptability annotations on a large scale.
We conduct experiments to demonstrate the effectiveness of our framework for a range of downstream tasks and translation models.
- Score: 33.07594909221625
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The field of machine translation has progressed tremendously in recent years.
Even though the translation quality has improved significantly, current systems
are still unable to produce uniformly acceptable machine translations for the
variety of possible use cases. In this work, we put machine translation in a
cross-lingual pipeline and introduce downstream tasks to define task-specific
acceptability of machine translations. This allows us to leverage parallel data
to automatically generate acceptability annotations on a large scale, which in
turn help to learn acceptability detectors for the downstream tasks. We conduct
experiments to demonstrate the effectiveness of our framework for a range of
downstream tasks and translation models.
Related papers
- Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing [12.843274390224853]
Large Language Models (LLMs) have demonstrated considerable success in various Natural Language Processing tasks.
We show that they have yet to attain state-of-the-art performance in Neural Machine Translation.
We propose adapting LLMs as Automatic Post-Editors (APE) rather than direct translators.
arXiv Detail & Related papers (2023-10-23T12:22:15Z)
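A minimal sketch of the post-editing setup, with a hypothetical `complete` function standing in for any LLM completion API; the prompt wording is an assumption, not the paper's actual template.

```python
# Illustrative sketch of using an LLM as an automatic post-editor (APE).
# `complete` is a placeholder for an LLM completion call; the prompt
# wording is an assumption, not the paper's actual template.

def post_edit(source, draft_translation, complete):
    prompt = (
        "Improve the following machine translation. "
        "Fix any errors while preserving the meaning of the source.\n"
        f"Source: {source}\n"
        f"Draft translation: {draft_translation}\n"
        "Improved translation:"
    )
    return complete(prompt).strip()
```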
- Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- Quality Estimation of Machine Translated Texts based on Direct Evidence from Training Data [0.0]
We show that the parallel corpus used to train the MT system holds direct clues for estimating the quality of the translations it produces.
Our experiments show that this simple and direct method holds promise for quality estimation of translations produced by any purely data-driven machine translation system.
arXiv Detail & Related papers (2023-06-27T11:52:28Z)
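One way to operationalize "direct clues" is to check how well the n-grams of a new source sentence and its translation are jointly attested in the training corpus; the scoring rule below is an assumption for illustration, not the paper's exact method.

```python
# Illustrative sketch: estimate translation quality from direct evidence
# in the MT training corpus. The n-gram attestation rule is an assumption.

def ngrams(tokens, n=2):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def quality_score(source, hypothesis, training_pairs, n=2):
    """Fraction of hypothesis n-grams attested in the target side of a
    training pair whose source side overlaps with the input source."""
    src_ng = ngrams(source.split(), n)
    hyp_ng = ngrams(hypothesis.split(), n)
    if not hyp_ng:
        return 0.0
    supported = set()
    for train_src, train_tgt in training_pairs:
        if src_ng & ngrams(train_src.split(), n):
            supported |= hyp_ng & ngrams(train_tgt.split(), n)
    return len(supported) / len(hyp_ng)
```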
- The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations.
An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z)
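A minimal sketch of one plausible selection scheme, assuming utterances are routed by MT model confidence; the paper's actual selection strategy may differ.

```python
# Illustrative sketch: split translation work between humans and MT.
# Routing by MT confidence is an assumption, not the paper's criterion.

def split_translation_work(utterances, mt_confidence, human_budget):
    """Send the least-confident utterances to human translators, up to
    `human_budget`; machine-translate the rest."""
    ranked = sorted(utterances, key=mt_confidence)
    return ranked[:human_budget], ranked[human_budget:]
```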
- Translate your gibberish: black-box adversarial attack on machine translation systems [0.0]
We present a simple approach to fool state-of-the-art machine translation tools in the task of translation from Russian to English and vice versa.
We show that many online translation tools, such as Google, DeepL, and Yandex, can produce wrong or even offensive translations for nonsensical adversarial input queries.
This vulnerability can mislead users who are learning a new language and degrades the overall experience of using machine translation systems.
arXiv Detail & Related papers (2023-03-20T09:52:52Z)
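This kind of attack surface is easy to probe in a black-box fashion; a minimal sketch, where `translate` stands in for any online MT API and the fluency heuristic is an assumption.

```python
# Illustrative sketch of a black-box gibberish probe against an MT system.
# `translate` is a placeholder for an MT API call.
import random
import string

def random_gibberish(length=40, seed=None):
    rng = random.Random(seed)
    return "".join(rng.choice(string.ascii_lowercase + " ") for _ in range(length))

def probe(translate, num_queries=100):
    """Collect cases where nonsense input yields a fluent-looking output."""
    suspicious = []
    for i in range(num_queries):
        query = random_gibberish(seed=i)
        output = translate(query)
        # Heuristic: a multi-word output for nonsense input is suspicious.
        if output and len(output.split()) >= 3:
            suspicious.append((query, output))
    return suspicious
```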
- Extrinsic Evaluation of Machine Translation Metrics [78.75776477562087]
It is unclear if automatic metrics are reliable at distinguishing good translations from bad translations at the sentence level.
We evaluate the segment-level performance of the most widely used MT metrics (chrF, COMET, BERTScore, etc.) on three downstream cross-lingual tasks.
Our experiments demonstrate that all metrics exhibit negligible correlation with the extrinsic evaluation of the downstream outcomes.
arXiv Detail & Related papers (2022-12-20T14:39:58Z)
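A minimal sketch of this kind of segment-level extrinsic check, assuming per-segment metric scores and binary downstream outcomes have already been computed.

```python
# Illustrative sketch: correlate segment-level MT metric scores with
# downstream task success. Requires scipy.
from scipy.stats import pointbiserialr

def extrinsic_correlation(metric_scores, task_success):
    """metric_scores: per-segment floats (e.g., chrF or COMET scores).
    task_success: per-segment 0/1 downstream outcomes."""
    r, p_value = pointbiserialr(task_success, metric_scores)
    return r, p_value
```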
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
- Modelling Latent Translations for Cross-Lingual Transfer [47.61502999819699]
We propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model.
We evaluate our novel latent translation-based model on a series of multilingual NLU tasks.
We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average.
arXiv Detail & Related papers (2021-07-23T17:11:27Z)
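Treating the translation as a latent variable suggests marginalizing over sampled translations at prediction time; a minimal sketch under that reading, with placeholder components.

```python
# Illustrative sketch: approximate marginalization over latent translations
# by sampling. `sample_translations` and `classifier_probs` are placeholder
# callables, not the paper's actual model components.
import numpy as np

def latent_translation_predict(source, sample_translations, classifier_probs, k=5):
    """Approximate p(label | source) by averaging class probabilities
    over k sampled translations of the source."""
    translations = sample_translations(source, k)
    probs = np.mean([classifier_probs(t) for t in translations], axis=0)
    return int(np.argmax(probs))
```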
- Using Machine Translation to Localize Task Oriented NLG Output [5.770385426429663]
This paper explores localizing task-oriented NLG output by applying machine translation to the English output.
The required quality bar is close to perfection, the range of sentences is extremely narrow, and the sentences are often very different from the ones in the machine translation training data.
We are able to reach the required quality bar by building on existing ideas and adding new ones.
arXiv Detail & Related papers (2021-07-09T15:56:45Z)
- Computer Assisted Translation with Neural Quality Estimation and Automatic Post-Editing [18.192546537421673]
We propose an end-to-end deep learning framework for quality estimation and automatic post-editing of machine translation output.
Our goal is to provide error correction suggestions and to further relieve the burden of human translators through an interpretable model.
arXiv Detail & Related papers (2020-09-19T00:29:00Z)
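A minimal sketch of how QE and APE might be chained at inference time; the threshold gating is an assumption, while the paper itself describes an end-to-end framework.

```python
# Illustrative sketch: gate automatic post-editing on an estimated quality
# score. `qe_score` and `post_edit` are placeholder models.

def qe_ape_pipeline(source, translation, qe_score, post_edit, threshold=0.7):
    score = qe_score(source, translation)
    if score >= threshold:
        return translation, score
    edited = post_edit(source, translation)
    return edited, qe_score(source, edited)
```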
- Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact in existing cross-lingual models.
In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them.
We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
arXiv Detail & Related papers (2020-04-09T17:54:30Z)
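The lexical-overlap effect is straightforward to quantify; a minimal sketch using token-level Jaccard overlap, which is an assumption here, one of several reasonable measures.

```python
# Illustrative sketch: measure how independently translating premise and
# hypothesis changes their lexical overlap. Jaccard overlap is an
# assumption; any overlap statistic would serve.

def lexical_overlap(premise, hypothesis):
    p, h = set(premise.lower().split()), set(hypothesis.lower().split())
    return len(p & h) / len(p | h) if p | h else 0.0

def overlap_shift(original_pairs, translated_pairs):
    """Average change in premise-hypothesis overlap after translation."""
    shifts = [lexical_overlap(*t) - lexical_overlap(*o)
              for o, t in zip(original_pairs, translated_pairs)]
    return sum(shifts) / len(shifts)
```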
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences arising from its use.