Automatic Machine Translation Evaluation in Many Languages via Zero-Shot
Paraphrasing
- URL: http://arxiv.org/abs/2004.14564v2
- Date: Tue, 27 Oct 2020 23:54:02 GMT
- Title: Automatic Machine Translation Evaluation in Many Languages via Zero-Shot
Paraphrasing
- Authors: Brian Thompson and Matt Post
- Abstract summary: We frame the task of machine translation evaluation as one of scoring machine translation output with a sequence-to-sequence paraphraser.
We propose training the paraphraser as a multilingual NMT system, treating paraphrasing as a zero-shot translation task.
Our method is simple and intuitive, and does not require human judgements for training.
- Score: 11.564158965143418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We frame the task of machine translation evaluation as one of scoring machine
translation output with a sequence-to-sequence paraphraser, conditioned on a
human reference. We propose training the paraphraser as a multilingual NMT
system, treating paraphrasing as a zero-shot translation task (e.g., Czech to
Czech). This results in the paraphraser's output mode being centered around a
copy of the input sequence, which represents the best case scenario where the
MT system output matches a human reference. Our method is simple and intuitive,
and does not require human judgements for training. Our single model (trained
in 39 languages) outperforms or statistically ties with all prior metrics on
the WMT 2019 segment-level shared metrics task in all languages (excluding
Gujarati where the model had no training data). We also explore using our model
for the task of quality estimation as a metric--conditioning on the source
instead of the reference--and find that it significantly outperforms every
submission to the WMT 2019 shared task on quality estimation in every language
pair.
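
A minimal sketch of this scoring recipe, assuming the public facebook/m2m100_418M checkpoint as a stand-in for the authors' 39-language paraphraser; the released Prism model and its exact scoring details differ, and paraphrase_score is a hypothetical helper, not the paper's API:

```python
# Illustrative sketch only: facebook/m2m100_418M is a public multilingual NMT
# model used here as a stand-in paraphraser; the authors' released model and
# exact scoring differ. paraphrase_score is a hypothetical helper.
import torch
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M").eval()

def paraphrase_score(conditioning_text: str, hypothesis: str,
                     src_lang: str, tgt_lang: str) -> float:
    """Average per-token log-probability of force-decoding `hypothesis`,
    conditioned on `conditioning_text` (reference or source)."""
    tokenizer.src_lang = src_lang
    tokenizer.tgt_lang = tgt_lang
    enc = tokenizer(conditioning_text, text_target=hypothesis, return_tensors="pt")
    with torch.no_grad():
        # The seq2seq loss is mean token cross-entropy over the labels;
        # negating it gives the average log-probability of the hypothesis.
        out = model(input_ids=enc.input_ids,
                    attention_mask=enc.attention_mask,
                    labels=enc.labels)
    return -out.loss.item()

# Reference-based metric: zero-shot Czech->Czech paraphrase scoring.
ref = "Kočka spí na gauči."
hyp = "Kočka spí na pohovce."
print(paraphrase_score(ref, hyp, src_lang="cs", tgt_lang="cs"))

# Quality estimation as a metric: condition on the source instead.
src = "The cat is sleeping on the couch."
print(paraphrase_score(src, hyp, src_lang="en", tgt_lang="cs"))
```

When the hypothesis is identical to the reference, the forced tokens sit at the paraphraser's output mode, matching the best-case scenario described above; passing the source sentence instead of the reference turns the same scorer into a quality-estimation metric.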
Related papers
- Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, consists of adapting a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z)
- Revisiting Machine Translation for Cross-lingual Classification [91.43729067874503]
Most research in the area focuses on multilingual models rather than on the Machine Translation component.
We show that, by using a stronger MT system and mitigating the mismatch between training on original text and running inference on machine translated text, translate-test can do substantially better than previously assumed.
arXiv Detail & Related papers (2023-05-23T16:56:10Z)
- Statistical Machine Translation for Indic Languages [1.8899300124593648]
This paper describes the development of bilingual Statistical Machine Translation models.
To build the system, the MOSES open-source SMT toolkit is used.
In our experiment, the quality of the translation is evaluated using standard metrics such as BLEU, METEOR, and RIBES.
arXiv Detail & Related papers (2023-01-02T06:23:12Z)
- Extrinsic Evaluation of Machine Translation Metrics [78.75776477562087]
It is unclear if automatic metrics are reliable at distinguishing good translations from bad translations at the sentence level.
We evaluate the segment-level performance of the most widely used MT metrics (chrF, COMET, BERTScore, etc.) on three downstream cross-lingual tasks.
Our experiments demonstrate that all metrics exhibit negligible correlation with the extrinsic evaluation of the downstream outcomes (a toy sketch of this kind of segment-level correlation appears after this list).
arXiv Detail & Related papers (2022-12-20T14:39:58Z)
- Alibaba-Translate China's Submission for WMT 2022 Metrics Shared Task [61.34108034582074]
We build our system on the core idea of UNITE (Unified Translation Evaluation).
During the pre-training phase, we continue pre-training UNITE on pseudo-labeled data examples.
During the fine-tuning phase, we use both Direct Assessment (DA) and Multidimensional Quality Metrics (MQM) data from past years' WMT competitions.
arXiv Detail & Related papers (2022-10-18T08:51:25Z)
- UniTE: Unified Translation Evaluation [63.58868113074476]
UniTE is the first unified framework able to handle all three evaluation tasks: source-only, reference-only, and source-reference-combined evaluation.
We evaluate our framework on the WMT 2019 Metrics and WMT 2020 Quality Estimation benchmarks.
arXiv Detail & Related papers (2022-04-28T08:35:26Z)
- IsometricMT: Neural Machine Translation for Automatic Dubbing [9.605781943224251]
This work introduces a self-learning approach that allows a transformer model to directly learn to generate outputs that closely match the source length.
We report results on four language pairs with a publicly available benchmark based on TED Talk data.
arXiv Detail & Related papers (2021-12-16T08:03:20Z)
- Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource languages, as well as transfer to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z)
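
To make the segment-level correlation analysis from the Extrinsic Evaluation entry above concrete, here is a toy sketch assuming sacrebleu for sentence-level chrF and SciPy for Kendall's tau; all sentences and outcome values below are invented for illustration:

```python
# Toy sketch of segment-level metric meta-evaluation: score each hypothesis
# with a sentence-level metric (chrF via sacrebleu) and correlate the scores
# with per-segment downstream outcomes. All data below is invented.
from sacrebleu.metrics import CHRF
from scipy.stats import kendalltau

chrf = CHRF()
hypotheses = [
    "The cat sat on the mat.",
    "Cat the mat on sat.",
    "A cat was sitting on the mat.",
    "The dog ran in the park.",
]
references = ["The cat sat on the mat."] * 4
# Hypothetical per-segment downstream success (e.g., QA accuracy per segment).
downstream = [1.0, 0.0, 1.0, 0.0]

scores = [chrf.sentence_score(h, [r]).score for h, r in zip(hypotheses, references)]
tau, p = kendalltau(scores, downstream)
print(f"segment-level Kendall tau = {tau:.3f} (p = {p:.3f})")
```

A high tau would mean the metric's per-segment ranking tracks downstream usefulness; the finding above is that, for real tasks, this correlation is negligible.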
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.