Revisiting Round-Trip Translation for Quality Estimation
- URL: http://arxiv.org/abs/2004.13937v1
- Date: Wed, 29 Apr 2020 03:20:22 GMT
- Title: Revisiting Round-Trip Translation for Quality Estimation
- Authors: Jihyung Moon, Hyunchang Cho, Eunjeong L. Park
- Abstract summary: Quality estimation (QE) is the task of automatically evaluating the quality of translations without human-translated references.
In this paper, we apply semantic embeddings to RTT-based QE.
Our method achieves the highest correlations with human judgments compared to previous WMT 2019 quality estimation metric task submissions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Quality estimation (QE) is the task of automatically evaluating the
quality of translations without human-translated references. Calculating BLEU
between the input sentence and its round-trip translation (RTT) was once
considered a metric for QE; however, it was found to be a poor predictor of
translation quality. Recently, various pre-trained language models have made
breakthroughs in NLP tasks by providing semantically meaningful word and
sentence embeddings. In this paper, we apply semantic embeddings to RTT-based
QE. Our method achieves the highest correlations with human judgments compared
to previous WMT 2019 quality estimation metric task submissions. While the
dependence on a backward translation model can be a drawback of RTT, we observe
that with semantic-level metrics, RTT-based QE is robust to the choice of the
backward translation system. Additionally, the proposed method shows consistent
performance for both SMT and NMT forward translation systems, implying that the
method does not penalize a particular type of model.
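The scoring step reduces to a semantic comparison between the source sentence and its round-trip translation. Below is a minimal sketch of that idea; the `embed` placeholder is an assumption standing in for whatever pre-trained multilingual sentence encoder is used, not the paper's specific model:

```python
import numpy as np

def embed(sentence: str) -> np.ndarray:
    """Hypothetical stand-in for a pre-trained sentence encoder; swap in a
    real multilingual model to obtain meaningful scores."""
    rng = np.random.default_rng(abs(hash(sentence)) % (2**32))
    return rng.standard_normal(512)

def rtt_qe_score(source: str, round_trip: str) -> float:
    """Semantic RTT-based QE: cosine similarity between the source sentence
    and its round-trip translation, replacing surface-level BLEU."""
    u, v = embed(source), embed(round_trip)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Usage: translate source -> target with the forward system, translate back
# with any backward system, then compare the two source-language sentences.
print(rtt_qe_score("The cat sat on the mat.",
                   "The cat was sitting on the mat."))
```

Because the comparison happens at the semantic level, the score is less sensitive to harmless paraphrasing introduced by the backward system, which is the robustness the abstract reports.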
Related papers
- Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model [75.66013048128302]
In this work, we investigate the potential of employing the QE model as the reward model to predict human preferences for feedback training.
We first identify the overoptimization problem during QE-based feedback training, manifested as an increase in reward while translation quality declines.
To address the problem, we adopt a simple yet effective method that uses rules to detect incorrect translations and assigns a penalty term to their reward scores.
arXiv Detail & Related papers (2024-01-23T16:07:43Z)
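A hedged sketch of the rule-based penalty described above; the rules here (empty output, untranslated copy, length blow-up) are illustrative assumptions rather than the paper's exact rule set:

```python
def penalized_reward(qe_score: float, source: str, hypothesis: str,
                     penalty: float = 1.0) -> float:
    """Subtract a fixed penalty when simple rules flag the hypothesis as an
    incorrect translation, curbing overoptimization of the QE reward."""
    flagged = (
        not hypothesis.strip()                        # empty output
        or hypothesis.strip() == source.strip()       # untranslated copy
        or len(hypothesis) > 3 * max(len(source), 1)  # extreme length ratio
    )
    return qe_score - penalty if flagged else qe_score
```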
- Unsupervised Translation Quality Estimation Exploiting Synthetic Data and Pre-trained Multilingual Encoder [17.431776840662273]
We extensively investigate the usefulness of synthetic TQE data and pre-trained multilingual encoders in unsupervised sentence-level TQE.
Our experiments on the WMT20 and WMT21 datasets show that this approach can outperform other unsupervised TQE methods on both high- and low-resource translation directions.
arXiv Detail & Related papers (2023-11-09T03:10:42Z)
- Knowledge-Prompted Estimator: A Novel Approach to Explainable Machine Translation Assessment [20.63045120292095]
Cross-lingual Machine Translation (MT) quality estimation plays a crucial role in evaluating translation performance.
GEMBA, the first MT quality assessment metric based on Large Language Models (LLMs), employs one-step prompting to achieve state-of-the-art (SOTA) results in system-level MT quality estimation.
In this paper, we introduce the Knowledge-Prompted Estimator (KPE), a CoT prompting method that combines three one-step prompting techniques: perplexity, token-level similarity, and sentence-level similarity.
arXiv Detail & Related papers (2023-06-13T01:18:32Z)
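For intuition, KPE's three signals can be folded into a single quality score; the normalization and weights below are assumptions for illustration, not the paper's prompt-based aggregation:

```python
def combined_quality(perplexity: float, token_similarity: float,
                     sentence_similarity: float,
                     weights=(0.2, 0.4, 0.4)) -> float:
    """Blend a fluency signal (derived from perplexity) with token- and
    sentence-level similarity; all three inputs are assumed precomputed."""
    fluency = 1.0 / (1.0 + perplexity)  # squash perplexity into (0, 1]
    w_fluency, w_token, w_sentence = weights
    return (w_fluency * fluency
            + w_token * token_similarity
            + w_sentence * sentence_similarity)
```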
- Competency-Aware Neural Machine Translation: Can Machine Translation Know its Own Translation Quality? [61.866103154161884]
Neural machine translation (NMT) is often criticized for failing without any awareness of its failures.
We propose a novel competency-aware NMT by extending conventional NMT with a self-estimator.
We show that the proposed method delivers outstanding performance on quality estimation.
arXiv Detail & Related papers (2022-11-25T02:39:41Z)
- Rethink about the Word-level Quality Estimation for Machine Translation from Human Judgement [57.72846454929923]
We create a benchmark dataset, HJQE, in which expert translators directly annotate poorly translated words.
We propose two tag-correcting strategies, namely a tag refinement strategy and a tree-based annotation strategy, to make the TER-based artificial QE corpus closer to HJQE.
The results show that our proposed dataset is more consistent with human judgement and confirm the effectiveness of the proposed tag-correcting strategies.
arXiv Detail & Related papers (2022-09-13T02:37:12Z)
- Measuring Uncertainty in Translation Quality Evaluation (TQE) [62.997667081978825]
This work investigates how to correctly estimate confidence intervals (Brown et al., 2001) depending on the sample size of the translated text.
The methodology we apply draws on Bernoulli Statistical Distribution Modelling (BSDM) and Monte Carlo Sampling Analysis (MCSA).
arXiv Detail & Related papers (2021-11-15T12:09:08Z)
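A minimal sketch of the Bernoulli/Monte Carlo idea above: treat each evaluated segment as a pass/fail trial, simulate many samples of size n, and read an empirical interval off the simulated rates. The true quality rate and sample sizes here are illustrative assumptions:

```python
import random

def mc_confidence_interval(p_true: float, n: int,
                           trials: int = 10_000, alpha: float = 0.05):
    """Empirical (1 - alpha) interval for the observed quality rate of a
    sample of n segments, each judged correct with probability p_true."""
    rates = sorted(
        sum(random.random() < p_true for _ in range(n)) / n
        for _ in range(trials)
    )
    return (rates[int(trials * alpha / 2)],
            rates[int(trials * (1 - alpha / 2)) - 1])

# The interval narrows as the evaluated sample grows.
print(mc_confidence_interval(0.8, n=50))
print(mc_confidence_interval(0.8, n=500))
```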
- Ensemble Fine-tuned mBERT for Translation Quality Estimation [0.0]
In this paper, we discuss our submission to the WMT 2021 QE Shared Task.
Our proposed system is an ensemble of multilingual BERT (mBERT)-based regression models.
It demonstrates comparable performance with respect to Pearson's correlation and beats the baseline system in MAE/RMSE for several language pairs.
arXiv Detail & Related papers (2021-09-08T20:13:06Z)
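The ensembling step of a submission like the one above is simple; a sketch under the assumption that each fine-tuned regressor has already produced per-segment scores (the models themselves are out of scope here):

```python
import numpy as np

def ensemble_predict(model_scores: list) -> np.ndarray:
    """Average the per-segment QE scores of the individual regressors."""
    return np.mean(np.stack(model_scores), axis=0)

def mae(pred: np.ndarray, gold: np.ndarray) -> float:
    return float(np.mean(np.abs(pred - gold)))

def rmse(pred: np.ndarray, gold: np.ndarray) -> float:
    return float(np.sqrt(np.mean((pred - gold) ** 2)))

# Usage: three models' scores for four segments.
scores = [np.array([0.7, 0.4, 0.9, 0.2]),
          np.array([0.6, 0.5, 0.8, 0.3]),
          np.array([0.8, 0.4, 0.9, 0.1])]
print(ensemble_predict(scores))
```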
- Verdi: Quality Estimation and Error Detection for Bilingual Corpora [23.485380293716272]
Verdi is a novel framework for word-level and sentence-level post-editing effort estimation for bilingual corpora.
We exploit the symmetric nature of bilingual corpora and apply model-level dual learning in the NMT predictor.
Our method beats the winner of the competition and outperforms other baseline methods by a wide margin.
arXiv Detail & Related papers (2021-05-31T11:04:13Z)
- Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation [88.78138830698173]
We focus on sequence-level knowledge distillation (SeqKD) from external text-based NMT models.
We train a bilingual E2E-ST model to predict paraphrased transcriptions as an auxiliary task with a single decoder.
arXiv Detail & Related papers (2021-04-13T19:00:51Z)
- Unsupervised Quality Estimation for Neural Machine Translation [63.38918378182266]
Existing approaches require large amounts of expert-annotated data, computation, and time for training.
We devise an unsupervised approach to QE where no training or access to additional resources besides the MT system itself is required.
We achieve strong correlation with human judgments of quality, rivalling state-of-the-art supervised QE models.
arXiv Detail & Related papers (2020-05-21T12:38:06Z)
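One signal this kind of glass-box QE can use is the MT system's own decoder confidence; a minimal sketch, assuming the per-token log-probabilities have already been extracted from the system's output:

```python
def avg_logprob_score(token_logprobs: list) -> float:
    """Length-normalized log-probability of the hypothesis under the MT
    model itself; higher (closer to 0) means the system was more confident."""
    return sum(token_logprobs) / max(len(token_logprobs), 1)

# A single low-probability token drags the sentence-level score down.
print(avg_logprob_score([-0.1, -0.3, -2.5, -0.2]))
```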