It is Not as Good as You Think! Evaluating Simultaneous Machine
Translation on Interpretation Data
- URL: http://arxiv.org/abs/2110.05213v1
- Date: Mon, 11 Oct 2021 12:27:07 GMT
- Title: It is Not as Good as You Think! Evaluating Simultaneous Machine
Translation on Interpretation Data
- Authors: Jinming Zhao, Philip Arthur, Gholamreza Haffari, Trevor Cohn, Ehsan
Shareghi
- Abstract summary: We argue that SiMT systems should be trained and tested on real interpretation data.
Our results highlight a difference of up to 13.83 BLEU when SiMT models are evaluated on translation vs. interpretation data.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most existing simultaneous machine translation (SiMT) systems are trained and
evaluated on offline translation corpora. We argue that SiMT systems should be
trained and tested on real interpretation data. To illustrate this argument, we
propose an interpretation test set and conduct a realistic evaluation of SiMT
trained on offline translations. Our results, on our test set along with three
existing smaller-scale language pairs, highlight a difference of up to 13.83
BLEU when SiMT models are evaluated on translation vs. interpretation
data. In the absence of interpretation training data, we propose a
translation-to-interpretation (T2I) style transfer method which allows
converting existing offline translations into interpretation-style data,
leading to an improvement of up to 2.8 BLEU. However, the evaluation gap remains
notable, calling for constructing large-scale interpretation corpora better
suited for evaluating and developing SiMT systems.
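As a concrete illustration of this evaluation setup, here is a minimal sketch that scores one set of SiMT hypotheses against offline-translation references and against interpretation references with sacrebleu, then reports the gap. The file names are hypothetical placeholders, not the paper's actual data or pipeline.

```python
# Minimal sketch (not the paper's pipeline): score the same SiMT output
# against translation references and interpretation references, then
# report the BLEU gap. File names are hypothetical placeholders.
import sacrebleu

def read_lines(path):
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f]

hypotheses = read_lines("simt_output.txt")                   # decoded SiMT hypotheses
translation_refs = read_lines("refs_translation.txt")        # offline translation references
interpretation_refs = read_lines("refs_interpretation.txt")  # interpretation references

bleu_trans = sacrebleu.corpus_bleu(hypotheses, [translation_refs])
bleu_interp = sacrebleu.corpus_bleu(hypotheses, [interpretation_refs])
print(f"BLEU vs. translations:    {bleu_trans.score:.2f}")
print(f"BLEU vs. interpretations: {bleu_interp.score:.2f}")
print(f"Gap: {bleu_trans.score - bleu_interp.score:.2f}")
```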
Related papers
- Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, consists of adapting a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z)
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- An approach for mistranslation removal from popular dataset for Indic MT Task [5.4755933832880865]
We propose an algorithm to remove mistranslations from the training corpus and evaluate its performance and efficiency.
Two Indic languages (ILs), namely Hindi (HIN) and Odia (ODI), are chosen for the experiment.
The quality of the translations in the experiment is evaluated using standard metrics such as BLEU, METEOR, and RIBES.
arXiv Detail & Related papers (2024-01-12T06:37:19Z)
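As a loose illustration of the mistranslation-removal idea above, the sketch below filters a parallel corpus with a generic length-ratio heuristic; this is an assumed stand-in, not the paper's actual algorithm.

```python
# Generic parallel-corpus filter (an assumed heuristic, not the paper's
# algorithm): drop sentence pairs whose target/source length ratio is
# implausible, a common signal of misaligned or mistranslated segments.
def filter_corpus(pairs, min_ratio=0.5, max_ratio=2.0):
    kept = []
    for src, tgt in pairs:
        src_len, tgt_len = len(src.split()), len(tgt.split())
        if src_len == 0 or tgt_len == 0:
            continue  # drop empty segments outright
        if min_ratio <= tgt_len / src_len <= max_ratio:
            kept.append((src, tgt))
    return kept
```

Real cleaning pipelines typically combine several such signals (length ratio, alignment scores, language identification) before retraining and re-scoring with metrics such as BLEU, METEOR, and RIBES.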
- Rethinking Round-Trip Translation for Machine Translation Evaluation [44.83568796515321]
We report the surprising finding that round-trip translation can be used for automatic evaluation without references.
We demonstrate that this rectification is overdue, as round-trip translation could benefit multiple machine translation evaluation tasks.
arXiv Detail & Related papers (2022-09-15T15:06:20Z)
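To make the round-trip idea above concrete, here is a minimal reference-free scoring sketch; `forward_translate` and `backward_translate` are hypothetical stand-ins for any forward/backward MT systems, and this is not the paper's exact rectified scheme.

```python
# Sketch of reference-free round-trip evaluation (assumed simplification,
# not the paper's exact method): translate source -> target -> source and
# score the reconstruction against the original source text.
import sacrebleu

def round_trip_bleu(sources, forward_translate, backward_translate):
    targets = [forward_translate(s) for s in sources]       # source -> target
    round_trips = [backward_translate(t) for t in targets]  # target -> source
    # No target-side references needed: the sources act as references.
    return sacrebleu.corpus_bleu(round_trips, [sources]).score
```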
- Original or Translated? A Causal Analysis of the Impact of Translationese on Machine Translation Performance [31.47795931399995]
Human-translated text displays features distinct from those of naturally written text in the same language.
We find that existing work on translationese neglects some important factors and that its conclusions are mostly correlational rather than causal.
We show that these two factors have a large causal effect on MT performance.
arXiv Detail & Related papers (2022-05-04T19:17:55Z)
- ChrEnTranslate: Cherokee-English Machine Translation Demo with Quality Estimation and Corrective Feedback [70.5469946314539]
ChrEnTranslate is an online machine translation demonstration system for translation between English and Cherokee, an endangered language.
It supports both statistical and neural translation models and provides quality estimation to inform users of reliability.
arXiv Detail & Related papers (2021-07-30T17:58:54Z)
- Translating the Unseen? Yorùbá → English MT in Low-Resource, Morphologically-Unmarked Settings [8.006185289499049]
Translating between languages where certain features are marked morphologically in one but absent or marked contextually in the other is an important test case for machine translation.
In this work, we perform fine-grained analysis on how an SMT system compares with two NMT systems when translating bare nouns in Yorùbá into English.
arXiv Detail & Related papers (2021-03-07T01:24:09Z)
- Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact on existing cross-lingual models.
In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them.
We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
arXiv Detail & Related papers (2020-04-09T17:54:30Z)
- Cross-lingual Supervision Improves Unsupervised Neural Machine Translation [97.84871088440102]
We introduce a multilingual unsupervised NMT framework that leverages weakly supervised signals from high-resource language pairs for zero-resource translation directions.
The method significantly improves translation quality by more than 3 BLEU on six benchmark unsupervised translation directions.
arXiv Detail & Related papers (2020-04-07T05:46:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.