CUNI Submission in WMT22 General Task
- URL: http://arxiv.org/abs/2211.16174v1
- Date: Tue, 29 Nov 2022 13:06:09 GMT
- Title: CUNI Submission in WMT22 General Task
- Authors: Josef Jon, Martin Popel, Ondřej Bojar
- Abstract summary: We present the CUNI-Bergamot submission for the WMT22 General translation task.
Compared to previous work, we measure performance in terms of COMET score and named-entity translation accuracy.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present the CUNI-Bergamot submission for the WMT22 General translation
task. We compete in the English→Czech direction. Our submission further explores
block backtranslation techniques. Compared to previous work, we measure performance
in terms of COMET score and named-entity translation accuracy. We evaluate the
performance of MBR decoding compared to traditional mixed backtranslation training
and show a possible synergy when using both techniques simultaneously. The results
show that both approaches are effective means of improving translation quality and
that they yield even better results when combined.
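MBR decoding, mentioned above, replaces the usual highest-probability output with the sampled candidate that maximizes expected utility against the rest of the sample pool. A minimal sketch, assuming sacrebleu's chrF as the utility function (a stand-in here; the paper's evaluation focuses on COMET):

```python
from sacrebleu.metrics import CHRF

def mbr_decode(candidates):
    """Return the candidate with the highest average utility (chrF)
    against all other candidates, approximating the expected utility
    under the model distribution."""
    if len(candidates) < 2:
        return candidates[0]
    chrf = CHRF()
    best, best_score = None, float("-inf")
    for hyp in candidates:
        pseudo_refs = [c for c in candidates if c is not hyp]
        score = sum(chrf.sentence_score(hyp, [ref]).score
                    for ref in pseudo_refs) / len(pseudo_refs)
        if score > best_score:
            best, best_score = hyp, score
    return best

# The candidate pool would normally come from sampling the NMT model.
samples = ["Ahoj světe!", "Ahoj, světe!", "Nazdar, světe."]
print(mbr_decode(samples))
```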
Related papers
- Predicting Word Similarity in Context with Referential Translation Machines
We identify the similarity between two words in English by casting the task as machine translation performance prediction (MTPP).
We use referential translation machines (RTMs), which allow a common representation of training and test sets.
RTMs can achieve top results in the Graded Word Similarity in Context (GWSC) task.
arXiv Detail & Related papers (2024-07-07T09:36:41Z)
- UvA-MT's Participation in the WMT23 General Translation Shared Task
This paper describes UvA-MT's submission to the WMT 2023 shared task on general machine translation.
We show that by using one model to handle bidirectional tasks, it is possible to achieve results comparable to those of traditional bilingual translation for both directions.
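For context, the standard way to let one model serve both directions is to tag each source sentence with its target language; a minimal sketch, where the <2xx> token format is an assumption rather than UvA-MT's exact scheme:

```python
def add_direction_tag(src_sentence: str, tgt_lang: str) -> str:
    """Prepend a target-language token so a single model can
    translate in both directions. The <2xx> format is illustrative."""
    return f"<2{tgt_lang}> {src_sentence}"

# One model, two directions, selected by the tag (language pair illustrative):
print(add_direction_tag("Hello world", "cs"))   # English -> Czech
print(add_direction_tag("Ahoj světe", "en"))    # Czech -> English
```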
arXiv Detail & Related papers (2023-10-15T20:49:31Z)
- Large Language Models Are State-of-the-Art Evaluators of Translation Quality
GEMBA is a GPT-based metric for the assessment of translation quality.
We investigate nine versions of GPT models, including ChatGPT and GPT-4.
Our method achieves state-of-the-art accuracy in both reference-based and reference-free modes when compared to MQM-based human labels.
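In essence, such a metric prompts a GPT model for a direct-assessment score. A rough sketch using the openai Python client; the model name is a placeholder and the prompt paraphrases, rather than reproduces, the GEMBA templates:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def gemba_style_score(source, translation, src_lang="English", tgt_lang="Czech"):
    """Ask an LLM for a direct-assessment quality score in [0, 100].
    Assumes the model replies with a bare number, as the prompt requests."""
    prompt = (
        f"Score the following translation from {src_lang} to {tgt_lang} "
        f"on a continuous scale from 0 to 100, where 0 means no meaning "
        f"preserved and 100 means a perfect translation. "
        f"Reply with the score only.\n\n"
        f"{src_lang} source: {source}\n"
        f"{tgt_lang} translation: {translation}\n"
        f"Score:"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; the paper compares nine GPT variants
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return float(resp.choices[0].message.content.strip())
```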
arXiv Detail & Related papers (2023-02-28T12:23:48Z)
- Neural Machine Translation with Contrastive Translation Memories
Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios.
We propose a new retrieval-augmented NMT model that contrastively models retrieved translation memories which are holistically similar to the source sentence.
In the training phase, a Multi-TM contrastive learning objective is introduced to learn the salient features of each TM with respect to the target sentence.
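A minimal sketch of what such a Multi-TM contrastive objective can look like, assuming the retrieved TMs and the target sentence are already encoded as vectors; the InfoNCE form, the temperature, and the convention that index 0 is the positive TM are assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def multi_tm_contrastive_loss(tm_vecs, tgt_vec, temperature=0.1):
    """InfoNCE-style loss: pull the most relevant TM encoding toward the
    target-sentence encoding and push the remaining TMs away.
    tm_vecs: (num_tms, dim) encodings of retrieved translation memories,
             with index 0 assumed to be the positive (most similar) TM.
    tgt_vec: (dim,) encoding of the target sentence."""
    tm_vecs = F.normalize(tm_vecs, dim=-1)
    tgt_vec = F.normalize(tgt_vec, dim=-1)
    logits = (tm_vecs @ tgt_vec / temperature).unsqueeze(0)  # (1, num_tms)
    labels = torch.zeros(1, dtype=torch.long)                # positive at index 0
    return F.cross_entropy(logits, labels)

# Usage with random stand-in encodings:
loss = multi_tm_contrastive_loss(torch.randn(4, 256), torch.randn(256))
```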
arXiv Detail & Related papers (2022-12-06T17:10:17Z)
- Alibaba-Translate China's Submission for WMT 2022 Quality Estimation Shared Task
We present our submission, named UniTE, to the sentence-level MQM benchmark of the Quality Estimation Shared Task.
Specifically, our systems employ the UniTE framework, which combines three types of input formats during training with a pre-trained language model.
Results show that our models reach the 1st overall ranking in the Multilingual and English-Russian settings, and the 2nd overall ranking in the English-German and Chinese-English settings.
arXiv Detail & Related papers (2022-10-18T08:55:27Z)
- Rethink about the Word-level Quality Estimation for Machine Translation from Human Judgement
We create a benchmark dataset, HJQE, in which expert translators directly annotate poorly translated words.
We propose two tag-correcting strategies, namely a tag refinement strategy and a tree-based annotation strategy, to make the TER-based artificial QE corpus closer to HJQE.
The results show that our proposed dataset is more consistent with human judgement and confirm the effectiveness of the proposed tag-correcting strategies.
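For context, TER-based artificial QE corpora tag each machine-translated word OK or BAD according to an edit alignment against a (pseudo) post-edited reference. A simplified sketch, using difflib in place of a real TER aligner (which additionally handles block shifts):

```python
from difflib import SequenceMatcher

def ter_style_tags(mt_tokens, pe_tokens):
    """Tag MT tokens OK if they align unchanged to the post-edited
    reference, BAD otherwise. Real pipelines use a TER aligner with
    shift operations; difflib is a simplified stand-in."""
    tags = ["BAD"] * len(mt_tokens)
    matcher = SequenceMatcher(a=mt_tokens, b=pe_tokens, autojunk=False)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            tags[i] = "OK"
    return tags

print(ter_style_tags("the cat sat on mat".split(),
                     "the cat sat on the mat".split()))
# ['OK', 'OK', 'OK', 'OK', 'OK']  (an inserted word does not flag MT tokens)
```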
arXiv Detail & Related papers (2022-09-13T02:37:12Z)
- UniTE: Unified Translation Evaluation
UniTE is the first unified framework able to handle all three evaluation tasks: source-only, reference-only, and source-reference-combined evaluation.
We test our framework on the WMT 2019 Metrics and WMT 2020 Quality Estimation benchmarks.
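The three tasks differ only in what is paired with the hypothesis at input time. A schematic sketch; the separator token and concatenation order are assumptions, not UniTE's exact preprocessing:

```python
def unite_style_inputs(hyp: str, src: str, ref: str) -> dict:
    """Build the three UniTE-style input formats for one hypothesis:
    source-only, reference-only, and source+reference combined.
    The "</s>" separator is illustrative."""
    return {
        "src_only": f"{hyp} </s> {src}",              # quality estimation setting
        "ref_only": f"{hyp} </s> {ref}",              # reference-based setting
        "src_ref":  f"{hyp} </s> {src} </s> {ref}",   # combined setting
    }

inputs = unite_style_inputs("Ahoj světe!", "Hello world!", "Ahoj, světe!")
```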
arXiv Detail & Related papers (2022-04-28T08:35:26Z)
- The JHU-Microsoft Submission for WMT21 Quality Estimation Shared Task
This paper presents the JHU-Microsoft joint submission for WMT 2021 quality estimation shared task.
We participate only in Task 2 (post-editing effort estimation) of the shared task, focusing on target-side word-level quality estimation.
We demonstrate the competitiveness of our system compared to the widely adopted OpenKiwi-XLM baseline.
arXiv Detail & Related papers (2021-09-17T19:13:31Z)
- Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings
We describe an unsupervised method to create pseudo-parallel corpora for machine translation (MT) from unaligned text.
We use multilingual BERT to create source and target sentence embeddings for nearest-neighbor search and adapt the model via self-training.
We validate our technique by extracting parallel sentence pairs on the BUCC 2017 bitext mining task and observe up to a 24.5 point increase (absolute) in F1 scores over previous unsupervised methods.
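The mining step reduces to nearest-neighbor search over sentence embeddings of the two monolingual corpora. A compact sketch over precomputed embeddings, with mutual best matches and a plain cosine threshold standing in for the paper's scoring and self-training loop:

```python
import numpy as np

def mine_bitext(src_embs, tgt_embs, threshold=0.8):
    """Return (src_idx, tgt_idx, score) for mutual nearest neighbors
    whose cosine similarity exceeds the threshold.
    src_embs: (n_src, dim), tgt_embs: (n_tgt, dim)."""
    src = src_embs / np.linalg.norm(src_embs, axis=1, keepdims=True)
    tgt = tgt_embs / np.linalg.norm(tgt_embs, axis=1, keepdims=True)
    sim = src @ tgt.T                       # (n_src, n_tgt) cosine matrix
    fwd = sim.argmax(axis=1)                # best target for each source
    bwd = sim.argmax(axis=0)                # best source for each target
    pairs = []
    for i, j in enumerate(fwd):
        if bwd[j] == i and sim[i, j] > threshold:  # keep mutual best matches
            pairs.append((i, int(j), float(sim[i, j])))
    return pairs
```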
arXiv Detail & Related papers (2020-10-15T14:04:03Z)
- SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task
We participated in four translation directions of three language pairs: English-Chinese, English-Polish, and German-Upper Sorbian.
Based on different conditions of language pairs, we have experimented with diverse neural machine translation (NMT) techniques.
In our submissions, the primary systems won first place in the English→Chinese, Polish→English, and German→Upper Sorbian translation directions.
arXiv Detail & Related papers (2020-10-11T00:40:05Z)
- Learning to Evaluate Translation Beyond English: BLEURT Submissions to the WMT Metrics 2020 Shared Task
This paper describes our contribution to the WMT 2020 Metrics Shared Task.
We make several submissions based on BLEURT, a metric based on transfer learning.
We show how to combine BLEURT's predictions with those of YiSi and use alternative reference translations to enhance performance.
arXiv Detail & Related papers (2020-10-08T23:16:26Z)