Findings of the WMT 2022 Shared Task on Translation Suggestion
- URL: http://arxiv.org/abs/2211.16717v1
- Date: Wed, 30 Nov 2022 03:48:36 GMT
- Title: Findings of the WMT 2022 Shared Task on Translation Suggestion
- Authors: Zhen Yang, Fandong Meng, Yingxue Zhang, Ernan Li and Jie Zhou
- Abstract summary: We report the result of the first edition of the WMT shared task on Translation Suggestion.
The task aims to provide alternatives for specific words or phrases given the entire documents generated by machine translation (MT).
It consists of two sub-tasks, namely, the naive translation suggestion and translation suggestion with hints.
- Score: 63.457874930232926
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We report the result of the first edition of the WMT shared task on
Translation Suggestion (TS). The task aims to provide alternatives for specific
words or phrases given the entire documents generated by machine translation
(MT). It consists of two sub-tasks, namely, the naive translation suggestion and
translation suggestion with hints. The main difference is that hints are
provided in sub-task two, which makes it easier for the model to generate more
accurate suggestions. For sub-task one, we provide corpora for the language
pairs English-German and English-Chinese, while only an English-Chinese corpus
is provided for sub-task two.
We received 92 submissions from 5 participating teams in sub-task one and 6
submissions for sub-task two, most of them covering all of the translation
directions. We used the automatic metric BLEU to evaluate the performance of
each submission.
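As a rough illustration of the task setup and the metric, the sketch below builds a single toy TS instance and scores a hypothetical suggestion with sacrebleu's corpus-level BLEU. The field names, the English-German toy data, and the suggest() stub are illustrative assumptions, not the official WMT22 TS data format; only the choice of BLEU follows the abstract.

```python
# Minimal sketch of a Translation Suggestion (TS) instance and BLEU scoring.
# Toy data and field names are assumptions for illustration only.
import sacrebleu

# One toy instance for the "naive" sub-task: given the source, the full MT
# output, and the span judged incorrect, the system proposes a better
# translation for that span. Sub-task two would additionally supply a hint.
instance = {
    "source": "The committee approved the proposal after a long debate yesterday.",
    "mt_output": "Der Ausschuss hat den Vorschlag nach einer langen Debatte gestern abgelehnt.",
    "incorrect_span": "nach einer langen Debatte gestern abgelehnt",
    "hint": None,  # e.g. a partial target-side constraint in sub-task two
    "reference_suggestion": "nach einer langen Debatte gestern angenommen",
}

def suggest(inst):
    """Stand-in for a TS model; a real system would condition on the source,
    the MT context around the incorrect span, and the hint (if provided)."""
    return "nach einer langen Debatte gestern genehmigt"

# Corpus-level BLEU between the proposed suggestions and the reference
# suggestions (a single segment here, so corpus and sentence BLEU coincide).
hypotheses = [suggest(instance)]
references = [instance["reference_suggestion"]]
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU: {bleu.score:.2f}")
```

In the shared task the aggregation runs over thousands of suggestions per language pair and direction; the single-instance loop above only shows where the hypothesis and reference streams come from.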
Related papers
- SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection [68.858931667807]
Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine.
Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM.
Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine.
arXiv Detail & Related papers (2024-04-22T13:56:07Z) - AAdaM at SemEval-2024 Task 1: Augmentation and Adaptation for Multilingual Semantic Textual Relatedness [16.896143197472114]
This paper presents our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian languages.
We propose using machine translation for data augmentation to address the low-resource challenge of limited training data.
We achieve competitive results in the shared task: our system performs the best among all ranked teams in both subtask A (supervised learning) and subtask C (cross-lingual transfer).
arXiv Detail & Related papers (2024-04-01T21:21:15Z) - Rethinking and Improving Multi-task Learning for End-to-end Speech
Translation [51.713683037303035]
We investigate the consistency between different tasks, considering different times and modules.
We find that the textual encoder primarily facilitates cross-modal conversion, but the presence of noise in speech impedes the consistency between text and speech representations.
We propose an improved multi-task learning (IMTL) approach for the ST task, which bridges the modal gap by mitigating the difference in length and representation.
arXiv Detail & Related papers (2023-11-07T08:48:46Z) - UvA-MT's Participation in the WMT23 General Translation Shared Task [7.4336950563281174]
This paper describes the UvA-MT's submission to the WMT 2023 shared task on general machine translation.
We show that by using one model to handle bidirectional tasks, it is possible to achieve comparable results with that of traditional bilingual translation for both directions.
arXiv Detail & Related papers (2023-10-15T20:49:31Z) - TSMind: Alibaba and Soochow University's Submission to the WMT22
Translation Suggestion Task [16.986003476984965]
This paper describes the joint submission of Alibaba and Soochow University, TSMind, to the WMT 2022 Shared Task on Translation Suggestion.
We follow the paradigm of fine-tuning large-scale pre-trained models on the downstream task.
Considering the task's condition of limited use of training data, we follow the data augmentation strategies proposed by WeTS to boost our TS model performance.
arXiv Detail & Related papers (2022-11-16T15:43:31Z) - Effective Cross-Task Transfer Learning for Explainable Natural Language
Inference with T5 [50.574918785575655]
We compare sequential fine-tuning with a model for multi-task learning in the context of boosting performance on two tasks.
Our results show that while sequential multi-task learning can be tuned to be good at the first of two target tasks, it performs less well on the second and additionally struggles with overfitting.
arXiv Detail & Related papers (2022-10-31T13:26:08Z) - Bridging Cross-Lingual Gaps During Leveraging the Multilingual
Sequence-to-Sequence Pretraining for Text Generation [80.16548523140025]
We extend the vanilla pretrain-finetune pipeline with an extra code-switching restoration task to bridge the gap between the pretraining and fine-tuning stages.
Our approach could narrow the cross-lingual sentence representation distance and improve low-frequency word translation with trivial computational cost.
arXiv Detail & Related papers (2022-04-16T16:08:38Z) - Zero-Shot Information Extraction as a Unified Text-to-Triple Translation [56.01830747416606]
We cast a suite of information extraction tasks into a text-to-triple translation framework.
We formalize the task as a translation between task-specific input text and output triples.
We study the zero-shot performance of this framework on open information extraction.
arXiv Detail & Related papers (2021-09-23T06:54:19Z) - BUT-FIT at SemEval-2020 Task 5: Automatic detection of counterfactual
statements with deep pre-trained language representation models [6.853018135783218]
This paper describes BUT-FIT's submission at SemEval-2020 Task 5: Modelling Causal Reasoning in Language: Detecting Counterfactuals.
The challenge focused on detecting whether a given statement contains a counterfactual.
We found RoBERTa LRM to perform the best in both subtasks.
arXiv Detail & Related papers (2020-07-28T11:16:11Z)