UniTE: Unified Translation Evaluation
- URL: http://arxiv.org/abs/2204.13346v1
- Date: Thu, 28 Apr 2022 08:35:26 GMT
- Title: UniTE: Unified Translation Evaluation
- Authors: Yu Wan, Dayiheng Liu, Baosong Yang, Haibo Zhang, Boxing Chen, Derek F.
Wong, Lidia S. Chao
- Abstract summary: UniTE is the first unified framework able to handle all three evaluation tasks.
We evaluate our framework on the WMT 2019 Metrics and WMT 2020 Quality Estimation benchmarks.
- Score: 63.58868113074476
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Translation quality evaluation plays a crucial role in machine translation.
According to the input format, it is mainly divided into three tasks, i.e.,
reference-only, source-only, and source-reference-combined. Recent methods,
despite their promising results, are specifically designed and optimized for one
of them. This limits their convenience and overlooks the commonalities among the
tasks. In this paper, we propose UniTE, the first unified framework capable of
handling all three evaluation tasks. Concretely, we propose monotonic regional
attention to control the interaction among input segments, and unified
pretraining to better adapt the model to multi-task learning. We evaluate our
framework on the WMT 2019 Metrics and WMT 2020 Quality Estimation benchmarks.
Extensive analyses show that our single model can universally surpass various
state-of-the-art or winning methods across tasks. Both source code and
associated models are available at https://github.com/NLP2CT/UniTE.
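The core mechanism of the abstract, attention restricted by input segment, can be illustrated with a toy mask builder. This is a minimal sketch, not the paper's implementation: the function name, the three-segment layout (hypothesis, source, reference), and the particular pattern of allowed segment pairs are all illustrative assumptions; the actual monotonic regional attention in UniTE may use a different interaction pattern.

```python
import numpy as np

def regional_attention_mask(seg_lens, allowed_pairs):
    """Build a boolean attention mask over concatenated input segments.

    seg_lens: lengths of the segments, e.g. [len_hyp, len_src, len_ref].
    allowed_pairs: set of (query_segment, key_segment) index pairs
        whose tokens are allowed to attend to each other.
    Returns a (total, total) matrix where True means attention is allowed.
    """
    total = sum(seg_lens)
    starts = np.cumsum([0] + list(seg_lens))  # segment boundary offsets
    mask = np.zeros((total, total), dtype=bool)
    for q, k in allowed_pairs:
        # Unblock the rectangular block of (query, key) token positions
        # belonging to this segment pair.
        mask[starts[q]:starts[q + 1], starts[k]:starts[k + 1]] = True
    return mask

# Hypothetical pattern: each segment attends to itself and to the
# hypothesis (segment 0), while source (1) and reference (2) segments
# do not interact with each other directly.
pairs = {(0, 0), (1, 1), (2, 2), (0, 1), (0, 2), (1, 0), (2, 0)}
mask = regional_attention_mask([2, 2, 2], pairs)
```

Such a mask would typically be passed to a Transformer encoder (e.g. via an additive `-inf` mask on the attention logits) so that one set of shared weights can serve reference-only, source-only, and source-reference-combined inputs by simply changing which segments are present and which pairs are unblocked.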
Related papers
- Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z) - Alibaba-Translate China's Submission for WMT 2022 Quality Estimation Shared Task [80.22825549235556]
We present UniTE, our submission to the sentence-level MQM benchmark at the Quality Estimation Shared Task.
Specifically, our systems employ the UniTE framework, which combines three types of input formats during training with a pre-trained language model.
Results show that our models rank 1st overall in the Multilingual and English-Russian settings, and 2nd overall in the English-German and Chinese-English settings.
arXiv Detail & Related papers (2022-10-18T08:55:27Z) - Alibaba-Translate China's Submission for WMT 2022 Metrics Shared Task [61.34108034582074]
We build our system based on the core idea of UNITE (Unified Translation Evaluation).
During the model pre-training phase, we first use pseudo-labeled data examples to continually pre-train UNITE.
During the fine-tuning phase, we use both Direct Assessment (DA) and Multidimensional Quality Metrics (MQM) data from past years' WMT competitions.
arXiv Detail & Related papers (2022-10-18T08:51:25Z) - Towards a Unified Multi-Dimensional Evaluator for Text Generation [101.47008809623202]
We propose UniEval, a unified multi-dimensional evaluator for Natural Language Generation (NLG).
We re-frame NLG evaluation as a Boolean Question Answering (QA) task, and by guiding the model with different questions, we can use one evaluator to evaluate from multiple dimensions.
Experiments on three typical NLG tasks show that UniEval correlates substantially better with human judgments than existing metrics.
arXiv Detail & Related papers (2022-10-13T17:17:03Z) - FRMT: A Benchmark for Few-Shot Region-Aware Machine Translation [64.9546787488337]
We present FRMT, a new dataset and evaluation benchmark for Few-shot Region-aware Machine Translation.
The dataset consists of professional translations from English into two regional variants each of Portuguese and Mandarin Chinese.
arXiv Detail & Related papers (2022-10-01T05:02:04Z) - Vision-Language Pre-Training for Multimodal Aspect-Based Sentiment Analysis [25.482853330324748]
Multimodal Aspect-Based Sentiment Analysis (MABSA) has attracted increasing attention in recent years.
Previous approaches either (i) use separately pre-trained visual and textual models, which ignore cross-modal alignment, or (ii) use vision-language models pre-trained with general pre-training tasks.
We propose a task-specific Vision-Language Pre-training framework for MABSA (VLP-MABSA), a unified multimodal encoder-decoder architecture for all the pre-training and downstream tasks.
arXiv Detail & Related papers (2022-04-17T08:44:00Z) - Automatic Machine Translation Evaluation in Many Languages via Zero-Shot Paraphrasing [11.564158965143418]
We frame the task of machine translation evaluation as one of scoring machine translation output with a sequence-to-sequence paraphraser.
We propose training the paraphraser as a multilingual NMT system, treating paraphrasing as a zero-shot translation task.
Our method is simple and intuitive, and does not require human judgements for training.
arXiv Detail & Related papers (2020-04-30T03:32:34Z)
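The paraphrasing-based metric above reduces to a simple scoring rule once the seq2seq paraphraser has been run in forced-decoding mode. The sketch below assumes the score is the length-normalized sum of per-token log-probabilities of the hypothesis conditioned on the reference; the function name and the toy numbers are illustrative, and the paper's actual normalization and direction-combining may differ.

```python
def paraphrase_score(token_logprobs):
    """Length-normalized log-probability of a hypothesis under the
    paraphraser, obtained by force-decoding the hypothesis tokens
    conditioned on the reference (or source) sentence.

    token_logprobs: per-token log-probabilities emitted by the
        sequence-to-sequence model during forced decoding.
    """
    return sum(token_logprobs) / len(token_logprobs)

# Toy example: the paraphraser assigns higher token probabilities to a
# hypothesis that is a good paraphrase of the reference, so it scores
# higher than a poor one.
good_hyp = [-0.10, -0.20, -0.15]
bad_hyp = [-1.50, -2.00, -1.80]
assert paraphrase_score(good_hyp) > paraphrase_score(bad_hyp)
```

Because the paraphraser is trained as a multilingual NMT system rather than on human quality judgments, the same scoring rule can be applied zero-shot in any language the NMT model covers.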
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.