CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared
Task
- URL: http://arxiv.org/abs/2209.06243v1
- Date: Tue, 13 Sep 2022 18:05:12 GMT
- Title: CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared
Task
- Authors: Ricardo Rei, Marcos Treviso, Nuno M. Guerreiro, Chrysoula Zerva, Ana
C. Farinha, Christine Maroti, José G. C. de Souza, Taisiya Glushkova,
Duarte M. Alves, Alon Lavie, Luisa Coheur, André F. T. Martins
- Abstract summary: We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on Quality Estimation (QE).
Our team participated in all three subtasks: (i) Sentence and Word-level Quality Prediction; (ii) Explainable QE; and (iii) Critical Error Detection.
- Score: 11.716878242203267
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present the joint contribution of IST and Unbabel to the WMT 2022 Shared
Task on Quality Estimation (QE). Our team participated in all three subtasks:
(i) Sentence and Word-level Quality Prediction; (ii) Explainable QE; and (iii)
Critical Error Detection. For all tasks we build on top of the COMET framework,
connecting it with the predictor-estimator architecture of OpenKiwi, and
equipping it with a word-level sequence tagger and an explanation extractor.
Our results suggest that incorporating references during pretraining improves
performance across several language pairs on downstream tasks, and that jointly
training with sentence and word-level objectives yields a further boost.
Furthermore, combining attention and gradient information proved to be the top
strategy for extracting good explanations of sentence-level QE models. Overall,
our submissions achieved the best results for all three tasks for almost all
language pairs by a considerable margin.
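The abstract notes that combining attention and gradient information was the top strategy for extracting explanations from sentence-level QE models. As a rough illustration of the general idea (not the authors' exact method — the function name and the simple element-wise combination below are illustrative assumptions), one might fuse per-token attention weights with gradient magnitudes like this:

```python
# Hedged sketch: combine attention weights with gradient magnitudes into
# token-level explanation scores. The real CometKiwi explanation extractor
# differs; this only illustrates the "attention x gradient" intuition.

def explanation_scores(attention, grad_norms):
    """Element-wise product of attention and gradient magnitude,
    renormalized so the scores over tokens sum to 1."""
    assert len(attention) == len(grad_norms)
    combined = [a * g for a, g in zip(attention, grad_norms)]
    total = sum(combined)
    if total == 0.0:
        # Fall back to a uniform distribution when everything is zero.
        return [1.0 / len(combined)] * len(combined)
    return [c / total for c in combined]

# Toy example with 4 target tokens: the third token receives both high
# attention and a large gradient norm, so it dominates the explanation.
attn = [0.1, 0.2, 0.6, 0.1]
grads = [0.5, 0.5, 2.0, 0.1]
scores = explanation_scores(attn, grads)
print([round(s, 3) for s in scores])  # → [0.037, 0.074, 0.882, 0.007]
```

The intuition is that attention indicates where the model looked, while gradients indicate which tokens actually influenced the prediction; multiplying the two keeps only tokens that score high on both.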
Related papers
- Unify word-level and span-level tasks: NJUNLP's Participation for the
WMT2023 Quality Estimation Shared Task [59.46906545506715]
We introduce the NJUNLP team to the WMT 2023 Quality Estimation (QE) shared task.
Our team submitted predictions for the English-German language pair on both sub-tasks.
Our models achieved the best results in English-German for both word-level and fine-grained error span detection sub-tasks.
arXiv Detail & Related papers (2023-09-23T01:52:14Z)
- Scaling up COMETKIWI: Unbabel-IST 2023 Submission for the Quality
Estimation Shared Task [11.681598828340912]
We present the joint contribution of Unbabel and Instituto Superior Técnico to the WMT 2023 Shared Task on Quality Estimation (QE).
Our team participated in all tasks: sentence- and word-level quality prediction (task 1) and fine-grained error span detection (task 2).
Our multilingual approaches are ranked first for all tasks, reaching state-of-the-art performance for quality estimation at word-, span- and sentence-level judgements.
arXiv Detail & Related papers (2023-09-21T09:38:56Z)
- Blind Image Quality Assessment via Vision-Language Correspondence: A
Multitask Learning Perspective [93.56647950778357]
Blind image quality assessment (BIQA) predicts the human perception of image quality without any reference information.
We develop a general and automated multitask learning scheme for BIQA to exploit auxiliary knowledge from other tasks.
arXiv Detail & Related papers (2023-03-27T07:58:09Z)
- Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language
Understanding [51.31622274823167]
We propose a hierarchical framework with a coarse-to-fine paradigm, with the bottom level shared to all the tasks, the mid-level divided to different groups, and the top-level assigned to each of the tasks.
This allows our model to learn basic language properties from all tasks, boost performance on relevant tasks, and reduce the negative impact from irrelevant tasks.
arXiv Detail & Related papers (2022-08-19T02:46:20Z)
- niksss at HinglishEval: Language-agnostic BERT-based Contextual
Embeddings with Catboost for Quality Evaluation of the Low-Resource
Synthetically Generated Code-Mixed Hinglish Text [0.0]
This paper describes the system description for the HinglishEval challenge at INLG 2022.
The goal of this task was to investigate the factors influencing the quality of the code-mixed text generation system.
arXiv Detail & Related papers (2022-06-17T17:36:03Z)
- QASem Parsing: Text-to-text Modeling of QA-based Semantics [19.42681342441062]
We consider three QA-based semantic tasks, namely, QA-SRL, QANom and QADiscourse.
We release the first unified QASem parsing tool, practical for downstream applications.
arXiv Detail & Related papers (2022-05-23T15:56:07Z)
- Ensemble Fine-tuned mBERT for Translation Quality Estimation [0.0]
In this paper, we discuss our submission to the WMT 2021 QE Shared Task.
Our proposed system is an ensemble of multilingual BERT (mBERT)-based regression models.
It demonstrates comparable performance in terms of Pearson's correlation and beats the baseline system in MAE/RMSE for several language pairs.
arXiv Detail & Related papers (2021-09-08T20:13:06Z)
- Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on
Spoken Language Understanding [101.24748444126982]
Decomposable tasks are complex and comprise a hierarchy of sub-tasks.
Existing benchmarks, however, typically hold out examples for only the surface-level sub-task.
We propose a framework to construct robust test sets using coordinate ascent over sub-task specific utility functions.
arXiv Detail & Related papers (2021-06-29T02:53:59Z)
- Question Answering Infused Pre-training of General-Purpose
Contextualized Representations [70.62967781515127]
We propose a pre-training objective based on question answering (QA) for learning general-purpose contextual representations.
We accomplish this goal by training a bi-encoder QA model, which independently encodes passages and questions, to match the predictions of a more accurate cross-encoder model.
We show large improvements over both RoBERTa-large and previous state-of-the-art results on zero-shot and few-shot paraphrase detection.
arXiv Detail & Related papers (2021-06-15T14:45:15Z)
- ERICA: Improving Entity and Relation Understanding for Pre-trained
Language Models via Contrastive Learning [97.10875695679499]
We propose ERICA, a novel contrastive learning framework applied during pre-training to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.