Levenshtein Training for Word-level Quality Estimation
- URL: http://arxiv.org/abs/2109.05611v2
- Date: Wed, 15 Sep 2021 21:10:09 GMT
- Title: Levenshtein Training for Word-level Quality Estimation
- Authors: Shuoyang Ding, Marcin Junczys-Dowmunt, Matt Post, Philipp Koehn
- Abstract summary: Levenshtein Transformer is a natural fit for the word-level QE task.
A Levenshtein Transformer can learn to post-edit without explicit supervision.
- Score: 15.119782800097711
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel scheme to use the Levenshtein Transformer to perform the
task of word-level quality estimation. A Levenshtein Transformer is a natural
fit for this task: trained to perform decoding in an iterative manner, a
Levenshtein Transformer can learn to post-edit without explicit supervision. To
further minimize the mismatch between the translation task and the word-level
QE task, we propose a two-stage transfer learning procedure on both augmented
data and human post-editing data. We also propose heuristics to construct
reference labels that are compatible with subword-level finetuning and
inference. Results on the WMT 2020 QE shared task dataset show that our proposed
method has superior data efficiency under the data-constrained setting and
competitive performance under the unconstrained setting.
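To make the labeling scheme concrete, below is a minimal Python sketch, written for this summary rather than taken from the paper: it aligns an MT hypothesis to its human post-edit with Levenshtein edit operations, marks kept hypothesis tokens OK and substituted or deleted ones BAD, and projects the word tags onto subwords with a simple "every subword inherits its word's tag" rule. The tokenizer callback, the omission of gap tags, and the projection rule are assumptions, not the authors' exact heuristics.

```python
# Minimal, illustrative sketch (our own reconstruction, not the paper's exact
# heuristic): derive word-level OK/BAD QE tags by aligning an MT hypothesis to
# its human post-edit with Levenshtein edit operations, then project the word
# tags onto subword tokens so they are usable for subword-level finetuning.

from typing import Callable, List, Tuple


def levenshtein_ops(hyp: List[str], pe: List[str]) -> List[Tuple[str, int, int]]:
    """Return edit operations (op, hyp_idx, pe_idx) with op in
    {'keep', 'sub', 'del', 'ins'}; -1 marks a missing index."""
    n, m = len(hyp), len(pe)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if hyp[i - 1] == pe[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # delete a hypothesis token
                           dp[i][j - 1] + 1,         # insert a post-edit token
                           dp[i - 1][j - 1] + cost)  # keep or substitute
    # Backtrace to recover the operations.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (
                0 if hyp[i - 1] == pe[j - 1] else 1):
            ops.append(("keep" if hyp[i - 1] == pe[j - 1] else "sub", i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(("del", i - 1, -1))
            i -= 1
        else:
            ops.append(("ins", -1, j - 1))
            j -= 1
    return list(reversed(ops))


def word_tags(hyp: List[str], pe: List[str]) -> List[str]:
    """OK for kept hypothesis tokens, BAD for substituted or deleted ones.
    (Gap tags for missing words are omitted from this sketch.)"""
    tags = ["OK"] * len(hyp)
    for op, hyp_idx, _ in levenshtein_ops(hyp, pe):
        if op in ("sub", "del"):
            tags[hyp_idx] = "BAD"
    return tags


def subword_tags(words: List[str], tags: List[str],
                 tokenize: Callable[[str], List[str]]) -> List[str]:
    """Assumed projection heuristic: every subword inherits its word's tag."""
    projected = []
    for word, tag in zip(words, tags):
        projected.extend([tag] * len(tokenize(word)))
    return projected


if __name__ == "__main__":
    hyp = "the cat sit on mat".split()
    pe = "the cat sat on the mat".split()
    print(word_tags(hyp, pe))  # ['OK', 'OK', 'BAD', 'OK', 'OK']
    toy_bpe = lambda w: [w[:2], w[2:]] if len(w) > 2 else [w]  # stand-in for a real subword tokenizer
    print(subword_tags(hyp, word_tags(hyp, pe), toy_bpe))
```

In the paper's setting, the analogous decisions come from the Levenshtein Transformer's own iterative edit predictions rather than from an offline alignment; the sketch only illustrates how edit operations map onto QE-style labels.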
Related papers
- AAdaM at SemEval-2024 Task 1: Augmentation and Adaptation for Multilingual Semantic Textual Relatedness [16.896143197472114]
This paper presents our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian languages.
We propose using machine translation for data augmentation to address the low-resource challenge of limited training data.
We achieve competitive results in the shared task: our system performs the best among all ranked teams in both subtask A (supervised learning) and subtask C (cross-lingual transfer).
arXiv Detail & Related papers (2024-04-01T21:21:15Z)
- Unify word-level and span-level tasks: NJUNLP's Participation for the WMT2023 Quality Estimation Shared Task [59.46906545506715]
We introduce the NJUNLP team's submissions to the WMT 2023 Quality Estimation (QE) shared task.
Our team submitted predictions for the English-German language pair on both sub-tasks.
Our models achieved the best results in English-German for both word-level and fine-grained error span detection sub-tasks.
arXiv Detail & Related papers (2023-09-23T01:52:14Z)
- Curricular Transfer Learning for Sentence Encoded Tasks [0.0]
This article proposes a sequence of pre-training steps guided by "data hacking" and grammar analysis.
In our experiments, our method yields a considerable improvement over other known pre-training approaches on the MultiWoZ task.
arXiv Detail & Related papers (2023-08-03T16:18:19Z)
- Reducing Sequence Length by Predicting Edit Operations with Large Language Models [50.66922361766939]
This paper proposes predicting edit spans for the source text for local sequence transduction tasks.
We apply instruction tuning for Large Language Models on the supervision data of edit spans.
Experiments show that the proposed method achieves comparable performance to the baseline in four tasks.
arXiv Detail & Related papers (2023-05-19T17:51:05Z)
- Paragraph-based Transformer Pre-training for Multi-Sentence Inference [99.59693674455582]
We show that popular pre-trained transformers perform poorly when used for fine-tuning on multi-candidate inference tasks.
We then propose a new pre-training objective that models the paragraph-level semantics across multiple input sentences.
arXiv Detail & Related papers (2022-05-02T21:41:14Z)
- Towards a Unified Foundation Model: Jointly Pre-Training Transformers on Unpaired Images and Text [93.11954811297652]
We design a unified transformer consisting of modality-specific tokenizers, a shared transformer encoder, and task-specific output heads.
We employ the separately-trained BERT and ViT models as teachers and apply knowledge distillation to provide additional, accurate supervision signals.
Experiments show that the resultant unified foundation transformer works surprisingly well on both the vision-only and text-only tasks.
arXiv Detail & Related papers (2021-12-14T00:20:55Z)
- Back-Translated Task Adaptive Pretraining: Improving Accuracy and Robustness on Text Classification [5.420446976940825]
We propose a back-translated task-adaptive pretraining (BT-TAPT) method that increases the amount of task-specific data for LM re-pretraining.
The experimental results show that the proposed BT-TAPT yields improved classification accuracy on both low- and high-resource data and better robustness to noise than the conventional adaptive pretraining method.
arXiv Detail & Related papers (2021-07-22T06:27:35Z)
- Verdi: Quality Estimation and Error Detection for Bilingual Corpora [23.485380293716272]
Verdi is a novel framework for word-level and sentence-level post-editing effort estimation for bilingual corpora.
We exploit the symmetric nature of bilingual corpora and apply model-level dual learning in the NMT predictor.
Our method beats the winner of the competition and outperforms other baseline methods by a large margin.
arXiv Detail & Related papers (2021-05-31T11:04:13Z)
- Zero-shot Learning by Generating Task-specific Adapters [38.452434222367515]
We introduce Hypter, a framework that improves zero-shot transferability by training a hypernetwork to generate task-specific adapters from task descriptions.
This formulation enables learning at task level, and greatly reduces the number of parameters by using light-weight adapters.
arXiv Detail & Related papers (2021-01-02T10:50:23Z)
- Contextual Text Style Transfer [73.66285813595616]
Contextual Text Style Transfer aims to translate a sentence into a desired style with its surrounding context taken into account.
We propose a Context-Aware Style Transfer (CAST) model, which uses two separate encoders for each input sentence and its surrounding context.
Two new benchmarks, Enron-Context and Reddit-Context, are introduced for formality and offensiveness style transfer.
arXiv Detail & Related papers (2020-04-30T23:01:12Z)
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer [64.22926988297685]
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format.
arXiv Detail & Related papers (2019-10-23T17:37:36Z)
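As a small aside on the text-to-text framing described in the last entry, the sketch below shows how heterogeneous tasks can be cast as string-to-string pairs distinguished only by a task prefix; the prefixes follow the T5 convention, but the example strings are representative illustrations rather than quotes from the paper.

```python
# Illustrative sketch of the text-to-text framing: every task becomes an
# (input string, target string) pair, and a short task prefix tells a single
# shared encoder-decoder model which problem it is solving. Prefixes follow
# the T5 convention; the example strings are representative, not quoted.

examples = [
    ("translate English to German: That is good.", "Das ist gut."),
    ("summarize: heavy storms swept through the county on tuesday, downing "
     "power lines and closing schools, officials said.",
     "storms down power lines and close schools in the county."),
    ("cola sentence: The course is jumping well.", "not acceptable"),
]

for source, target in examples:
    # One maximum-likelihood objective over all such pairs; no task-specific
    # heads or output layers are required.
    print(f"{source!r} -> {target!r}")
```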
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.