Knowledge Distillation for Quality Estimation
- URL: http://arxiv.org/abs/2107.00411v1
- Date: Thu, 1 Jul 2021 12:36:21 GMT
- Title: Knowledge Distillation for Quality Estimation
- Authors: Amit Gajbhiye, Marina Fomicheva, Fernando Alva-Manchego, Frédéric Blain, Abiola Obamuyide, Nikolaos Aletras, Lucia Specia
- Abstract summary: Quality Estimation (QE) is the task of automatically predicting Machine Translation quality in the absence of reference translations.
Recent success in QE stems from the use of multilingual pre-trained representations, where very large models lead to impressive results.
We show that this approach, combined with data augmentation, leads to lightweight QE models that perform competitively with distilled pre-trained representations, with 8x fewer parameters.
- Score: 79.51452598302934
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Quality Estimation (QE) is the task of automatically predicting Machine
Translation quality in the absence of reference translations, making it
applicable in real-time settings, such as translating online social media
conversations. Recent success in QE stems from the use of multilingual
pre-trained representations, where very large models lead to impressive
results. However, the inference time, disk and memory requirements of such
models do not allow for wide usage in the real world. Models trained on
distilled pre-trained representations remain prohibitively large for many usage
scenarios. We instead propose to directly transfer knowledge from a strong QE
teacher model to a much smaller model with a different, shallower architecture.
We show that this approach, in combination with data augmentation, leads to
light-weight QE models that perform competitively with distilled pre-trained
representations with 8x fewer parameters.
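As a concrete illustration of the approach the abstract describes, here is a minimal teacher-student distillation sketch for sentence-level QE. This is a sketch under assumptions, not the paper's implementation: `teacher` stands in for any strong QE regressor (e.g. a fine-tuned multilingual transformer), `SmallStudent` is an illustrative shallow BiGRU regressor, and the data-augmentation step corresponds to letting the teacher score extra unlabeled source-translation pairs.

```python
# Hypothetical teacher-student distillation for sentence-level QE.
import torch
import torch.nn as nn

class SmallStudent(nn.Module):
    """A shallow BiGRU regressor over token embeddings (illustrative)."""
    def __init__(self, vocab_size=30000, emb_dim=128, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, token_ids):
        states, _ = self.rnn(self.emb(token_ids))
        return self.head(states.mean(dim=1)).squeeze(-1)  # one score per pair

def distill_step(student, teacher, token_ids, optimizer):
    with torch.no_grad():
        soft_targets = teacher(token_ids)  # teacher labels (possibly augmented) data
    loss = nn.functional.mse_loss(student(token_ids), soft_targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the student regresses directly onto the teacher's scores, it never needs the teacher's architecture or hidden states, which is what permits a much smaller, shallower model.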
Related papers
- Sparse Upcycling: Inference Inefficient Finetuning [4.988895645799531]
We show that sparse upcycling can achieve better quality, with improvements of over 20% relative to continued pretraining (CPT) in certain scenarios.
However, this comes with a significant inference cost, leading to 40% slowdowns in high-demand inference settings for larger models.
arXiv Detail & Related papers (2024-11-13T19:02:36Z)
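The quality/inference trade-off described above comes from upcycling a dense checkpoint into a Mixture-of-Experts model. A minimal sketch of that initialization step, assuming a feed-forward block that maps d_model to d_model and a simplified top-1 router (none of these names come from the paper):

```python
# Sketch of sparse upcycling: copy a dense FFN's weights into every expert
# of a new MoE layer, then continue training. Shapes and routing are
# illustrative simplifications.
import copy
import torch
import torch.nn as nn

def upcycle_ffn(dense_ffn: nn.Module, d_model=512, n_experts=8):
    # Every expert starts as an exact copy of the dense FFN (the "upcycle").
    experts = nn.ModuleList([copy.deepcopy(dense_ffn) for _ in range(n_experts)])
    router = nn.Linear(d_model, n_experts)  # learns to dispatch tokens

    def moe_forward(x):                      # x: (tokens, d_model)
        gates = router(x).softmax(dim=-1)
        best = gates.argmax(dim=-1)          # simplified top-1 routing
        out = torch.zeros_like(x)
        for e in range(n_experts):
            mask = best == e
            if mask.any():                   # each token visits one expert only,
                out[mask] = gates[mask, e:e + 1] * experts[e](x[mask])
        return out                           # but all experts must be loaded,
                                             # hence the inference-time cost
    return moe_forward, router, experts
```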
- LAR-IQA: A Lightweight, Accurate, and Robust No-Reference Image Quality Assessment Model [6.074775040047959]
We propose a compact, lightweight NR-IQA model that achieves state-of-the-art (SOTA) performance on ECCV AIM UHD-IQA challenge validation and test datasets.
Our model features a dual-branch architecture, with each branch separately trained on synthetically and authentically distorted images.
Our evaluation considering various open-source datasets highlights the practical, high-accuracy, and robust performance of our proposed lightweight model.
arXiv Detail & Related papers (2024-08-30T07:32:19Z)
- Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech [50.95292368372455]
We propose VQScore, a self-supervised metric for evaluating speech based on the quantization error of a vector-quantized variational autoencoder (VQ-VAE).
The training of VQ-VAE relies on clean speech; hence, large quantization errors can be expected when the speech is distorted.
We found that the vector quantization mechanism could also be used for self-supervised speech enhancement (SE) model training.
arXiv Detail & Related papers (2024-02-26T06:01:38Z)
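Since the entry above hinges on one concrete mechanism (quality read off the VQ-VAE quantization error), a minimal sketch may help. It assumes a VQ-VAE already trained on clean speech; `encoder` and `codebook` are illustrative stand-ins, not names from the paper.

```python
# Sketch of a VQScore-style metric: because the VQ-VAE was trained on clean
# speech only, distorted inputs land far from the codebook, so the
# quantization error itself serves as a (negated) quality score.
import torch

def vq_score(encoder, codebook, waveform):
    """Higher score = smaller quantization error = cleaner speech (assumed)."""
    z = encoder(waveform)                    # (frames, dim) latents
    d = torch.cdist(z, codebook)             # distance to every codebook entry
    err = d.min(dim=1).values.pow(2).mean()  # mean squared quantization error
    return -err.item()                       # negate so that higher is better
```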
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- eP-ALM: Efficient Perceptual Augmentation of Language Models [70.47962271121389]
We propose to direct effort toward efficient adaptation of existing models, augmenting Language Models with perception.
Existing approaches for adapting pretrained models to vision-language tasks still rely on several key components that hinder their efficiency.
We show that by freezing more than 99% of total parameters, training only one linear projection layer, and prepending only one trainable token, our approach (dubbed eP-ALM) significantly outperforms other baselines on VQA and Captioning.
arXiv Detail & Related papers (2023-03-20T19:20:34Z)
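A rough sketch of the recipe the eP-ALM entry describes: freeze both pretrained models and train only a single linear projection plus one soft prompt token. All module names and shapes here are illustrative assumptions, and the `inputs_embeds` call assumes a HuggingFace-style language model.

```python
# Sketch of eP-ALM-style perceptual augmentation: >99% of parameters frozen,
# only `proj` and `soft_token` receive gradients.
import torch
import torch.nn as nn

class PerceptualLM(nn.Module):
    def __init__(self, vision_encoder, language_model, vis_dim=768, lm_dim=1024):
        super().__init__()
        self.vision, self.lm = vision_encoder, language_model
        for p in self.vision.parameters():
            p.requires_grad = False                 # frozen perceptual encoder
        for p in self.lm.parameters():
            p.requires_grad = False                 # frozen language model
        self.proj = nn.Linear(vis_dim, lm_dim)      # the one trainable projection
        self.soft_token = nn.Parameter(torch.zeros(1, 1, lm_dim))  # one trainable token

    def forward(self, image, text_embeds):
        v = self.proj(self.vision(image))           # assumes (B, N, vis_dim) features
        prefix = self.soft_token.expand(v.size(0), -1, -1)
        inputs = torch.cat([prefix, v, text_embeds], dim=1)
        return self.lm(inputs_embeds=inputs)        # HF-style embedding input (assumed)
```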
- A Multi-dimensional Evaluation of Tokenizer-free Multilingual Pretrained Models [87.7086269902562]
We show that subword-based models might still be the most practical choice in many settings.
We encourage future work in tokenizer-free methods to consider these factors when designing and evaluating new models.
arXiv Detail & Related papers (2022-10-13T15:47:09Z)
- CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z)
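The CONVIQT entry rests on contrastive self-supervised representation learning. The following is a generic InfoNCE sketch of that idea, not CONVIQT's actual encoder or augmentation pipeline: two views of the same video are pulled together, other videos in the batch are pushed apart.

```python
# Generic InfoNCE loss for self-supervised quality representations.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """z1, z2: (batch, dim) embeddings of two augmented views of the same clips."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature      # (batch, batch) cosine similarities
    targets = torch.arange(z1.size(0))      # positives sit on the diagonal
    return F.cross_entropy(logits, targets)
```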
- Classification-based Quality Estimation: Small and Efficient Models for Real-world Applications [29.380675447523817]
Sentence-level Quality estimation (QE) of machine translation is traditionally formulated as a regression task.
Recent QE models have achieved previously-unseen levels of correlation with human judgments.
We evaluate several model compression techniques for QE and find that, despite their popularity in other NLP tasks, they lead to poor performance in this regression setting.
arXiv Detail & Related papers (2021-09-17T16:14:52Z)
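The entry above contrasts the usual regression formulation of QE with a classification one. A minimal sketch of that reformulation is to bin continuous quality scores into discrete labels and train with cross-entropy; the thresholds below are illustrative, not the paper's.

```python
# Recasting sentence-level QE from regression to classification by binning.
import numpy as np

def score_to_class(score, thresholds=(0.3, 0.7)):
    """Map a quality score in [0, 1] to {0: bad, 1: ok, 2: good}."""
    return int(np.searchsorted(thresholds, score, side="right"))

scores = np.array([0.12, 0.55, 0.91])
labels = [score_to_class(s) for s in scores]  # -> [0, 1, 2]
```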
- MDQE: A More Accurate Direct Pretraining for Machine Translation Quality Estimation [4.416484585765028]
We argue that there are still gaps between the predictor and the estimator in both data quality and training objectives.
We propose a novel framework that provides a more accurate direct pretraining for QE tasks.
arXiv Detail & Related papers (2021-07-24T09:48:37Z)
- REALM: Retrieval-Augmented Language Model Pre-Training [37.3178586179607]
We augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia.
For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner.
We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA).
arXiv Detail & Related papers (2020-02-10T18:40:59Z)
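To make the retrieve-and-attend idea concrete, here is a schematic of REALM-style retrieval-augmented prediction: score documents by dense inner product, then marginalize the reader's answer probability over the top-k. `embed` and `reader_prob` are hypothetical stand-ins for the trained retriever and reader, not REALM's actual API.

```python
# Schematic retrieval-augmented prediction:
# p(answer | q) = sum_z p(answer | q, z) * p(z | q) over retrieved documents z.
import numpy as np

def answer_prob(question, answer, corpus, embed, reader_prob, k=5):
    q = embed(question)                              # dense query vector
    doc_vecs = np.stack([embed(d) for d in corpus])  # precomputed index in practice
    scores = doc_vecs @ q                            # inner-product relevance
    top = np.argsort(scores)[-k:]                    # top-k documents
    p_doc = np.exp(scores[top] - scores[top].max())
    p_doc /= p_doc.sum()                             # softmax over retrieved docs
    return sum(p_doc[i] * reader_prob(question, corpus[z], answer)
               for i, z in enumerate(top))           # marginalize over documents
```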
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.