Better Uncertainty Quantification for Machine Translation Evaluation
- URL: http://arxiv.org/abs/2204.06546v1
- Date: Wed, 13 Apr 2022 17:49:25 GMT
- Title: Better Uncertainty Quantification for Machine Translation Evaluation
- Authors: Chrysoula Zerva, Taisiya Glushkova, Ricardo Rei, André F. T. Martins
- Abstract summary: We train the COMET metric with new heteroscedastic regression, divergence minimization, and direct uncertainty prediction objectives.
Experiments show improved results on WMT20 and WMT21 metrics task datasets and a substantial reduction in computational costs.
- Score: 17.36759906285316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural-based machine translation (MT) evaluation metrics are progressing
fast. However, these systems are often hard to interpret and might produce
unreliable scores when human references or assessments are noisy or when data
is out-of-domain. Recent work leveraged uncertainty quantification techniques
such as Monte Carlo dropout and deep ensembles to provide confidence intervals,
but these techniques (as we show) are limited in several ways. In this paper we
investigate more powerful and efficient uncertainty predictors for MT
evaluation metrics and their potential to capture aleatoric and epistemic
uncertainty. To this end, we train the COMET metric with new heteroscedastic
regression, divergence minimization, and direct uncertainty prediction
objectives. Our experiments show improved results on the WMT20 and WMT21
metrics task datasets and a substantial reduction in computational costs.
Moreover, they demonstrate the ability of our predictors to identify
low-quality references and to reveal model uncertainty due to out-of-domain
data.
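The heteroscedastic regression objective mentioned in the abstract can be illustrated with a Gaussian negative log-likelihood in which the model predicts both a quality score and a per-example variance, so that examples it judges noisy are down-weighted. The PyTorch sketch below shows only this general idea; the `HeteroscedasticHead` module, dimensions, and toy usage are assumptions, not the paper's COMET implementation.

```python
import torch
import torch.nn as nn

class HeteroscedasticHead(nn.Module):
    """Illustrative regression head (hypothetical, not COMET's): predicts a
    quality score (mu) and a log-variance from a sentence embedding."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.mu = nn.Linear(hidden_dim, 1)
        self.log_var = nn.Linear(hidden_dim, 1)  # log-variance for stability

    def forward(self, h: torch.Tensor):
        return self.mu(h).squeeze(-1), self.log_var(h).squeeze(-1)

def heteroscedastic_nll(mu, log_var, target):
    """Gaussian NLL: the squared error is scaled by the predicted variance,
    so the model can attenuate the loss on examples it flags as noisy."""
    return 0.5 * (log_var + (target - mu) ** 2 / log_var.exp()).mean()

# Toy usage with random "embeddings" and human quality scores.
head = HeteroscedasticHead(hidden_dim=768)
h, y = torch.randn(8, 768), torch.rand(8)
mu, log_var = head(h)
heteroscedastic_nll(mu, log_var, y).backward()
```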
Related papers
- The Probabilistic Tsetlin Machine: A Novel Approach to Uncertainty Quantification [1.0499611180329802]
This paper introduces the Probabilistic Tsetlin Machine (PTM) framework, aimed at providing a robust, reliable, and interpretable approach for uncertainty quantification.
Unlike the original TM, the PTM learns the probability of staying in each state of each Tsetlin Automaton (TA) across all clauses.
During inference, TAs decide their actions by sampling states based on learned probability distributions.
arXiv Detail & Related papers (2024-10-23T13:20:42Z)
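A minimal sketch of the state-sampling idea described in the PTM entry above: at inference, each automaton's action is drawn by sampling a state from its learned distribution. The state layout and the `sample_action` helper are hypothetical and only illustrate the mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical Tsetlin Automaton with 2N states: states 0..N-1 map to the
# action "exclude", states N..2N-1 to "include" (layout is an assumption).
N = 4
state_probs = rng.dirichlet(np.ones(2 * N))  # learned distribution over states

def sample_action(probs, n):
    """Sample a state from the learned distribution and return the action
    associated with that half of the state space."""
    state = rng.choice(len(probs), p=probs)
    return "include" if state >= n else "exclude"

print(sample_action(state_probs, N))
```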
- Error-Driven Uncertainty Aware Training [7.702016079410588]
Error-Driven Uncertainty Aware Training (EUAT) aims to enhance the ability of neural classifiers to estimate their uncertainty correctly.
The EUAT approach operates during the model's training phase by selectively employing two loss functions depending on whether the training examples are correctly or incorrectly predicted.
We evaluate EUAT using diverse neural models and datasets in the image recognition domain, considering both non-adversarial and adversarial settings.
arXiv Detail & Related papers (2024-05-02T11:48:14Z)
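The per-example loss switching in EUAT might look like the sketch below, which pairs cross-entropy with an entropy term: correctly predicted examples are pushed toward low predictive uncertainty, misclassified ones toward high uncertainty. The concrete loss terms here are assumptions; only the switching structure is taken from the entry above.

```python
import torch
import torch.nn.functional as F

def euat_style_loss(logits, targets):
    """Illustrative error-driven loss (the paper's exact objectives may
    differ): correct predictions get CE + entropy (be confident),
    wrong predictions get CE - entropy (be uncertain)."""
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)
    correct = logits.argmax(-1) == targets
    ce = F.cross_entropy(logits, targets, reduction="none")
    return torch.where(correct, ce + entropy, ce - entropy).mean()

logits = torch.randn(16, 10, requires_grad=True)
targets = torch.randint(0, 10, (16,))
euat_style_loss(logits, targets).backward()
```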
- BLEURT Has Universal Translations: An Analysis of Automatic Metrics by Minimum Risk Training [64.37683359609308]
In this study, we analyze various mainstream and cutting-edge automatic metrics from the perspective of the guidance they provide for training machine translation systems.
We find that certain metrics exhibit robustness defects, such as the presence of universal adversarial translations in BLEURT and BARTScore.
In-depth analysis suggests two main causes of these robustness deficits: distribution biases in the training datasets, and the tendency of the metric paradigm.
arXiv Detail & Related papers (2023-07-06T16:59:30Z)
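Minimum Risk Training, the tool used above to surface metric weaknesses, minimizes expected risk over a set of sampled translations scored by the metric; if a metric admits "universal" high-scoring outputs, MRT will steer the system toward them. A minimal sketch, assuming per-candidate log-probabilities and metric scores are already computed; `alpha` is a sharpness hyperparameter.

```python
import torch

def mrt_loss(cand_logprobs, metric_scores, alpha=5e-3):
    """Sample-based MRT: expected (negative) metric score under a
    renormalized, sharpened model distribution over the sampled
    candidates. metric_scores would come from BLEURT/BARTScore etc."""
    q = torch.softmax(alpha * cand_logprobs, dim=-1)
    return -(q * metric_scores).sum()

cand_logprobs = torch.randn(8, requires_grad=True)  # log p(y_i | x), 8 samples
metric_scores = torch.rand(8)                       # metric(y_i) per sample
mrt_loss(cand_logprobs, metric_scores).backward()
```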
- Lightweight, Uncertainty-Aware Conformalized Visual Odometry [2.429910016019183]
Data-driven visual odometry (VO) is a critical subroutine for autonomous edge robotics.
Emerging edge robotics devices like insect-scale drones and surgical robots lack a computationally efficient framework to estimate VO's predictive uncertainties.
This paper presents a novel, lightweight, and statistically robust framework that leverages conformal inference (CI) to extract VO's uncertainty bands.
arXiv Detail & Related papers (2023-03-03T20:37:55Z)
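The framework above builds on conformal inference; the sketch below shows plain split conformal prediction for a scalar regression output, which yields distribution-free intervals with approximate marginal coverage. The paper's VO-specific machinery is more involved; the `split_conformal_band` helper and the synthetic data are illustrative.

```python
import numpy as np

def split_conformal_band(cal_pred, cal_true, test_pred, alpha=0.1):
    """Split conformal prediction: take a high quantile q of absolute
    residuals on a held-out calibration set, then return intervals
    [pred - q, pred + q] with ~(1 - alpha) marginal coverage."""
    residuals = np.abs(cal_true - cal_pred)
    n = len(residuals)
    q = np.quantile(residuals, np.ceil((n + 1) * (1 - alpha)) / n,
                    method="higher")
    return test_pred - q, test_pred + q

rng = np.random.default_rng(0)
cal_pred = rng.normal(size=500)                        # calibration predictions
cal_true = cal_pred + rng.normal(scale=0.3, size=500)  # calibration ground truth
lo, hi = split_conformal_band(cal_pred, cal_true, np.array([0.1, -0.4]))
print(lo, hi)
```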
- ZigZag: Universal Sampling-free Uncertainty Estimation Through Two-Step Inference [54.17205151960878]
We introduce a sampling-free approach that is generic and easy to deploy.
We produce reliable uncertainty estimates on par with state-of-the-art methods at a significantly lower computational cost.
arXiv Detail & Related papers (2022-11-21T13:23:09Z)
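One way to realize the two-step, sampling-free inference named in the ZigZag entry is to give the network an extra input that carries a previous prediction (a placeholder on the first pass) and to read the disagreement between the two passes as the uncertainty estimate. The architecture and feedback convention below are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TwoStepNet(nn.Module):
    """Sketch: the model consumes the input plus a feedback channel that
    holds either a placeholder (first pass) or a prior prediction."""
    def __init__(self, in_dim=16, out_dim=1):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim + out_dim, 64),
                                 nn.ReLU(), nn.Linear(64, out_dim))

    def forward(self, x, y_feedback):
        return self.net(torch.cat([x, y_feedback], dim=-1))

model = TwoStepNet()
x = torch.randn(4, 16)
y0 = model(x, torch.zeros(4, 1))  # step 1: placeholder feedback
y1 = model(x, y0.detach())        # step 2: feed the prediction back
uncertainty = (y0 - y1).abs()     # pass-to-pass disagreement as uncertainty
```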
- Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes [52.92110730286403]
It is commonly believed that the marginal likelihood should be reminiscent of cross-validation metrics and that both should deteriorate with larger input dimensions.
We prove that by tuning hyperparameters, the performance, as measured by the marginal likelihood, improves monotonically with the input dimension.
We also prove that cross-validation metrics exhibit qualitatively different behavior that is characteristic of double descent.
arXiv Detail & Related papers (2022-10-14T08:09:33Z)
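For reference, the quantity analyzed above is the standard Gaussian process log marginal likelihood (a textbook expression, not reproduced from the paper):

```latex
% GP with kernel matrix K_\theta, noise variance \sigma^2, and targets
% \mathbf{y} \in \mathbb{R}^n:
\log p(\mathbf{y} \mid X, \theta) =
  -\tfrac{1}{2}\,\mathbf{y}^\top \left(K_\theta + \sigma^2 I\right)^{-1} \mathbf{y}
  -\tfrac{1}{2}\log\det\!\left(K_\theta + \sigma^2 I\right)
  -\tfrac{n}{2}\log 2\pi
```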
- Identifying Weaknesses in Machine Translation Metrics Through Minimum Bayes Risk Decoding: A Case Study for COMET [42.77140426679383]
We show that sample-based Minimum Bayes Risk (MBR) decoding can be used to explore and quantify such weaknesses.
We further show that these biases cannot be fully removed by simply training on additional synthetic data.
arXiv Detail & Related papers (2022-02-10T17:07:32Z)
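Sample-based MBR decoding, as used in the case study above, selects the candidate with the highest expected utility when the other samples act as pseudo-references; systematic biases in the utility metric then surface in the chosen outputs. A minimal sketch with a toy token-overlap utility standing in for COMET:

```python
def mbr_decode(candidates, utility):
    """Return the candidate with the highest average utility against all
    other samples, which serve as pseudo-references."""
    def expected_utility(i):
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], c) for c in others) / len(others)
    return candidates[max(range(len(candidates)), key=expected_utility)]

# Toy utility: token-set overlap stands in for a learned metric like COMET.
def overlap(hyp, ref):
    h, r = set(hyp.split()), set(ref.split())
    return len(h & r) / max(len(h | r), 1)

samples = ["the cat sat", "a cat sat down", "the cat sat down"]
print(mbr_decode(samples, overlap))
```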
- Uncertainty-Aware Machine Translation Evaluation [0.716879432974126]
We introduce uncertainty-aware MT evaluation and analyze the trustworthiness of the predicted quality.
We compare the performance of our uncertainty-aware MT evaluation methods across multiple language pairs from the QT21 dataset and the WMT20 metrics task.
arXiv Detail & Related papers (2021-09-13T22:46:03Z)
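The main abstract above notes that this line of work provides confidence intervals via techniques such as Monte Carlo dropout; the sketch below shows that baseline construction (stochastic forward passes with dropout left on, mean ± z·std as an approximate interval). The regressor is a hypothetical stand-in for a COMET-style model, and whether the paper uses exactly this interval construction is an assumption.

```python
import torch
import torch.nn as nn

def mc_dropout_interval(model, x, n_samples=50, z=1.96):
    """MC dropout: keep dropout active at inference, run several stochastic
    forward passes, and report mean +/- z * std as an approximate interval."""
    model.train()  # keeps nn.Dropout layers active
    with torch.no_grad():
        scores = torch.stack([model(x).squeeze(-1) for _ in range(n_samples)])
    mean, std = scores.mean(0), scores.std(0)
    return mean - z * std, mean + z * std

# Hypothetical stand-in for a quality-score regressor over embeddings.
model = nn.Sequential(nn.Linear(768, 256), nn.ReLU(),
                      nn.Dropout(0.1), nn.Linear(256, 1))
lo, hi = mc_dropout_interval(model, torch.randn(4, 768))
```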
- On the Practicality of Deterministic Epistemic Uncertainty [106.06571981780591]
Deterministic uncertainty methods (DUMs) achieve strong performance on detecting out-of-distribution data.
It remains unclear whether DUMs are well calibrated and can seamlessly scale to real-world applications.
arXiv Detail & Related papers (2021-07-01T17:59:07Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Learning to Predict Error for MRI Reconstruction [67.76632988696943]
We demonstrate that the predictive uncertainty estimated by current methods does not correlate strongly with the prediction error.
We propose a novel method that estimates the target labels and magnitude of the prediction error in two steps.
arXiv Detail & Related papers (2020-02-13T15:55:32Z)
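The two-step scheme described above (first estimate the target, then regress the magnitude of the first model's error) might look like the following sketch. Networks, shapes, and the single gradient step shown are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Hypothetical two-step error prediction: net_y estimates the target and
# net_e is trained to regress the magnitude of net_y's error.
net_y = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
net_e = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))

x, y = torch.randn(16, 32), torch.randn(16, 1)

# Step 1: fit the target estimator (a single gradient step shown).
y_hat = net_y(x)
((y_hat - y) ** 2).mean().backward()

# Step 2: fit the error estimator on |y - y_hat| as its regression target.
err_target = (y - y_hat).abs().detach()
((net_e(x) - err_target) ** 2).mean().backward()
```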
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content on this site (including all information) and is not responsible for any consequences.