Evaluating Machine Translation Quality with Conformal Predictive
Distributions
- URL: http://arxiv.org/abs/2306.01549v1
- Date: Fri, 2 Jun 2023 13:56:30 GMT
- Title: Evaluating Machine Translation Quality with Conformal Predictive
Distributions
- Authors: Patrizio Giovannotti
- Abstract summary: We present a new approach for assessing uncertainty in machine translation by simultaneously evaluating translation quality and providing a reliable confidence score.
Our method outperforms a simple, but effective baseline on six different language pairs in terms of coverage and sharpness.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a new approach for assessing uncertainty in machine
translation by simultaneously evaluating translation quality and providing a
reliable confidence score. Our approach utilizes conformal predictive
distributions to produce prediction intervals with guaranteed coverage, meaning
that for any given significance level $\epsilon$, we can expect the true
quality score of a translation to fall outside the interval at a rate of at
most $\epsilon$. In this paper, we demonstrate how our method outperforms a
simple, but effective baseline on six different language pairs in terms of
coverage and sharpness. Furthermore, we validate that our approach requires the
data exchangeability assumption to hold for optimal performance.
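As a rough illustration of the general technique, the following is a minimal sketch of a split (inductive) conformal predictive distribution for quality-score regression. The underlying regressor, the calibration split, and the names in the usage comments are hypothetical; the exact construction used in the paper may differ.

```python
import numpy as np

def split_cpd(residuals_cal, y_hat_test):
    """Split conformal predictive distribution for one test translation:
    the empirical CDF of {y_hat_test + r_i} over calibration residuals r_i."""
    support = np.sort(y_hat_test + residuals_cal)

    def cdf(y):
        # fraction of calibration-shifted points at or below y
        return np.searchsorted(support, y, side="right") / (len(support) + 1)

    return support, cdf

def prediction_interval(support, epsilon):
    """Central (1 - epsilon) prediction interval taken from the distribution."""
    lo = np.quantile(support, epsilon / 2)
    hi = np.quantile(support, 1 - epsilon / 2)
    return lo, hi

# Hypothetical usage with any quality-estimation regressor `model`:
#   residuals_cal = y_cal - model.predict(x_cal)          # held-out calibration split
#   support, cdf = split_cpd(residuals_cal, model.predict(x_test))
#   lo, hi = prediction_interval(support, epsilon=0.1)    # ~90% coverage target
```

Under exchangeability of the calibration and test examples, intervals built this way cover the true quality score at approximately the nominal $1-\epsilon$ rate, which is the coverage property the abstract refers to.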
Related papers
- Probabilistic Conformal Prediction with Approximate Conditional Validity [81.30551968980143]
We develop a new method for generating prediction sets that combines the flexibility of conformal methods with an estimate of the conditional distribution.
Our method consistently outperforms existing approaches in terms of conditional coverage.
arXiv Detail & Related papers (2024-07-01T20:44:48Z)
- The Penalized Inverse Probability Measure for Conformal Classification [0.5172964916120902]
The work introduces the Penalized Inverse Probability (PIP) nonconformity score and its regularized version, RePIP, which allow the joint optimization of efficiency and informativeness.
The work shows how PIP-based conformal classifiers exhibit precisely the desired behavior in comparison with other nonconformity measures and strike a good balance between informativeness and efficiency.
arXiv Detail & Related papers (2024-06-13T07:37:16Z)
- Robust Conformal Prediction Using Privileged Information [17.886554223172517]
We develop a method to generate prediction sets with a guaranteed coverage rate that is robust to corruptions in the training data.
Our approach builds on conformal prediction, a powerful framework to construct prediction sets that are valid under the i.i.d. assumption.
arXiv Detail & Related papers (2024-06-08T08:56:47Z)
- Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation [62.2436697657307]
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data.
We propose a method called Stratified Prediction-Powered Inference (StratPPI).
We show that the basic PPI estimates can be considerably improved by employing simple data stratification strategies.
arXiv Detail & Related papers (2024-06-06T17:37:39Z)
- Equal Opportunity of Coverage in Fair Regression [50.76908018786335]
We study fair machine learning (ML) under predictive uncertainty to enable reliable and trustworthy decision-making.
We propose Equal Opportunity of Coverage (EOC) that aims to achieve two properties: (1) coverage rates for different groups with similar outcomes are close, and (2) the coverage rate for the entire population remains at a predetermined level.
arXiv Detail & Related papers (2023-11-03T21:19:59Z)
- Binary Classification with Confidence Difference [100.08818204756093]
This paper delves into a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification.
We propose a risk-consistent approach to tackle this problem and show that the estimation error bound achieves the optimal convergence rate.
We also introduce a risk correction approach to mitigate overfitting problems, whose consistency and convergence rate are also proven.
arXiv Detail & Related papers (2023-10-09T11:44:50Z)
- Conformalizing Machine Translation Evaluation [9.89901717499058]
Several uncertainty estimation methods have been recently proposed for machine translation evaluation.
We show that the majority of them tend to underestimate model uncertainty, and as a result they often produce misleading confidence intervals that do not cover the ground truth.
We propose as an alternative the use of conformal prediction, a distribution-free method to obtain confidence intervals with a theoretically established guarantee on coverage.
arXiv Detail & Related papers (2023-06-09T19:36:18Z)
- Conformal Prediction for Federated Uncertainty Quantification Under Label Shift [57.54977668978613]
Federated Learning (FL) is a machine learning framework where many clients collaboratively train models.
We develop a new conformal prediction method based on quantile regression that takes privacy constraints into account; a generic conformalized quantile regression sketch is given after this list.
arXiv Detail & Related papers (2023-06-08T11:54:58Z)
- Predictive Inference with Weak Supervision [3.1925030748447747]
We bridge the gap between partial supervision and validation by developing a conformal prediction framework.
We introduce a new notion of coverage and predictive validity, then develop several application scenarios.
We corroborate the hypothesis that the new coverage definition allows for tighter and more informative (but valid) confidence sets.
arXiv Detail & Related papers (2022-01-20T17:26:52Z)
- Measuring Uncertainty in Translation Quality Evaluation (TQE) [62.997667081978825]
This work investigates how to correctly estimate confidence intervals (Brown et al., 2001) depending on the sample size of the translated text.
The methodology applied in this work draws on Bernoulli Statistical Distribution Modelling (BSDM) and Monte Carlo Sampling Analysis (MCSA).
arXiv Detail & Related papers (2021-11-15T12:09:08Z)
- Robust Validation: Confident Predictions Even When Distributions Shift [19.327409270934474]
We describe procedures for robust predictive inference, where a model provides uncertainty estimates on its predictions rather than point predictions.
We present a method that produces prediction sets (almost exactly) giving the right coverage level for any test distribution in an $f$-divergence ball around the training population; the form of this guarantee is sketched below.
An essential component of our methodology is to estimate the amount of expected future data shift and build robustness to it.
arXiv Detail & Related papers (2020-08-10T17:09:16Z)
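For context on the "Robust Validation" entry just above, the following is a rough sketch of the general shape of a distributionally robust coverage guarantee over an $f$-divergence ball; the radius $\rho$, the notation, and the exact formulation in that paper are assumptions here, not a quotation of its result.

```latex
% Coverage is required to hold for every test distribution Q within an
% f-divergence ball of radius \rho around the training distribution P_0:
\inf_{Q \,:\, D_f(Q \,\|\, P_0) \le \rho} \; Q\bigl( Y \in C(X) \bigr) \;\ge\; 1 - \alpha
```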
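The federated entry earlier in this list describes a conformal method based on quantile regression; as a point of reference, here is a minimal sketch of plain split conformalized quantile regression (CQR). The calibration/test split, the parameter names, and the `alpha=0.1` default are hypothetical, and none of the federated aggregation or label-shift handling of that paper is reproduced.

```python
import numpy as np

def cqr_interval(q_lo_cal, q_hi_cal, y_cal, q_lo_test, q_hi_test, alpha=0.1):
    """Plain split conformalized quantile regression (CQR).

    q_lo_cal / q_hi_cal: lower and upper quantile predictions on the calibration set.
    q_lo_test / q_hi_test: the same quantile predictions at the test points.
    """
    # Nonconformity score: how far each true value falls outside the quantile band.
    scores = np.maximum(q_lo_cal - y_cal, y_cal - q_hi_cal)
    n = len(y_cal)
    # Finite-sample-corrected empirical quantile of the calibration scores.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q_hat = np.quantile(scores, level, method="higher")
    # Widen both quantile estimates by the same calibrated margin.
    return q_lo_test - q_hat, q_hi_test + q_hat
```

Widening the band by a single calibrated margin restores marginal coverage even when the base quantile regressors are miscalibrated.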