Learning Confidence for Transformer-based Neural Machine Translation
- URL: http://arxiv.org/abs/2203.11413v1
- Date: Tue, 22 Mar 2022 01:51:58 GMT
- Title: Learning Confidence for Transformer-based Neural Machine Translation
- Authors: Yu Lu, Jiali Zeng, Jiajun Zhang, Shuangzhi Wu and Mu Li
- Abstract summary: We propose to learn an unsupervised confidence estimate jointly with the training of the neural machine translation (NMT) model.
We interpret confidence as the number of hints the NMT model needs to make a correct prediction, where more hints indicate lower confidence.
We demonstrate that our learned confidence estimate achieves high accuracy on extensive sentence/word-level quality estimation tasks.
- Score: 38.679505127679846
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Confidence estimation aims to quantify the confidence of the model
prediction, providing an expectation of success. A well-calibrated confidence
estimate enables accurate failure prediction and proper risk measurement when
given noisy samples and out-of-distribution data in real-world settings.
However, this task remains a severe challenge for neural machine translation
(NMT), where probabilities from the softmax distribution fail to indicate when the
model is probably mistaken. To address this problem, we propose to learn an
unsupervised confidence estimate jointly with the training of the NMT model. We
interpret confidence as the number of hints the NMT model needs to make a correct
prediction, where more hints indicate lower confidence. Specifically, the NMT model
is given the option to ask for hints to improve translation accuracy at the
cost of a slight penalty. Then, we approximate the model's level of confidence by
counting the number of hints the model uses. We demonstrate that our learned
confidence estimate achieves high accuracy on extensive sentence/word-level
quality estimation tasks. Analytical results verify that our confidence
estimate can correctly assess underlying risk in two real-world scenarios: (1)
discovering noisy samples and (2) detecting out-of-domain data. We further
propose a novel confidence-based instance-specific label smoothing approach
based on our learned confidence estimate, which outperforms standard label
smoothing.
Related papers
- Confidence Aware Learning for Reliable Face Anti-spoofing [52.23271636362843]
We propose a Confidence Aware Face Anti-spoofing model, which is aware of its capability boundary.
We estimate its confidence during the prediction of each sample.
Experiments show that the proposed CA-FAS can effectively recognize samples with low prediction confidence.
arXiv Detail & Related papers (2024-11-02T14:29:02Z)
- Error-Driven Uncertainty Aware Training [7.702016079410588]
Error-Driven Uncertainty Aware Training (EUAT) aims to enhance the ability of neural classifiers to estimate their uncertainty correctly.
The EUAT approach operates during the model's training phase by selectively employing two loss functions depending on whether the training examples are correctly or incorrectly predicted.
We evaluate EUAT using diverse neural models and datasets in the image recognition domains considering both non-adversarial and adversarial settings.
arXiv Detail & Related papers (2024-05-02T11:48:14Z)
- Revisiting Confidence Estimation: Towards Reliable Failure Prediction [53.79160907725975]
We identify a general, widespread, but largely neglected phenomenon: most confidence estimation methods are harmful for detecting misclassification errors.
We propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance.
arXiv Detail & Related papers (2024-03-05T11:44:14Z)
- Confidence-Calibrated Face and Kinship Verification [8.570969129199467]
We introduce an effective confidence measure that allows verification models to convert a similarity score into a confidence score for any given face pair.
We also propose a confidence-calibrated approach, termed Angular Scaling Calibration (ASC), which is easy to implement and can be readily applied to existing verification models.
To the best of our knowledge, our work presents the first comprehensive confidence-calibrated solution for modern face and kinship verification tasks.
arXiv Detail & Related papers (2022-10-25T10:43:46Z)
- Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval [51.83967175585896]
UAL aims at providing reliability-aware predictions by considering data uncertainty and model uncertainty simultaneously.
Data uncertainty captures the "noise" inherent in the sample, while model uncertainty depicts the model's confidence in the sample's prediction.
arXiv Detail & Related papers (2022-10-24T17:53:20Z)
- MACEst: The reliable and trustworthy Model Agnostic Confidence Estimator [0.17188280334580192]
We argue that any confidence estimates based upon standard machine learning point prediction algorithms are fundamentally flawed.
We present MACEst, a Model Agnostic Confidence Estimator, which provides reliable and trustworthy confidence estimates.
arXiv Detail & Related papers (2021-09-02T14:34:06Z)
- Harnessing Adversarial Distances to Discover High-Confidence Errors [0.0]
We investigate the problem of finding errors at rates greater than expected given model confidence.
We propose a novel, query-efficient search technique guided by adversarial perturbations.
arXiv Detail & Related papers (2020-06-29T13:44:16Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, assigning optimal weights to unlabeled queries (a generic sketch of this kind of prototype update appears after this list).
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
- Binary Classification from Positive Data with Skewed Confidence [85.18941440826309]
Positive-confidence (Pconf) classification is a promising weakly-supervised learning method.
In practice, the confidence may be skewed by bias arising in an annotation process.
We introduce a parameterized model of the skewed confidence and propose a method for selecting the hyperparameter.
arXiv Detail & Related papers (2020-01-29T00:04:36Z)