Related papers: TRUST: Test-time Resource Utilization for Superior Trustworthiness

TRUST: Test-time Resource Utilization for Superior Trustworthiness

URL: http://arxiv.org/abs/2506.06048v1
Date: Fri, 06 Jun 2025 12:52:32 GMT
Title: TRUST: Test-time Resource Utilization for Superior Trustworthiness
Authors: Haripriya Harikumar, Santu Rana,
Abstract summary: We propose a novel test-time optimization method that accounts for the impact of such noise to produce more reliable confidence estimates.<n>This score defines a monotonic subset-selection function, where population accuracy consistently increases as samples with lower scores are removed.
Score: 15.031121920821109
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Standard uncertainty estimation techniques, such as dropout, often struggle to clearly distinguish reliable predictions from unreliable ones. We attribute this limitation to noisy classifier weights, which, while not impairing overall class-level predictions, render finer-level statistics less informative. To address this, we propose a novel test-time optimization method that accounts for the impact of such noise to produce more reliable confidence estimates. This score defines a monotonic subset-selection function, where population accuracy consistently increases as samples with lower scores are removed, and it demonstrates superior performance in standard risk-based metrics such as AUSE and AURC. Additionally, our method effectively identifies discrepancies between training and test distributions, reliably differentiates in-distribution from out-of-distribution samples, and elucidates key differences between CNN and ViT classifiers across various vision datasets.

Related papers

Generative Conformal Prediction with Vectorized Non-Conformity Scores [6.059745771017814]
Conformal prediction provides model-agnostic uncertainty quantification with guaranteed coverage.<n>We propose a generative conformal prediction framework with vectorized non-conformity scores.<n>We construct adaptive uncertainty sets using density-ranked uncertainty balls.
arXiv Detail & Related papers (2024-10-17T16:37:03Z)
Semi-supervised Learning For Robust Speech Evaluation [30.593420641501968]
Speech evaluation measures a learners oral proficiency using automatic models. This paper proposes to address such challenges by exploiting semi-supervised pre-training and objective regularization. An anchor model is trained using pseudo labels to predict the correctness of pronunciation.
arXiv Detail & Related papers (2024-09-23T02:11:24Z)
Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting [55.17761802332469]
Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and test data by adapting a given model w.r.t. any test sample. Prior methods perform backpropagation for each test sample, resulting in unbearable optimization costs to many applications. We propose an Efficient Anti-Forgetting Test-Time Adaptation (EATA) method which develops an active sample selection criterion to identify reliable and non-redundant samples.
arXiv Detail & Related papers (2024-03-18T05:49:45Z)
Binary Classification with Confidence Difference [100.08818204756093]
This paper delves into a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification. We propose a risk-consistent approach to tackle this problem and show that the estimation error bound the optimal convergence rate. We also introduce a risk correction approach to mitigate overfitting problems, whose consistency and convergence rate are also proven.
arXiv Detail & Related papers (2023-10-09T11:44:50Z)
Towards Open-Set Test-Time Adaptation Utilizing the Wisdom of Crowds in Entropy Minimization [47.61333493671805]
Test-time adaptation (TTA) methods rely on the model's predictions to adapt the source pretrained model to the unlabeled target domain. We propose a simple yet effective sample selection method inspired by the following crucial empirical finding.
arXiv Detail & Related papers (2023-08-14T01:24:18Z)
Calibrating Deep Neural Networks using Explicit Regularisation and Dynamic Data Pruning [25.982037837953268]
Deep neural networks (DNN) are prone to miscalibrated predictions, often exhibiting a mismatch between the predicted output and the associated confidence scores. We propose a novel regularization technique that can be used with classification losses, leading to state-of-the-art calibrated predictions at test time.
arXiv Detail & Related papers (2022-12-20T05:34:58Z)
Adaptive Dimension Reduction and Variational Inference for Transductive Few-Shot Classification [2.922007656878633]
We propose a new clustering method based on Variational Bayesian inference, further improved by Adaptive Dimension Reduction. Our proposed method significantly improves accuracy in the realistic unbalanced transductive setting on various Few-Shot benchmarks.
arXiv Detail & Related papers (2022-09-18T10:29:02Z)
Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions. In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data. We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels. Our uncertainty-guided optimization brings significant improvement and achieves the state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2020-12-16T04:09:04Z)
Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation. We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches, is to update the prototype of each class with the mean of the most confident query examples. We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries. We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.