Softmax is not Enough (for Adaptive Conformal Classification)
- URL: http://arxiv.org/abs/2602.19498v1
- Date: Mon, 23 Feb 2026 04:33:04 GMT
- Title: Softmax is not Enough (for Adaptive Conformal Classification)
- Authors: Navid Akhavan Attar, Hesam Asadollahzadeh, Ling Luo, Uwe Aickelin,
- Abstract summary: Conformal Prediction (CP) is a distribution-free framework for uncertainty quantification. Nonconformity scores are derived from softmax outputs, which can be unreliable indicators of how certain the model truly is about a given input. We propose a new approach that leverages information from the pre-softmax logit space, using the Helmholtz Free Energy as a measure of model uncertainty and sample difficulty.
- Score: 5.184894950445513
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The merit of Conformal Prediction (CP), as a distribution-free framework for uncertainty quantification, depends on generating prediction sets that are efficient, reflected in small average set sizes, while remaining adaptive, meaning that they signal uncertainty by varying in size with input difficulty. A central limitation for deep conformal classifiers is that the nonconformity scores are derived from softmax outputs, which can be unreliable indicators of how certain the model truly is about a given input, sometimes leading to overconfident misclassifications or undue hesitation. In this work, we argue that this unreliability can be inherited by the prediction sets generated by CP, limiting their capacity for adaptiveness. We propose a new approach that leverages information from the pre-softmax logit space, using the Helmholtz Free Energy as a measure of model uncertainty and sample difficulty. By reweighting nonconformity scores with a monotonic transformation of the energy score of each sample, we improve their sensitivity to input difficulty. Our experiments with four state-of-the-art score functions on multiple datasets and deep architectures show that this energy-based enhancement improves the adaptiveness of the prediction sets, leading to a notable increase in both efficiency and adaptiveness compared to baseline nonconformity scores, without introducing any post-hoc complexity.
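The reweighting idea in the abstract can be sketched as follows: compute the Helmholtz free energy E(x) = -T log Σ_i exp(z_i / T) from the logits, scale the base nonconformity scores by a monotonic transform of the energy, and calibrate a split-conformal threshold on the reweighted scores. This is a minimal illustration; the exponential weighting used here is a placeholder, not necessarily the paper's exact monotonic transformation.

```python
import numpy as np

def free_energy(logits, T=1.0):
    """Helmholtz free energy per sample: E(x) = -T * log sum_i exp(z_i / T).
    Computed via a stable log-sum-exp; lower energy = more confident logits."""
    z = logits / T
    m = z.max(axis=1, keepdims=True)
    return -T * (m[:, 0] + np.log(np.exp(z - m).sum(axis=1)))

def energy_weighted_scores(base_scores, logits, T=1.0):
    """Scale base nonconformity scores by a monotonic transform of the energy.
    Higher (less negative) energy marks harder samples, inflating their scores.
    The exp(.) transform is illustrative, not the paper's stated choice."""
    E = free_energy(logits, T)
    return base_scores * np.exp(E - E.mean())

def conformal_threshold(cal_scores, alpha=0.1):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest
    calibration score, giving finite-sample marginal coverage >= 1 - alpha."""
    n = len(cal_scores)
    k = min(int(np.ceil((n + 1) * (1 - alpha))), n)
    return np.sort(cal_scores)[k - 1]
```

At test time, a prediction set would include every label whose reweighted score falls at or below the calibrated threshold.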
Related papers
- ConformalHDC: Uncertainty-Aware Hyperdimensional Computing with Application to Neural Decoding [2.5805874648322664]
We introduce ConformalHDC, a unified framework that combines the statistical guarantees of conformal prediction with the computational efficiency of Hyperdimensional Computing. We show that ConformalHDC not only accurately decodes the stimulus information represented in the neural activity data, but also provides rigorous uncertainty estimates and correctly abstains when presented with data from other behavioral states.
arXiv Detail & Related papers (2026-02-24T23:52:08Z) - DANCE: Doubly Adaptive Neighborhood Conformal Estimation [12.643121779828526]
We propose a doubly locally adaptive nearest-neighbor based conformal algorithm combining two novel nonconformity scores directly using the data's embedded representation. We test against state-of-the-art local, task-adapted and zero-shot conformal baselines, demonstrating DANCE's superior blend of set size efficiency and robustness across various datasets.
arXiv Detail & Related papers (2026-02-24T07:54:53Z) - FIVA: Federated Inverse Variance Averaging for Universal CT Segmentation with Uncertainty Estimation [4.544160712377809]
This work presents a novel federated learning approach to achieve universal segmentation across diverse abdominal CT datasets. The proposed method quantifies prediction uncertainty by propagating the uncertainty from the model weights. Experimental evaluations demonstrate the effectiveness of this approach in improving the quality of federated aggregation and uncertainty-weighted inference.
arXiv Detail & Related papers (2025-08-08T11:34:01Z) - COIN: Uncertainty-Guarding Selective Question Answering for Foundation Models with Provable Risk Guarantees [51.5976496056012]
COIN is an uncertainty-guarding selection framework that calibrates statistically valid thresholds to filter a single generated answer per question. COIN estimates the empirical error rate on a calibration set and applies confidence interval methods to establish a high-probability upper bound on the true error rate. We demonstrate COIN's robustness in risk control, strong test-time power in retaining admissible answers, and predictive efficiency under limited calibration data.
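The calibration step described above, bounding the true error rate from calibration errors, can be sketched with a one-sided Hoeffding bound. The choice of interval method is illustrative; the paper may rely on tighter intervals (e.g. Clopper-Pearson), and the `admissible` helper is a hypothetical wrapper, not COIN's API.

```python
import math

def error_upper_bound(k, n, delta=0.05):
    """One-sided Hoeffding upper bound on the true error rate, holding with
    probability >= 1 - delta, given k observed errors among n calibration
    answers. Illustrative; tighter binomial intervals also fit this role."""
    return min(1.0, k / n + math.sqrt(math.log(1.0 / delta) / (2.0 * n)))

def admissible(confidence, threshold, k, n, risk, delta=0.05):
    """Retain an answer only if its confidence clears the calibrated
    threshold and the certified error bound stays within the risk budget."""
    return confidence >= threshold and error_upper_bound(k, n, delta) <= risk
```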
arXiv Detail & Related papers (2025-06-25T07:04:49Z) - SeWA: Selective Weight Average via Probabilistic Masking [51.015724517293236]
We show that only a few points are needed to achieve better and faster convergence. We transform the discrete selection problem into a continuous subset optimization framework. We derive SeWA's stability bounds, which are sharper than existing bounds in both convex and non-convex settings.
arXiv Detail & Related papers (2025-02-14T12:35:21Z) - Generative Conformal Prediction with Vectorized Non-Conformity Scores [6.059745771017814]
Conformal prediction provides model-agnostic uncertainty quantification with guaranteed coverage. We propose a generative conformal prediction framework with vectorized non-conformity scores. We construct adaptive uncertainty sets using density-ranked uncertainty balls.
arXiv Detail & Related papers (2024-10-17T16:37:03Z) - Federated Conformal Predictors for Distributed Uncertainty Quantification [83.50609351513886]
Conformal prediction is emerging as a popular paradigm for providing rigorous uncertainty quantification in machine learning.
In this paper, we extend conformal prediction to the federated learning setting.
We propose a weaker notion of partial exchangeability, better suited to the FL setting, and use it to develop the Federated Conformal Prediction framework.
arXiv Detail & Related papers (2023-05-27T19:57:27Z) - Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z) - Adaptive Conformal Prediction by Reweighting Nonconformity Score [0.0]
We use a Quantile Regression Forest (QRF) to learn the distribution of nonconformity scores and utilize the QRF's weights to assign more importance to samples with residuals similar to the test point.
Our approach enjoys assumption-free finite-sample marginal and training-conditional coverage, and under suitable assumptions it also ensures conditional coverage.
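The sample-reweighting step above amounts to taking a weighted quantile of the calibration nonconformity scores. The sketch below assumes the weights are supplied externally (the QRF-derived weighting itself is model-specific and omitted here); `weighted_quantile` is a hypothetical helper name.

```python
import numpy as np

def weighted_quantile(scores, weights, alpha=0.1):
    """(1 - alpha) quantile of calibration scores under normalized weights,
    as used by locally adaptive conformal methods that upweight calibration
    points resembling the test point. Uniform weights recover the standard
    empirical quantile."""
    order = np.argsort(scores)
    s, w = scores[order], weights[order]
    cum = np.cumsum(w) / w.sum()          # weighted empirical CDF
    idx = np.searchsorted(cum, 1 - alpha)  # first score whose CDF >= 1 - alpha
    return s[min(idx, len(s) - 1)]
```

Upweighting calibration points with residuals similar to the test point shifts this quantile, so the resulting prediction interval adapts to local difficulty.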
arXiv Detail & Related papers (2023-03-22T16:42:19Z) - Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and a significant reduction in memory consumption, but they can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z) - Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.