Similarity-as-Evidence: Calibrating Overconfident VLMs for Interpretable and Label-Efficient Medical Active Learning
- URL: http://arxiv.org/abs/2602.18867v1
- Date: Sat, 21 Feb 2026 15:21:54 GMT
- Title: Similarity-as-Evidence: Calibrating Overconfident VLMs for Interpretable and Label-Efficient Medical Active Learning
- Authors: Zhuofan Xie, Zishan Lin, Jinliang Lin, Jie Qi, Shaohua Hong, Shuo Li
- Abstract summary: Similarity-as-Evidence (SaE) calibrates text-image similarities by introducing a Similarity Evidence Head (SEH). SaE attains state-of-the-art macro-averaged accuracy of 82.57% on medical imaging datasets with a 20% label budget.
- Score: 10.264467364282865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active Learning (AL) reduces annotation costs in medical imaging by selecting only the most informative samples for labeling, but suffers from cold-start when labeled data are scarce. Vision-Language Models (VLMs) address the cold-start problem via zero-shot predictions, yet their temperature-scaled softmax outputs treat text-image similarities as deterministic scores while ignoring inherent uncertainty, leading to overconfidence. This overconfidence misleads sample selection, wasting annotation budgets on uninformative cases. To overcome these limitations, the Similarity-as-Evidence (SaE) framework calibrates text-image similarities by introducing a Similarity Evidence Head (SEH), which reinterprets the similarity vector as evidence and parameterizes a Dirichlet distribution over labels. In contrast to a standard softmax that enforces confident predictions even under weak signals, the Dirichlet formulation explicitly quantifies lack of evidence (vacuity) and conflicting evidence (dissonance), thereby mitigating overconfidence caused by rigid softmax normalization. Building on this, SaE employs a dual-factor acquisition strategy: high-vacuity samples (e.g., rare diseases) are prioritized in early rounds to ensure coverage, while high-dissonance samples (e.g., ambiguous diagnoses) are prioritized later to refine boundaries, providing clinically interpretable selection rationales. Experiments on ten public medical imaging datasets with a 20% label budget show that SaE attains state-of-the-art macro-averaged accuracy of 82.57%. On the representative BTMRI dataset, SaE also achieves superior calibration, with a negative log-likelihood (NLL) of 0.425.
Related papers
- From Calibration to Refinement: Seeking Certainty via Probabilistic Evidence Propagation for Noisy-Label Person Re-Identification [40.73759251488672]
Existing noise-robust person Re-ID methods rely on loss-correction or sample-selection strategies using softmax outputs. We propose the CAlibration-to-REfinement (CARE) method, a two-stage framework that seeks certainty through probabilistic evidence propagation from calibration to refinement. In the refinement stage, we design the evidence propagation refinement (EPR), which can more accurately distinguish between clean and noisy samples.
arXiv Detail & Related papers (2026-02-26T15:50:15Z) - LATA: Laplacian-Assisted Transductive Adaptation for Conformal Uncertainty in Medical VLMs [61.06744611795341]
Medical vision-language models (VLMs) are strong zero-shot recognizers for medical imaging. We propose LATA (Laplacian-Assisted Transductive Adaptation), a training- and label-free refinement. LATA sharpens zero-shot predictions without compromising exchangeability.
arXiv Detail & Related papers (2026-02-19T16:45:38Z) - X-Mark: Saliency-Guided Robust Dataset Ownership Verification for Medical Imaging [67.85884025186755]
High-quality medical imaging datasets are essential for training deep learning models, but their unauthorized use raises serious copyright and ethical concerns. Medical imaging presents a unique challenge for existing dataset ownership verification methods designed for natural images. We propose X-Mark, a sample-specific clean-label watermarking method for chest x-ray copyright protection.
arXiv Detail & Related papers (2026-02-10T00:03:43Z) - Boundary-Aware Adversarial Filtering for Reliable Diagnosis under Extreme Class Imbalance [1.2948544197525087]
We propose AF-SMOTE, a mathematically motivated augmentation framework that first synthesizes minority points and then filters them by an adversarial discriminator and a boundary utility model. We prove that, under mild assumptions on the decision boundary smoothness and class-conditional densities, our filtering step monotonically improves a surrogate of the F_β score. On MIMIC-IV proxy label prediction and canonical fraud detection benchmarks, AF-SMOTE attains higher recall and average precision than strong oversampling baselines.
arXiv Detail & Related papers (2025-11-19T02:15:58Z) - Label Uncertainty for Ultrasound Segmentation [25.682215047694168]
In medical imaging, inter-observer variability among radiologists often introduces label uncertainty. We introduce a novel approach to both labeling and training AI models using expert-supplied, per-pixel confidence values.
arXiv Detail & Related papers (2025-08-21T15:00:21Z) - SURE-Med: Systematic Uncertainty Reduction for Enhanced Reliability in Medical Report Generation [2.2185034594788164]
We propose SURE-Med, a unified framework that systematically reduces uncertainty across three critical dimensions: visual, distributional, and contextual. To mitigate visual uncertainty, a Frontal-Aware View Repair Resampling module corrects view annotation errors and adaptively selects informative features from supplementary views. To tackle label distribution uncertainty, we introduce a Token Sensitive Learning objective that enhances the modeling of critical diagnostic sentences. To reduce contextual uncertainty, our Contextual Evidence Filter validates and selectively incorporates prior information that aligns with the current image, effectively suppressing hallucinations.
arXiv Detail & Related papers (2025-08-03T09:52:30Z) - Conservative Prediction via Data-Driven Confidence Minimization [70.93946578046003]
In safety-critical applications of machine learning, it is often desirable for a model to be conservative.
We propose the Data-Driven Confidence Minimization framework, which minimizes confidence on an uncertainty dataset.
arXiv Detail & Related papers (2023-06-08T07:05:36Z) - Towards Reliable Medical Image Segmentation by Modeling Evidential Calibrated Uncertainty [57.023423137202485]
Concerns regarding the reliability of medical image segmentation persist among clinicians. We introduce DEviS, an easily implementable foundational model that seamlessly integrates into various medical image segmentation networks. By leveraging subjective logic theory, we explicitly model probability and uncertainty for medical image segmentation.
arXiv Detail & Related papers (2023-01-01T05:02:46Z) - Improving group robustness under noisy labels using predictive
uncertainty [0.9449650062296823]
We use the predictive uncertainty of a model to improve the worst-group accuracy under noisy labels.
We propose a novel ENtropy based Debiasing (END) framework that prevents models from learning the spurious cues while being robust to the noisy labels.
arXiv Detail & Related papers (2022-12-14T04:40:50Z) - Taming Overconfident Prediction on Unlabeled Data from Hindsight [50.9088560433925]
Minimizing prediction uncertainty on unlabeled data is a key factor to achieve good performance in semi-supervised learning.
This paper proposes a dual mechanism, named ADaptive Sharpening (ADS), which first applies a soft-threshold to adaptively mask out determinate and negligible predictions.
ADS significantly improves state-of-the-art SSL methods when used as a plug-in.
arXiv Detail & Related papers (2021-12-15T15:17:02Z) - Weakly-Supervised Cross-Domain Adaptation for Endoscopic Lesions Segmentation [79.58311369297635]
We propose a new weakly-supervised lesions transfer framework, which can explore transferable domain-invariant knowledge across different datasets.
A Wasserstein quantified transferability framework is developed to highlight wide-range transferable contextual dependencies.
A novel self-supervised pseudo label generator is designed to equally provide confident pseudo pixel labels for both hard-to-transfer and easy-to-transfer target samples.
arXiv Detail & Related papers (2020-12-08T02:26:03Z) - Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.