Towards Calibrating Prompt Tuning of Vision-Language Models
- URL: http://arxiv.org/abs/2602.19024v1
- Date: Sun, 22 Feb 2026 03:26:23 GMT
- Title: Towards Calibrating Prompt Tuning of Vision-Language Models
- Authors: Ashshak Sharifdeen, Fahad Shamshad, Muhammad Akhtar Munir, Abhishek Basu, Mohamed Insaf Ismithdeen, Jeyapriyan Jeyamohan, Chathurika Sewwandi Silva, Karthik Nandakumar, Muhammad Haris Khan
- Abstract summary: We propose a calibration framework that enhances predictive reliability while preserving the geometry of the pretrained CLIP embedding space. Our approach significantly reduces the Expected Calibration Error (ECE) compared to competitive calibration techniques on both base and novel classes.
- Score: 40.60254526955107
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt tuning of large-scale vision-language models such as CLIP enables efficient task adaptation without updating model weights. However, it often leads to poor confidence calibration and unreliable predictive uncertainty. We address this problem by proposing a calibration framework that enhances predictive reliability while preserving the geometry of the pretrained CLIP embedding space, which is required for robust generalization. Our approach extends the standard cross-entropy loss with two complementary regularizers: (1) a mean-variance margin penalty that stabilizes inter-class logit margins by maximizing their average while minimizing dispersion, mitigating underconfidence and overconfidence spikes; and (2) a text moment-matching loss that aligns the first and second moments of tuned text embeddings with their frozen CLIP counterparts, preserving semantic dispersion crucial for generalization. Through extensive experiments across 7 prompt-tuning methods and 11 diverse datasets, we demonstrate that our approach significantly reduces the Expected Calibration Error (ECE) compared to competitive calibration techniques on both base and novel classes.
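The abstract names the metric and the two regularizers but does not give their formulas, so the following is only a rough sketch under stated assumptions, not the authors' implementation. The binned ECE follows the common definition. The margin definition (true-class logit minus the largest other logit), the `var_weight` parameter, and the use of per-dimension means and variances as the "first and second moments" are all illustrative assumptions, as are the function names.

```python
from statistics import fmean, pvariance


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - confidence| per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Equal-width bins over [0, 1]; confidence 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, float(hit)))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = fmean([c for c, _ in members])
        accuracy = fmean([h for _, h in members])
        ece += len(members) / total * abs(accuracy - avg_conf)
    return ece


def margin_mean_variance_penalty(logits, labels, var_weight=1.0):
    """Assumed form: reward a large average margin, penalize its variance."""
    margins = []
    for row, y in zip(logits, labels):
        # Assumed margin: true-class logit minus the strongest competitor.
        best_other = max(v for k, v in enumerate(row) if k != y)
        margins.append(row[y] - best_other)
    return -fmean(margins) + var_weight * pvariance(margins)


def text_moment_matching_loss(tuned, frozen):
    """Assumed form: squared gaps between per-dimension means and variances."""
    loss = 0.0
    for d in range(len(frozen[0])):
        t_col = [emb[d] for emb in tuned]
        f_col = [emb[d] for emb in frozen]
        loss += (fmean(t_col) - fmean(f_col)) ** 2          # first moment
        loss += (pvariance(t_col) - pvariance(f_col)) ** 2  # second moment
    return loss


if __name__ == "__main__":
    conf = [0.9, 0.9, 0.6, 0.6]
    hits = [1, 1, 1, 0]
    print("ECE:", expected_calibration_error(conf, hits))
    print("margin penalty:",
          margin_mean_variance_penalty([[2.0, 0.0], [1.0, 3.0]], [0, 1]))
    print("moment loss:",
          text_moment_matching_loss([[1.0, 1.0], [3.0, 3.0]],
                                    [[0.0, 0.0], [2.0, 2.0]]))
```

In a training loop these two terms would presumably be added to the cross-entropy loss with tunable weights; the weighting scheme is not stated in the abstract, so it is left out here.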
Related papers
- LATA: Laplacian-Assisted Transductive Adaptation for Conformal Uncertainty in Medical VLMs [61.06744611795341]
Medical vision-language models (VLMs) are strong zero-shot recognizers for medical imaging. We propose LATA (Laplacian-Assisted Transductive Adaptation), a training- and label-free refinement. LATA sharpens zero-shot predictions without compromising exchangeability.
arXiv Detail & Related papers (2026-02-19T16:45:38Z)
- Uncertainty-Aware Post-Hoc Calibration: Mitigating Confidently Incorrect Predictions Beyond Calibration Metrics [6.9681910774977815]
This paper presents a post-hoc calibration framework to enhance calibration quality and uncertainty-aware decision-making. A comprehensive evaluation is conducted using calibration metrics, uncertainty-aware performance measures, and empirical conformal coverage. Experiments show that the proposed method achieves fewer confidently incorrect predictions and competitive Expected Calibration Error compared with isotonic and focal-loss baselines.
arXiv Detail & Related papers (2025-10-19T23:55:36Z)
- Advancing Reliable Test-Time Adaptation of Vision-Language Models under Visual Variations [67.35596444651037]
Vision-language models (VLMs) exhibit remarkable zero-shot capabilities but struggle with distribution shifts in downstream tasks when labeled data is unavailable. We propose a Reliable Test-time Adaptation (ReTA) method that enhances reliability from two perspectives.
arXiv Detail & Related papers (2025-07-13T05:37:33Z)
- CLUE: Neural Networks Calibration via Learning Uncertainty-Error alignment [7.702016079410588]
We introduce CLUE (Calibration via Learning Uncertainty-Error Alignment), a novel approach that aligns predicted uncertainty with observed error during training. We show that CLUE achieves superior calibration quality and competitive predictive performance with respect to state-of-the-art approaches.
arXiv Detail & Related papers (2025-05-28T19:23:47Z)
- Adaptive Set-Mass Calibration with Conformal Prediction [60.47079469141295]
We develop a new calibration procedure that starts with conformal prediction to obtain a set of labels that gives the desired coverage. We then instantiate two simple post-hoc calibrators: a mass normalization and a temperature scaling-based rule, tuned to the conformal constraint.
arXiv Detail & Related papers (2025-05-21T12:18:15Z)
- Calibration Strategies for Robust Causal Estimation: Theoretical and Empirical Insights on Propensity Score-Based Estimators [0.6562256987706128]
Partitioning of data for estimation and calibration critically impacts the performance of propensity score-based estimators. We extend recent advances in calibration techniques for propensity score estimation, improving the robustness of propensity scores in challenging settings.
arXiv Detail & Related papers (2025-03-21T16:41:10Z)
- Rethinking Early Stopping: Refine, Then Calibrate [49.966899634962374]
We present a novel variational formulation of the calibration-refinement decomposition. We provide theoretical and empirical evidence that calibration and refinement errors are not minimized simultaneously during training.
arXiv Detail & Related papers (2025-01-31T15:03:54Z)
- A Confidence Interval for the $\ell_2$ Expected Calibration Error [35.88784957918326]
We develop confidence intervals for the $\ell_2$ Expected Calibration Error (ECE). We consider top-1-to-$k$ calibration, which includes both the popular notion of confidence calibration as well as full calibration. For a debiased estimator of the ECE, we show normality, but with different convergence rates and variances for calibrated and miscalibrated models.
arXiv Detail & Related papers (2024-08-16T20:00:08Z)
- Towards Calibrated Deep Clustering Network [60.71776081164377]
In deep clustering, the estimated confidence for a sample belonging to a particular cluster greatly exceeds its actual prediction accuracy. We propose a novel dual head (calibration head and clustering head) deep clustering model that can effectively calibrate the estimated confidence and the actual accuracy. The proposed calibrated deep clustering model not only surpasses the state-of-the-art deep clustering methods by 5x on average in terms of expected calibration error, but also significantly outperforms them in terms of clustering accuracy.
arXiv Detail & Related papers (2024-03-04T11:23:40Z)
- Towards Calibrated Robust Fine-Tuning of Vision-Language Models [97.19901765814431]
This work proposes a robust fine-tuning method that improves both OOD accuracy and confidence calibration simultaneously in vision language models.
We show that both OOD classification and OOD calibration errors have a shared upper bound consisting of two terms of ID data.
Based on this insight, we design a novel framework that conducts fine-tuning with a constrained multimodal contrastive loss enforcing a larger smallest singular value.
arXiv Detail & Related papers (2023-11-03T05:41:25Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Calibration-Aware Bayesian Learning [37.82259435084825]
This paper proposes an integrated framework, referred to as calibration-aware Bayesian neural networks (CA-BNNs).
It applies both data-dependent and data-independent regularizers while optimizing over a variational distribution as in Bayesian learning.
Numerical results validate the advantages of the proposed approach in terms of expected calibration error (ECE) and reliability diagrams.
arXiv Detail & Related papers (2023-05-12T14:19:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.