Related papers: A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models

URL: http://arxiv.org/abs/2510.26441v1
Date: Thu, 30 Oct 2025 12:45:24 GMT
Title: A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models
Authors: Shihab Aaqil Ahamed, Udaya S. K. P. Miriya Thanthrige, Ranga Rodrigo, Muhammad Haris Khan
Abstract summary: Test-time prompt tuning (TPT) has emerged as a promising technique for adapting large vision-language models (VLMs) to unseen tasks without relying on labeled data. We propose A-TPT, a novel TPT framework that introduces angular diversity to encourage uniformity in the distribution of normalized textual features. We show that our approach consistently surpasses state-of-the-art TPT methods in reducing the aggregate average calibration error.
Score: 19.257897956175814
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Test-time prompt tuning (TPT) has emerged as a promising technique for adapting large vision-language models (VLMs) to unseen tasks without relying on labeled data. However, the lack of dispersion between textual features can hurt calibration performance, which raises concerns about VLMs' reliability, trustworthiness, and safety. Current TPT approaches primarily focus on improving prompt calibration by either maximizing average textual feature dispersion or enforcing orthogonality constraints to encourage angular separation. However, these methods may not always have optimal angular separation between class-wise textual features, which implies overlooking the critical role of angular diversity. To address this, we propose A-TPT, a novel TPT framework that introduces angular diversity to encourage uniformity in the distribution of normalized textual features induced by corresponding learnable prompts. This uniformity is achieved by maximizing the minimum pairwise angular distance between features on the unit hypersphere. We show that our approach consistently surpasses state-of-the-art TPT methods in reducing the aggregate average calibration error while maintaining comparable accuracy through extensive experiments with various backbones on different datasets. Notably, our approach exhibits superior zero-shot calibration performance on natural distribution shifts and generalizes well to medical datasets. We provide extensive analyses, including theoretical aspects, to establish the grounding of A-TPT. These results highlight the potency of promoting angular diversity to achieve well-dispersed textual features, significantly improving VLM calibration during test-time adaptation. Our code will be made publicly available.
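
The abstract describes the core quantity as the minimum pairwise angular distance between normalized textual features on the unit hypersphere. The snippet below is a minimal sketch of how that quantity could be computed, not the authors' implementation: the function names `min_pairwise_angle` and `angular_diversity_penalty`, the use of NumPy, and the toy feature matrices are all assumptions made for illustration.

```python
import numpy as np


def min_pairwise_angle(features: np.ndarray) -> float:
    """Smallest angle (in radians) between any two rows of `features`.

    `features` is a (num_classes, dim) array with at least two rows; each row
    stands in for the textual feature of one class (hypothetical input).
    """
    # Project every feature onto the unit hypersphere.
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    # Cosine similarity between every pair of classes, clipped for arccos.
    cos = np.clip(unit @ unit.T, -1.0, 1.0)
    # Exclude each feature's similarity with itself.
    np.fill_diagonal(cos, -1.0)
    # The largest remaining cosine belongs to the closest pair of features.
    return float(np.arccos(cos.max()))


def angular_diversity_penalty(features: np.ndarray) -> float:
    """Assumed regularizer: minimizing it maximizes the minimum angle."""
    return -min_pairwise_angle(features)


# Toy example: well-separated features versus two nearly collinear ones.
spread = np.eye(3)
clustered = np.array([[1.0, 0.0, 0.0], [1.0, 0.1, 0.0], [0.0, 0.0, 1.0]])
print(min_pairwise_angle(spread), min_pairwise_angle(clustered))
```

In this toy case the mutually orthogonal features have a minimum angle of pi/2, while the clustered set is dominated by its closest pair. A worst-pair statistic of this kind reacts to a single pair of nearly collinear class features even when the average dispersion looks healthy, which is the gap the abstract attributes to dispersion-based and orthogonality-based approaches.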

Related papers

Benchmarking Few-shot Transferability of Pre-trained Models with Improved Evaluation Protocols [123.73663884421272]
Few-shot transfer has been revolutionized by stronger pre-trained models and improved adaptation algorithms. We establish FEWTRANS, a comprehensive benchmark containing 10 diverse datasets. By releasing FEWTRANS, we aim to provide a rigorous "ruler" to streamline reproducible advances in few-shot transfer learning research.
arXiv Detail & Related papers (2026-02-28T05:41:57Z)
Fair Context Learning for Evidence-Balanced Test-Time Adaptation in Vision-Language Models [10.45965859391796]
Test-Time Adaptation (TTA) aims to improve robustness using only unlabeled test samples. Most prompt-based TTA methods rely on entropy minimization. We propose Fair Context Learning (FCL) that avoids entropy minimization by explicitly addressing shared-evidence bias.
arXiv Detail & Related papers (2026-02-02T16:02:50Z)
D-TPT: Dimensional Entropy Maximization for Calibrating Test-Time Prompt Tuning in Vision-Language Models [5.770351255180494]
The test-time adaptation paradigm provides flexibility towards domain shifts. Vision-Language Models (VLMs) leverage their generalization capabilities for diverse downstream tasks.
arXiv Detail & Related papers (2025-10-10T15:27:44Z)
Test time training enhances in-context learning of nonlinear functions [51.56484100374058]
Test-time training (TTT) enhances model performance by explicitly updating designated parameters prior to each prediction. We investigate the combination of TTT with in-context learning (ICL), where the model is given a few examples from the target distribution at inference time.
arXiv Detail & Related papers (2025-09-30T03:56:44Z)
CLIPTTA: Robust Contrastive Vision-Language Test-Time Adaptation [15.732351927470452]
Vision-language models (VLMs) like CLIP exhibit strong zero-shot capabilities but often fail to generalize under distribution shifts. Test-time adaptation (TTA) allows models to update at inference time without labeled data, typically via entropy minimization. We propose CLIPTTA, a new gradient-based TTA method for vision-language models that leverages a soft contrastive loss aligned with CLIP's pre-training objective.
arXiv Detail & Related papers (2025-07-18T18:32:17Z)
O-TPT: Orthogonality Constraints for Calibrating Test-time Prompt Tuning in Vision-Language Models [17.56932003351322]
Test-time prompt tuning for vision-language models (VLMs) is getting attention because of its ability to learn with unlabeled data without fine-tuning. The resulting models tend to demonstrate poor calibration, which casts doubt on the reliability and trustworthiness of these models. We propose a new approach, called O-TPT, that introduces orthogonality constraints on the textual features corresponding to the learnable prompts.
arXiv Detail & Related papers (2025-03-15T11:45:54Z)
C-TPT: Calibrated Test-Time Prompt Tuning for Vision-Language Models via Text Feature Dispersion [54.81141583427542]
In deep learning, test-time adaptation has gained attention as a method for model fine-tuning without the need for labeled data. This paper explores calibration during test-time prompt tuning by leveraging the inherent properties of CLIP. We present a novel method, Calibrated Test-time Prompt Tuning (C-TPT), for optimizing prompts during test-time with enhanced calibration.
arXiv Detail & Related papers (2024-03-21T04:08:29Z)
Semi-Supervised Coupled Thin-Plate Spline Model for Rotation Correction and Beyond [84.56978780892783]
We propose CoupledTPS, which iteratively couples multiple TPS with limited control points into a more flexible and powerful transformation. In light of the laborious annotation cost, we develop a semi-supervised learning scheme to improve warping quality by exploiting unlabeled data. Experiments demonstrate the superiority and universality of CoupledTPS over the existing state-of-the-art solutions for rotation correction.
arXiv Detail & Related papers (2024-01-24T13:03:28Z)
Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning [73.75282761503581]
We propose DiffTPT, which leverages pre-trained diffusion models to generate diverse and informative new data. Our experiments on test datasets with distribution shifts and unseen categories demonstrate that DiffTPT improves the zero-shot accuracy by an average of 5.13%.
arXiv Detail & Related papers (2023-08-11T09:36:31Z)
Fourier Test-time Adaptation with Multi-level Consistency for Robust Classification [10.291631977766672]
We propose a novel approach called Fourier Test-time Adaptation (FTTA) to integrate input and model tuning. FTTA builds a reliable multi-level consistency measurement of paired inputs to achieve self-supervision of prediction. It was extensively validated on three large classification datasets with different modalities and organs.
arXiv Detail & Related papers (2023-06-05T02:29:38Z)
ADEPT: A DEbiasing PrompT Framework [64.54665501064659]
Finetuning is an applicable approach for debiasing contextualized word embeddings. Discrete prompts with semantic meanings have been shown to be effective in debiasing tasks. We propose ADEPT, a method to debias PLMs using prompt tuning while maintaining the delicate balance between removing biases and ensuring representation ability.
arXiv Detail & Related papers (2022-11-10T08:41:40Z)
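
Several entries above note that test-time adaptation typically relies on entropy minimization. As a rough illustration of what such an objective measures, the sketch below computes the entropy of class probabilities averaged over several views of one test sample. It is a generic example and not the method of any listed paper: the function names, the NumPy implementation, and the toy logits are assumptions.

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def marginal_entropy(view_logits: np.ndarray) -> float:
    """Entropy of the prediction averaged over views.

    `view_logits` has shape (num_views, num_classes); each row is assumed to
    hold the logits for one augmented view of the same unlabeled sample.
    """
    mean_probs = softmax(view_logits).mean(axis=0)
    # A small constant guards against log(0).
    return float(-(mean_probs * np.log(mean_probs + 1e-12)).sum())


# Toy logits: an undecided prediction versus a confident one.
undecided = np.zeros((4, 3))
confident = np.tile(np.array([10.0, 0.0, 0.0]), (4, 1))
print(marginal_entropy(undecided), marginal_entropy(confident))
```

Minimizing this value pushes the averaged prediction toward a single class, so it rewards confidence whether or not that confidence is warranted. This is consistent with the calibration concerns raised in the C-TPT, O-TPT, and A-TPT entries.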

This list is automatically generated from the titles and abstracts of the papers in this site.