Doubly Debiased Test-Time Prompt Tuning for Vision-Language Models
- URL: http://arxiv.org/abs/2511.11690v1
- Date: Wed, 12 Nov 2025 09:35:31 GMT
- Title: Doubly Debiased Test-Time Prompt Tuning for Vision-Language Models
- Authors: Fei Song, Yi Li, Rui Wang, Jiahuan Zhou, Changwen Zheng, Jiangmeng Li,
- Abstract summary: Test-time prompt tuning for vision-language models has demonstrated impressive generalization capabilities under zero-shot settings.<n>We analyze the underlying causes of prompt optimization bias from both the model and data perspectives.
- Score: 43.35073848153914
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Test-time prompt tuning for vision-language models has demonstrated impressive generalization capabilities under zero-shot settings. However, tuning the learnable prompts solely based on unlabeled test data may induce prompt optimization bias, ultimately leading to suboptimal performance on downstream tasks. In this work, we analyze the underlying causes of prompt optimization bias from both the model and data perspectives. In terms of the model, the entropy minimization objective typically focuses on reducing the entropy of model predictions while overlooking their correctness. This can result in overconfident yet incorrect outputs, thereby compromising the quality of prompt optimization. On the data side, prompts affected by optimization bias can introduce misalignment between visual and textual modalities, which further aggravates the prompt optimization bias. To this end, we propose a Doubly Debiased Test-Time Prompt Tuning method. Specifically, we first introduce a dynamic retrieval-augmented modulation module that retrieves high-confidence knowledge from a dynamic knowledge base using the test image feature as a query, and uses the retrieved knowledge to modulate the predictions. Guided by the refined predictions, we further develop a reliability-aware prompt optimization module that incorporates a confidence-based weighted ensemble and cross-modal consistency distillation to impose regularization constraints during prompt tuning. Extensive experiments across 15 benchmark datasets involving both natural distribution shifts and cross-datasets generalization demonstrate that our method outperforms baselines, validating its effectiveness in mitigating prompt optimization bias.
Related papers
- APAO: Adaptive Prefix-Aware Optimization for Generative Recommendation [26.371939617653084]
Generative recommendation is an autoregressive generation process, predicting discrete tokens of the next item conditioned on user interaction histories.<n>Existing generative recommendation models are typically trained with token-level likelihood objectives, such as cross-entropy loss.<n>This leads to a training-inference inconsistency: standard training assumes ground-truth history is always available, ignoring the fact that beam search prunes low-probability branches during inference.
arXiv Detail & Related papers (2026-03-03T08:29:15Z) - Advancing Reliable Test-Time Adaptation of Vision-Language Models under Visual Variations [67.35596444651037]
Vision-language models (VLMs) exhibit remarkable zero-shot capabilities but struggle with distribution shifts in downstream tasks when labeled data is unavailable.<n>We propose a Reliable Test-time Adaptation (ReTA) method that enhances reliability from two perspectives.
arXiv Detail & Related papers (2025-07-13T05:37:33Z) - O-TPT: Orthogonality Constraints for Calibrating Test-time Prompt Tuning in Vision-Language Models [17.56932003351322]
Test-time prompt tuning for vision-language models (VLMs) is getting attention because of their ability to learn with unlabeled data without fine-tuning.<n>The resulting models tend to demonstrate poor calibration, which casts doubts on the reliability and trustworthiness of these models.<n>We propose a new approach, called O-TPT, that introduces orthogonality constraints on the textual features corresponding to the learnable prompts.
arXiv Detail & Related papers (2025-03-15T11:45:54Z) - In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement [71.60563181678323]
Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality.<n>To handle these challenges, a direct solution is to generate high-confidence'' data from unsupervised downstream tasks.<n>We propose a novel approach, pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision.
arXiv Detail & Related papers (2024-10-04T03:39:28Z) - Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Vision-Language Models [4.655740975414312]
This paper introduces Test-Time Low-rank adaptation (TTL) as an alternative to prompt tuning for zero-shot generalizations of large-scale vision-language models (VLMs)
TTL offers a test-time-efficient adaptation approach that updates the attention weights of the transformer by maximizing prediction confidence.
arXiv Detail & Related papers (2024-07-22T17:59:19Z) - Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction [56.17020601803071]
Recent research shows that pre-trained language models (PLMs) suffer from "prompt bias" in factual knowledge extraction.
This paper aims to improve the reliability of existing benchmarks by thoroughly investigating and mitigating prompt bias.
arXiv Detail & Related papers (2024-03-15T02:04:35Z) - Optimizer's Information Criterion: Dissecting and Correcting Bias in Data-Driven Optimization [16.57676001669012]
In data-driven optimization, the sample performance of the obtained decision typically incurs an optimistic bias against the true performance.<n>Common techniques to correct this bias, such as cross-validation, require repeatedly solving additional optimization problems and are therefore expensive.<n>We develop a general bias correction approach that directly approximates the first-order bias and does not require solving any additional optimization problems.
arXiv Detail & Related papers (2023-06-16T07:07:58Z) - Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z) - Fast Rates for Contextual Linear Optimization [52.39202699484225]
We show that a naive plug-in approach achieves regret convergence rates that are significantly faster than methods that directly optimize downstream decision performance.
Our results are overall positive for practice: predictive models are easy and fast to train using existing tools, simple to interpret, and, as we show, lead to decisions that perform very well.
arXiv Detail & Related papers (2020-11-05T18:43:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.