Fugu-MT paper translation (abstract): Improving Calibration in Test-Time Prompt Tuning for Vision-Language Models via Data-Free Flatness-Aware Prompt Pretraining

Paper abstract: Improving Calibration in Test-Time Prompt Tuning for Vision-Language Models via Data-Free Flatness-Aware Prompt Pretraining

arxiv url: http://arxiv.org/abs/2604.27715v1
Date: Thu, 30 Apr 2026 11:01:23 GMT
Status: translation complete
Last updated in system: 2026-05-01 16:31:54.056466
Title: Improving Calibration in Test-Time Prompt Tuning for Vision-Language Models via Data-Free Flatness-Aware Prompt Pretraining
Title (reference translation): Improving Calibration in Test-Time Prompt Tuning for Vision-Language Models via Data-Free Flatness-Aware Prompt Pretraining
Authors: Hyeonseo Jang, Jaebyeong Jeon, Joong-Won Hwang, Kibok Lee
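Abstract summary: Test-time prompt tuning (TPT) has emerged as a promising technique for enhancing the adaptability of vision-language models. Prior studies have observed that TPT often produces poorly calibrated models, raising concerns about the reliability of their predictions. The paper introduces FPP, a simple yet effective pretraining framework for TPT that initializes prompts within flatter regions of the loss landscape prior to adaptation.
Reference score (attention score computed independently by this site): 3.9486037760311725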
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Test-time prompt tuning (TPT) has emerged as a promising technique for enhancing the adaptability of vision-language models by optimizing textual prompts using unlabeled test data. However, prior studies have observed that TPT often produces poorly calibrated models, raising concerns about the reliability of their predictions. Recent works address this issue by incorporating additional regularization terms that constrain model outputs, which improve calibration but often degrade performance. In this work, we reveal that these regularization strategies implicitly encourage optimization toward flatter minima, and that the sharpness of the loss landscape around adapted prompts is a key factor governing calibration quality. Motivated by this observation, we introduce Flatness-aware Prompt Pretraining (FPP), a simple yet effective pretraining framework for TPT that initializes prompts within flatter regions of the loss landscape prior to adaptation. We show that simply replacing the initialization in existing TPT pipelines--without modifying any other components--is sufficient to improve both calibration and performance. Notably, FPP requires no labeled data and incurs no additional computational costs during test-time tuning, making it highly practical for real-world deployment. The code is available at: https://github.com/YonseiML/fpp.
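Abstract (reference translation): Test-time prompt tuning (TPT) has emerged as a promising technique for enhancing the adaptability of vision-language models by optimizing textual prompts using unlabeled test data. However, prior studies have observed that TPT often produces poorly calibrated models, raising concerns about the reliability of their predictions. Recent works address this issue by introducing additional regularization terms that constrain model outputs, which improves calibration but often degrades performance. This work shows that these regularization strategies implicitly encourage optimization toward flatter minima, and that the sharpness of the loss landscape around the adapted prompts is a key factor governing calibration quality. Based on this observation, the authors introduce Flatness-aware Prompt Pretraining (FPP), a simple yet effective pretraining framework for TPT that initializes prompts within flatter regions of the loss landscape prior to adaptation. They show that simply replacing the initialization in existing TPT pipelines, without modifying any other component, is sufficient to improve both calibration and performance. Notably, FPP requires no labeled data and incurs no additional computational cost during test-time tuning, making it highly practical for real-world deployment. The code is available at: https://github.com/YonseiML/fpp.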
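The abstract speaks of "calibration quality" without naming a metric. As a rough illustration only, the sketch below computes expected calibration error (ECE), a commonly used calibration measure. That the paper reports ECE is an assumption, and the function name, the bin count, and the equal-width binning are illustrative choices, not taken from the authors' code.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Illustrative ECE: bin-size-weighted mean of |accuracy - mean confidence|.

    confidences: predicted top-class probabilities in [0, 1]
    correct:     booleans, True where the prediction was right
    n_bins:      number of equal-width confidence bins (assumed value)
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Assign each prediction to a bin [k/n_bins, (k+1)/n_bins);
    # a confidence of exactly 1.0 goes into the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, hit in members if hit) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # Overconfident toy predictions: 90% confidence but only 75% accuracy.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                     [True, True, True, False]))
```

A lower value means confidence tracks accuracy more closely; a model that is right about 75% of the time while reporting 90% confidence is overconfident, which is the kind of miscalibration the abstract attributes to TPT.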


Related papers