Fugu-MT paper translation (abstract): Prompt Estimation from Prototypes for Federated Prompt Tuning of Vision Transformers

Paper summary: Prompt Estimation from Prototypes for Federated Prompt Tuning of Vision Transformers

arxiv url: http://arxiv.org/abs/2510.25372v1
Date: Wed, 29 Oct 2025 10:42:56 GMT
Status: translation complete
System update date: 2025-10-30 15:50:45.403159
Title: Prompt Estimation from Prototypes for Federated Prompt Tuning of Vision Transformers
Title (reference translation): Prompt Estimation from Prototypes for Federated Prompt Tuning of Vision Transformers
Authors: M Yashwanth, Sharannya Ghosh, Aditay Tripathi, Anirban Chakraborty
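Abstract summary: We propose PEP-FedPT (Prompt Estimation from Prototypes for Federated Prompt Tuning), which achieves both generalization and personalization in federated prompt tuning of Vision Transformers (ViTs). We introduce a novel Class-Contextualized Mixed Prompt (CCMP), based on class-specific prompts maintained alongside a globally shared prompt. PEP-FedPT consistently outperforms state-of-the-art baselines under diverse data scenarios.
Reference score (independently computed attention score): 5.231417382224748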
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Visual Prompt Tuning (VPT) of pre-trained Vision Transformers (ViTs) has proven highly effective as a parameter-efficient fine-tuning technique for adapting large models to downstream tasks with limited data. Its parameter efficiency makes it particularly suitable for Federated Learning (FL), where both communication and computation budgets are often constrained. However, global prompt tuning struggles to generalize across heterogeneous clients, while personalized tuning overfits to local data and lacks generalization. We propose PEP-FedPT (Prompt Estimation from Prototypes for Federated Prompt Tuning), a unified framework designed to achieve both generalization and personalization in federated prompt tuning of ViTs. Within this framework, we introduce the novel Class-Contextualized Mixed Prompt (CCMP), based on class-specific prompts maintained alongside a globally shared prompt. For each input, CCMP adaptively combines class-specific prompts using weights derived from global class prototypes and client class priors. This approach enables per-sample prompt personalization without storing client-dependent trainable parameters. The prompts are collaboratively optimized via the traditional federated averaging technique. Comprehensive evaluations on CIFAR-100, TinyImageNet, DomainNet, and iNaturalist datasets demonstrate that PEP-FedPT consistently surpasses the state-of-the-art baselines under diverse data heterogeneity scenarios, establishing a strong foundation for efficient and generalizable federated prompt tuning of Vision Transformers.
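Abstract (reference translation): Visual Prompt Tuning (VPT) of pre-trained Vision Transformers (ViTs) has proven to be a highly effective parameter-efficient fine-tuning technique for adapting large models to downstream tasks with limited data. This parameter efficiency makes it especially well suited to Federated Learning (FL), where communication and computation budgets are often constrained. However, global prompt tuning struggles to generalize across heterogeneous clients, while personalized tuning overfits to local data and lacks generalization. We propose PEP-FedPT (Prompt Estimation from Prototypes for Federated Prompt Tuning). Within this framework, we introduce a novel Class-Contextualized Mixed Prompt (CCMP) built on class-specific prompts maintained alongside a globally shared prompt. For each input, CCMP adaptively combines the class-specific prompts using weights derived from global class prototypes and client class priors. This approach enables per-sample personalization without storing client-dependent trainable parameters. The prompts are collaboratively optimized with conventional federated averaging. Comprehensive evaluations on the CIFAR-100, TinyImageNet, DomainNet, and iNaturalist datasets show that PEP-FedPT consistently outperforms state-of-the-art baselines under diverse data heterogeneity scenarios, establishing a strong foundation for efficient and generalizable federated prompt tuning of Vision Transformers.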
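
Illustrative sketch (not from the paper): the abstract says that CCMP combines class-specific prompts per input using weights derived from global class prototypes and client class priors, and that the prompts are optimized with federated averaging, but it does not give the formulas. The Python below is a minimal sketch under stated assumptions: the weights are taken to be a softmax over temperature-scaled cosine similarity to each prototype plus the log of the client class prior, and the averaging is weighted by client sample count. The names mix_class_prompts, federated_average, and temperature are hypothetical, and how the mixed prompt is combined with the globally shared prompt is left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def mix_class_prompts(feature, prototypes, class_prior, class_prompts,
                      temperature=0.1):
    """Return (weights, mixed_prompt) for a single input feature.

    ASSUMPTION: score_c = cos(feature, prototype_c) / temperature
                          + log(prior_c).
    The abstract only states that the weights come from global class
    prototypes and client class priors; this scoring rule is a guess.
    """
    scores = [
        cosine(feature, proto) / temperature + math.log(prior + 1e-12)
        for proto, prior in zip(prototypes, class_prior)
    ]
    weights = softmax(scores)
    dim = len(class_prompts[0])
    # Convex combination of the class-specific prompt vectors.
    mixed = [
        sum(w * prompt[d] for w, prompt in zip(weights, class_prompts))
        for d in range(dim)
    ]
    return weights, mixed


def federated_average(client_params, client_sizes):
    """Sample-count-weighted mean of flat parameter vectors (FedAvg-style)."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(n * params[d] for params, n in zip(client_params, client_sizes)) / total
        for d in range(dim)
    ]


# Toy usage: two classes, 2-d features, 3-d prompts.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
class_prompts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
weights, mixed = mix_class_prompts([0.9, 0.1], prototypes, [0.5, 0.5], class_prompts)
```

In this sketch the client contributes only its class prior, a statistic rather than a trainable parameter, which matches the abstract's claim of per-sample personalization without client-dependent trainable parameters. A client whose prior for a class is zero effectively never selects that class's prompt.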

Related papers