Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models
- URL: http://arxiv.org/abs/2410.10114v2
- Date: Wed, 16 Oct 2024 12:30:53 GMT
- Title: Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models
- Authors: Jun Luo, Chen Chen, Shandong Wu,
- Abstract summary: We propose a novel framework that personalizes the prompt learning process through the lens of Mixture of Experts (MoE)
pFedMoAP implements a local attention-based gating network that learns to generate enhanced text features for better alignment with local image data on the client.
The results show that pFedMoAP consistently outperforms the state-of-the-art alternatives, underscoring its efficacy in personalizing prompt learning for CLIP.
- Score: 7.810284483002312
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt learning for pre-trained Vision-Language Models (VLMs) like CLIP has demonstrated potent applicability across diverse downstream tasks. This lightweight approach has quickly gained traction from federated learning (FL) researchers who seek to efficiently adapt VLMs to heterogeneous scenarios. However, current federated prompt learning methods are habitually restricted to the traditional FL paradigm, where the participating clients are generally only allowed to download a single globally aggregated model from the server. While justifiable for training full-sized models under federated settings, in this work, we argue that this paradigm is ill-suited for lightweight prompts. By facilitating the clients to download multiple pre-aggregated prompts as fixed non-local experts, we propose Personalized Federated Mixture of Adaptive Prompts (pFedMoAP), a novel FL framework that personalizes the prompt learning process through the lens of Mixture of Experts (MoE). pFedMoAP implements a local attention-based gating network that learns to generate enhanced text features for better alignment with local image data on the client, benefiting from both local and downloaded non-local adaptive prompt experts. The non-local experts are sparsely selected from a server-maintained pool, fostering collaborative learning across clients. To evaluate the proposed algorithm, we conduct extensive experiments across 9 datasets under various heterogeneous federated settings. The results show that pFedMoAP consistently outperforms the state-of-the-art alternatives, underscoring its efficacy in personalizing prompt learning for CLIP within the federated learning paradigm.
Related papers
- Open-Vocabulary Federated Learning with Multimodal Prototyping [14.95283651408951]
We present a novel adaptation framework tailored for vision-language models (VLMs) in the context of federated learning (FL)
Fed-MP adaptively aggregates the local model weights based on light-weight client residuals, and makes predictions based on a novel multimodal prototyping mechanism.
Our empirical evaluation on various datasets validates the effectiveness of Fed-MP.
arXiv Detail & Related papers (2024-04-01T16:51:13Z) - Tunable Soft Prompts are Messengers in Federated Learning [55.924749085481544]
Federated learning (FL) enables multiple participants to collaboratively train machine learning models using decentralized data sources.
The lack of model privacy protection in FL becomes an unneglectable challenge.
We propose a novel FL training approach that accomplishes information exchange among participants via tunable soft prompts.
arXiv Detail & Related papers (2023-11-12T11:01:10Z) - Unlocking the Potential of Prompt-Tuning in Bridging Generalized and
Personalized Federated Learning [49.72857433721424]
Vision Transformers (ViT) and Visual Prompt Tuning (VPT) achieve state-of-the-art performance with improved efficiency in various computer vision tasks.
We present a novel algorithm, SGPT, that integrates Generalized FL (GFL) and Personalized FL (PFL) approaches by employing a unique combination of both shared and group-specific prompts.
arXiv Detail & Related papers (2023-10-27T17:22:09Z) - Context-Aware Prompt Tuning for Vision-Language Model with
Dual-Alignment [15.180715595425864]
We introduce a novel method to improve the prompt learning of vision-language models by incorporating pre-trained large language models (LLMs)
With DuAl-PT, we propose to learn more context-aware prompts, benefiting from both explicit and implicit context modeling.
Empirically, DuAl-PT achieves superior performance on 11 downstream datasets on few-shot recognition and base-to-new generalization.
arXiv Detail & Related papers (2023-09-08T06:51:15Z) - FedJETs: Efficient Just-In-Time Personalization with Federated Mixture
of Experts [48.78037006856208]
FedJETs is a novel solution by using a Mixture-of-Experts (MoE) framework within a Federated Learning (FL) setup.
Our method leverages the diversity of the clients to train specialized experts on different subsets of classes, and a gating function to route the input to the most relevant expert(s)
Our approach can improve accuracy up to 18% in state of the art FL settings, while maintaining competitive zero-shot performance.
arXiv Detail & Related papers (2023-06-14T15:47:52Z) - Personalized Federated Learning with Local Attention [5.018560254008613]
Federated Learning (FL) aims to learn a single global model that enables the central server to help the model training in local clients without accessing their local data.
Key challenge of FL is the heterogeneous label distribution and feature shift, which could lead to significant performance degradation of the learned models.
We propose a simple yet effective algorithm, namely textbfpersonalized textbfFederated learning with textbfLocal textbfAttention (pFedLA)
Two modules are proposed in pFedLA, i.e., the personalized
arXiv Detail & Related papers (2023-04-02T20:10:32Z) - Visual Prompt Based Personalized Federated Learning [83.04104655903846]
We propose a novel PFL framework for image classification tasks, dubbed pFedPT, that leverages personalized visual prompts to implicitly represent local data distribution information of clients.
Experiments on the CIFAR10 and CIFAR100 datasets show that pFedPT outperforms several state-of-the-art (SOTA) PFL algorithms by a large margin in various settings.
arXiv Detail & Related papers (2023-03-15T15:02:15Z) - Bayesian Prompt Learning for Image-Language Model Generalization [64.50204877434878]
We use the regularization ability of Bayesian methods to frame prompt learning as a variational inference problem.
Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.
We demonstrate empirically on 15 benchmarks that Bayesian prompt learning provides an appropriate coverage of the prompt space.
arXiv Detail & Related papers (2022-10-05T17:05:56Z) - OpenPrompt: An Open-source Framework for Prompt-learning [59.17869696803559]
We present OpenPrompt, a unified easy-to-use toolkit to conduct prompt-learning over PLMs.
OpenPrompt is a research-friendly framework that is equipped with efficiency, modularity, and extendibility.
arXiv Detail & Related papers (2021-11-03T03:31:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.