Profit: Benchmarking Personalization and Robustness Trade-off in
Federated Prompt Tuning
- URL: http://arxiv.org/abs/2310.04627v1
- Date: Fri, 6 Oct 2023 23:46:33 GMT
- Title: Profit: Benchmarking Personalization and Robustness Trade-off in
Federated Prompt Tuning
- Authors: Liam Collins, Shanshan Wu, Sewoong Oh, Khe Chai Sim
- Abstract summary: In many applications of federated learning (FL), clients desire models that are personalized using their local data, yet are also robust in the sense that they retain general global knowledge.
It is critical to understand how to navigate this personalization vs robustness trade-off when designing federated systems.
- Score: 40.16581292336117
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many applications of federated learning (FL), clients desire models that
are personalized using their local data, yet are also robust in the sense that
they retain general global knowledge. However, the presence of data
heterogeneity across clients induces a fundamental trade-off between
personalization (i.e., adaptation to a local distribution) and robustness
(i.e., not forgetting previously learned general knowledge). It is critical to
understand how to navigate this personalization vs robustness trade-off when
designing federated systems, which are increasingly moving towards a paradigm
of fine-tuning large foundation models. Due to limited computational and
communication capabilities in most federated settings, this foundation model
fine-tuning must be done using parameter-efficient fine-tuning (PEFT)
approaches. While some recent work has studied federated approaches to PEFT,
the personalization vs robustness trade-off of federated PEFT has been largely
unexplored. In this work, we take a step towards bridging this gap by
benchmarking fundamental FL algorithms -- FedAvg and FedSGD plus
personalization (via client local fine-tuning) -- applied to one of the most
ubiquitous PEFT approaches to large language models (LLMs) -- prompt tuning --
in a multitude of hyperparameter settings under varying levels of data
heterogeneity. Our results show that federated-trained prompts can be
surprisingly robust when using a small learning rate with many local epochs for
personalization, especially when using an adaptive optimizer as the client
optimizer during federated training. We also demonstrate that simple approaches
such as adding regularization and interpolating two prompts are effective in
improving the personalization vs robustness trade-off in computation-limited
settings with few local updates allowed for personalization.
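The recipe the abstract benchmarks has three moving parts: FedAvg over soft prompts, client-side personalization by local fine-tuning, and simple remedies such as a regularizer toward the federated prompt or interpolating the federated and personalized prompts. Below is a minimal sketch of those parts using a toy per-client quadratic loss in place of the LLM loss; the dimensions, learning rates, step counts, and the `local_loss_grad` helper are illustrative assumptions rather than settings from the paper.

```python
# Minimal sketch: FedAvg over a soft prompt, local personalization, a proximal
# regularizer, and prompt interpolation. Toy quadratic loss; all hyperparameters
# are illustrative assumptions, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)
PROMPT_DIM = 8                      # stand-in for num_prompt_tokens * embedding_dim
NUM_CLIENTS = 5

# Heterogeneous clients: each client's data prefers a different prompt.
client_targets = [rng.normal(size=PROMPT_DIM) for _ in range(NUM_CLIENTS)]

def local_loss_grad(prompt, target):
    """Gradient of a toy quadratic loss 0.5 * ||prompt - target||^2."""
    return prompt - target

def local_steps(prompt, target, lr, steps):
    """Client-side fine-tuning of the soft prompt only (the backbone stays frozen)."""
    p = prompt.copy()
    for _ in range(steps):
        p -= lr * local_loss_grad(p, target)
    return p

def local_steps_prox(prompt, target, anchor, lam, lr, steps):
    """Personalization with an added regularizer pulling the prompt toward `anchor`."""
    p = prompt.copy()
    for _ in range(steps):
        p -= lr * (local_loss_grad(p, target) + lam * (p - anchor))
    return p

# --- Federated training of the prompt (FedAvg) ---
global_prompt = np.zeros(PROMPT_DIM)
for _ in range(50):
    client_prompts = [local_steps(global_prompt, t, lr=0.1, steps=5)
                      for t in client_targets]
    global_prompt = np.mean(client_prompts, axis=0)   # the server averages prompts

# --- Personalization: small learning rate, many local epochs ---
personalized = [local_steps(global_prompt, t, lr=0.01, steps=200)
                for t in client_targets]

# --- Regularized personalization (few local updates allowed) ---
regularized = [local_steps_prox(global_prompt, t, global_prompt, lam=1.0, lr=0.1, steps=10)
               for t in client_targets]

# --- Prompt interpolation between the federated and personalized prompts ---
alpha = 0.5   # 0 = fully global (robust), 1 = fully personalized
interpolated = [alpha * p + (1 - alpha) * global_prompt for p in personalized]
```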
Related papers
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup
for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specific, automatically tuned learning-rate schedule can converge and achieve linear speedup with respect to the number of clients.
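As a rough illustration of the per-client adaptive-optimizer idea (not FedLALR's exact scheduling rule), the sketch below keeps separate AMSGrad moment estimates on each client, so the effective per-coordinate step size adapts to that client's own gradient statistics; the `ClientAMSGrad` class and its defaults are assumptions for illustration.

```python
# Sketch of client-side AMSGrad state: each client keeps its own moment
# estimates, so its effective step size adapts to its own gradients.
# This is only the standard AMSGrad update kept per client, not FedLALR itself.
import numpy as np

class ClientAMSGrad:
    def __init__(self, dim, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = np.zeros(dim)       # first moment (momentum)
        self.v = np.zeros(dim)       # second moment
        self.v_hat = np.zeros(dim)   # running max of v (the AMSGrad correction)

    def step(self, params, grad):
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad ** 2
        self.v_hat = np.maximum(self.v_hat, self.v)
        # Per-coordinate effective learning rate: lr / (sqrt(v_hat) + eps).
        return params - self.lr * self.m / (np.sqrt(self.v_hat) + self.eps)
```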
arXiv Detail & Related papers (2023-09-18T12:35:05Z) - FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal
Heterogeneous Federated Learning [37.96957782129352]
We propose a finetuning framework tailored to heterogeneous multi-modal foundation models, called Federated Dual-Adapter Teacher (FedDAT).
FedDAT addresses data heterogeneity by regularizing the client local updates and applying Mutual Knowledge Distillation (MKD) for efficient knowledge transfer.
To demonstrate its effectiveness, we conduct extensive experiments on four multi-modality FL benchmarks with different types of data heterogeneity.
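As a hedged sketch of what a mutual knowledge distillation (MKD) term can look like, the snippet below computes a symmetric KL divergence between two models' softened predictions; the function names, the temperature, and how such a term would be weighted inside FedDAT's dual-adapter setup are assumptions, not details from the paper.

```python
# Mutual distillation sketch: each model is distilled toward the other via a
# symmetric KL between softened softmax outputs. Temperature and weighting are
# illustrative assumptions.
import numpy as np

def softmax(z, tau=1.0):
    z = z / tau
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q, eps=1e-12):
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def mutual_distillation_loss(logits_a, logits_b, tau=2.0):
    """Symmetric KL between the two models' softened predictions."""
    pa, pb = softmax(logits_a, tau), softmax(logits_b, tau)
    return np.mean(kl(pa, pb) + kl(pb, pa))

# Toy usage with one example and three classes.
logits_local = np.array([[2.0, 0.5, -1.0]])
logits_teacher = np.array([[1.5, 0.7, -0.5]])
print(mutual_distillation_loss(logits_local, logits_teacher))
```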
arXiv Detail & Related papers (2023-08-21T21:57:01Z) - FedJETs: Efficient Just-In-Time Personalization with Federated Mixture
of Experts [48.78037006856208]
FedJETs is a novel solution by using a Mixture-of-Experts (MoE) framework within a Federated Learning (FL) setup.
Our method leverages the diversity of the clients to train specialized experts on different subsets of classes, and a gating function to route each input to the most relevant expert(s).
Our approach can improve accuracy by up to 18% in state-of-the-art FL settings, while maintaining competitive zero-shot performance.
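The routing idea can be illustrated with a small top-k gating sketch: a gating function scores the experts for an input and the most relevant experts' outputs are combined with renormalized gate weights. The shapes, the linear experts, and the `route` helper below are toy assumptions; how FedJETs trains the gate and experts inside the FL loop is not reproduced.

```python
# Top-k mixture-of-experts routing sketch. All shapes and the linear experts
# are toy assumptions for illustration.
import numpy as np

def route(x, gate_w, experts, top_k=2):
    scores = np.exp(x @ gate_w)                 # unnormalized gate scores
    scores = scores / scores.sum()
    top = np.argsort(scores)[-top_k:]           # indices of the most relevant experts
    weights = scores[top] / scores[top].sum()   # renormalize over the chosen experts
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy usage: three linear "experts" over a 4-dimensional input.
rng = np.random.default_rng(0)
experts = [lambda x, W=rng.normal(size=(4, 2)): x @ W for _ in range(3)]
gate_w = rng.normal(size=(4, 3))
print(route(rng.normal(size=4), gate_w, experts))
```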
arXiv Detail & Related papers (2023-06-14T15:47:52Z) - Privacy Preserving Bayesian Federated Learning in Heterogeneous Settings [20.33482170846688]
This paper presents a unified federated learning framework based on customized local Bayesian models that learn well even in the absence of large local datasets.
We use priors in the functional (output) space of the networks to facilitate collaboration across heterogeneous clients.
Experiments on standard FL datasets demonstrate that our approach outperforms strong baselines in both homogeneous and heterogeneous settings.
arXiv Detail & Related papers (2023-06-13T17:55:30Z) - Efficient Personalized Federated Learning via Sparse Model-Adaptation [47.088124462925684]
Federated Learning (FL) aims to train machine learning models for multiple clients without sharing their own private data.
We propose pFedGate for efficient personalized FL by adaptively and efficiently learning sparse local models.
We show that pFedGate achieves superior global accuracy, individual accuracy and efficiency simultaneously over state-of-the-art methods.
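As a loose illustration of what a sparse local model means mechanically (not pFedGate's actual gating mechanism), the sketch below keeps only the top-scoring fraction of the global parameters for a given client; the `sparse_local_model` helper, its scoring input, and the keep ratio are assumptions.

```python
# Sparse local model sketch: mask the global parameters down to the fraction a
# client scores highest. How the scores are learned and the sparsity budget is
# enforced in pFedGate is not reproduced here.
import numpy as np

def sparse_local_model(global_params, client_scores, keep_ratio=0.3):
    """Keep the top `keep_ratio` fraction of parameters as scored by the client."""
    k = max(1, int(keep_ratio * global_params.size))
    threshold = np.sort(client_scores.ravel())[-k]
    mask = (client_scores >= threshold).astype(global_params.dtype)
    return global_params * mask, mask

# Toy usage with random parameters and client-specific importance scores.
rng = np.random.default_rng(0)
params = rng.normal(size=10)
scores = rng.random(10)
local_params, mask = sparse_local_model(params, scores, keep_ratio=0.3)
```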
arXiv Detail & Related papers (2023-05-04T12:21:34Z) - FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling
and Correction [48.85303253333453]
Federated learning (FL) allows multiple clients to collectively train a high-performance global model without sharing their private data.
We propose a novel federated learning algorithm with local drift decoupling and correction (FedDC).
FedDC only introduces lightweight modifications in the local training phase, in which each client utilizes an auxiliary local drift variable to track the gap between its local model parameters and the global model parameters.
Experimental results and analysis demonstrate that FedDC accelerates convergence and achieves better performance on various image classification tasks.
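A hedged sketch of the drift-tracking idea: each client accumulates a drift variable equal to the gap between its locally trained parameters and the global parameters, and uploads drift-corrected parameters for aggregation. The `client_round` and `server_aggregate` helpers and the toy `local_train` step are assumptions; FedDC's exact local objective and penalty terms are not reproduced.

```python
# Drift-tracking sketch: the client accumulates the local-vs-global gap in h_i
# and uploads drift-corrected parameters; the server averages them.
import numpy as np

def client_round(global_params, h_i, local_train):
    """One simplified client round with a drift variable h_i."""
    local_params = local_train(global_params)       # ordinary local training
    h_i = h_i + (local_params - global_params)      # accumulate local drift
    corrected = local_params + h_i                  # drift-corrected upload
    return corrected, h_i

def server_aggregate(corrected_list):
    """Simplified server step: average the drift-corrected client parameters."""
    return np.mean(corrected_list, axis=0)

# Toy usage: "local training" just moves toward a client-specific optimum.
target = np.ones(4)
local_train = lambda w: w - 0.5 * (w - target)
corrected, h = client_round(np.zeros(4), np.zeros(4), local_train)
print(server_aggregate([corrected]))
```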
arXiv Detail & Related papers (2022-03-22T14:06:26Z) - Acceleration of Federated Learning with Alleviated Forgetting in Local
Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z) - PFA: Privacy-preserving Federated Adaptation for Effective Model
Personalization [6.66389628571674]
Federated learning (FL) has become a prevalent distributed machine learning paradigm with improved privacy.
This paper introduces a new concept called federated adaptation, which aims to adapt the trained model in a federated manner to achieve better personalization results.
We propose PFA, a framework to accomplish Privacy-preserving Federated Adaptation.
arXiv Detail & Related papers (2021-03-02T08:07:34Z) - Toward Understanding the Influence of Individual Clients in Federated
Learning [52.07734799278535]
Federated learning allows clients to jointly train a global model without sending their private data to a central server.
We define a new notion called Influence, quantify this influence over the model parameters, and propose an effective and efficient method to estimate this metric.
arXiv Detail & Related papers (2020-12-20T14:34:36Z)