Conquering the Communication Constraints to Enable Large Pre-Trained Models in Federated Learning
- URL: http://arxiv.org/abs/2210.01708v4
- Date: Wed, 23 Oct 2024 12:50:18 GMT
- Title: Conquering the Communication Constraints to Enable Large Pre-Trained Models in Federated Learning
- Authors: Guangyu Sun, Umar Khalid, Matias Mendieta, Taojiannan Yang, Pu Wang, Minwoo Lee, Chen Chen
- Abstract summary: Federated learning (FL) has emerged as a promising paradigm for enabling the collaborative training of models without centralized access to the raw data on local devices.
Recent state-of-the-art pre-trained models are getting more capable but also have more parameters.
Can we find a solution to enable those strong and readily available pre-trained models in FL to achieve excellent performance while simultaneously reducing the communication burden?
To this end, we introduce FedPEFT, a framework for parameter-efficient fine-tuning in FL, and systematically evaluate its performance across a variety of client stability, data distribution, and differential privacy settings.
- Score: 18.12162136918301
- License:
- Abstract: Federated learning (FL) has emerged as a promising paradigm for enabling the collaborative training of models without centralized access to the raw data on local devices. In the typical FL paradigm (e.g., FedAvg), model weights are sent to and from the server each round to participating clients. Recently, the use of small pre-trained models has been shown to be effective in federated learning optimization and improving convergence. However, recent state-of-the-art pre-trained models are getting more capable but also have more parameters. In conventional FL, sharing the enormous model weights can quickly put a massive communication burden on the system, especially if more capable models are employed. Can we find a solution to enable those strong and readily available pre-trained models in FL to achieve excellent performance while simultaneously reducing the communication burden? To this end, we investigate the use of parameter-efficient fine-tuning in federated learning and thus introduce a new framework: FedPEFT. Specifically, we systematically evaluate the performance of FedPEFT across a variety of client stability, data distribution, and differential privacy settings. By only locally tuning and globally sharing a small portion of the model weights, significant reductions in the total communication overhead can be achieved while maintaining competitive or even better performance in a wide range of federated learning scenarios, providing insight into a new paradigm for practical and effective federated systems.
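To make the communication argument concrete, below is a minimal, self-contained NumPy sketch of the FedPEFT idea. It is not the authors' implementation: the sizes, the toy quadratic objective, and all variable names are hypothetical; only the overall structure (a frozen pre-trained backbone plus FedAvg restricted to a small PEFT module) follows the abstract.

```python
# A minimal sketch of the FedPEFT idea (not the authors' released implementation):
# every client keeps the large pre-trained backbone frozen, fine-tunes only a small
# PEFT module (e.g., an adapter), and only those few weights travel each round.
# All names, sizes, and the toy objective below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

BACKBONE_SIZE = 1_000_000   # frozen pre-trained weights, never re-transmitted
PEFT_SIZE = 2_000           # small trainable module shared with the server
NUM_CLIENTS = 5
ROUNDS = 3

backbone = rng.normal(size=BACKBONE_SIZE)   # identical and frozen on every client
global_peft = np.zeros(PEFT_SIZE)           # the only weights exchanged in FL rounds

def local_update(peft, steps=10, lr=0.1):
    """Simulated local fine-tuning: only the PEFT weights are optimized."""
    peft = peft.copy()
    target = rng.normal(size=PEFT_SIZE)      # stand-in for the client's private data signal
    for _ in range(steps):
        grad = peft - target                 # gradient of a toy quadratic loss
        peft -= lr * grad
    return peft

for rnd in range(ROUNDS):
    client_peft = [local_update(global_peft) for _ in range(NUM_CLIENTS)]
    global_peft = np.mean(client_peft, axis=0)   # FedAvg, but only over the PEFT weights

    uploaded = NUM_CLIENTS * PEFT_SIZE
    full_exchange = NUM_CLIENTS * (BACKBONE_SIZE + PEFT_SIZE)
    print(f"round {rnd}: uploaded {uploaded} floats "
          f"({uploaded / full_exchange:.3%} of a full-model exchange)")
```

With PEFT_SIZE several orders of magnitude smaller than BACKBONE_SIZE, per-round traffic shrinks accordingly, which is the communication reduction the abstract refers to.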
Related papers
- Local Superior Soups: A Catalyst for Model Merging in Cross-Silo Federated Learning [33.88701368538447]
We propose an innovative model-based local training technique called "Local Superior Soups".
Our method enhances local training across different clients, encouraging the exploration of a connected low-loss basin.
We demonstrated its effectiveness and efficiency across diverse widely-used FL datasets.
arXiv Detail & Related papers (2024-10-31T06:20:17Z)
- A Survey on Efficient Federated Learning Methods for Foundation Model Training [62.473245910234304]
Federated Learning (FL) has become an established technique to facilitate privacy-preserving collaborative training across a multitude of clients.
In the wake of Foundation Models (FM), however, the reality is different for many deep learning applications.
We discuss the benefits and drawbacks of parameter-efficient fine-tuning (PEFT) for FL applications.
arXiv Detail & Related papers (2024-01-09T10:22:23Z)
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling this data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- Tunable Soft Prompts are Messengers in Federated Learning [55.924749085481544]
Federated learning (FL) enables multiple participants to collaboratively train machine learning models using decentralized data sources.
The lack of model privacy protection in FL has become a challenge that cannot be neglected.
We propose a novel FL training approach that accomplishes information exchange among participants via tunable soft prompts.
arXiv Detail & Related papers (2023-11-12T11:01:10Z)
- Towards More Suitable Personalization in Federated Learning via Decentralized Partial Model Training [67.67045085186797]
Almost all existing systems face large communication burdens and can be disrupted if the central FL server fails.
It personalizes the "right" components in the deep models by alternately updating the shared and personal parameters.
To further promote the aggregation of the shared parameters, we propose DFed, which integrates local Sharpness Minimization.
arXiv Detail & Related papers (2023-05-24T13:52:18Z)
- Federated Learning from Pre-Trained Models: A Contrastive Learning Approach [43.893267526525904]
Federated Learning (FL) is a machine learning paradigm that allows decentralized clients to learn collaboratively without sharing their private data.
Excessive computation and communication demands pose challenges to current FL frameworks.
We propose a lightweight framework where clients jointly learn to fuse the representations generated by multiple fixed pre-trained models.
arXiv Detail & Related papers (2022-09-21T03:16:57Z)
- FedOBD: Opportunistic Block Dropout for Efficiently Training Large-scale Neural Networks through Federated Learning [18.357577491590686]
We propose the Federated Opportunistic Block Dropout (FedOBD) approach to train large-scale neural networks.
FedOBD decomposes large-scale models into semantic blocks so that FL participants can opportunistically upload quantized blocks.
Experiments show that FedOBD reduces the overall communication overhead by more than 88% compared to the best-performing baseline approach.
arXiv Detail & Related papers (2022-08-10T06:36:49Z)
- FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning [87.08902493524556]
Federated learning (FL) has recently attracted increasing attention from academia and industry.
We propose FedDM to build the global training objective from multiple local surrogate functions.
In detail, we construct synthetic sets of data on each client to locally match the loss landscape of the original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)
- Heterogeneous Ensemble Knowledge Transfer for Training Large Models in Federated Learning [22.310090483499035]
Federated learning (FL) enables edge-devices to collaboratively learn a model without disclosing their private data to a central aggregating server.
Most existing FL algorithms require models of identical architecture to be deployed across the clients and server.
We propose a novel ensemble knowledge transfer method named Fed-ET in which small models are trained on clients and used to train a larger model at the server.
arXiv Detail & Related papers (2022-04-27T05:18:32Z)
- Federated Residual Learning [53.77128418049985]
We study a new form of federated learning where the clients train personalized local models and make predictions jointly with the server-side shared model.
Using this new federated learning framework, the complexity of the central shared model can be minimized while still gaining all the performance benefits that joint training provides (a rough sketch of this idea appears after this list).
arXiv Detail & Related papers (2020-03-28T19:55:24Z)
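As a rough illustration of the residual-learning idea in the last entry above: the additive combination of a server-side shared model with a small client-side correction is an assumption made for this sketch, and the variable names are hypothetical rather than taken from the paper.

```python
# Hedged sketch of federated residual learning: a shared global model gives a base
# prediction and each client adds a personalized correction trained on its own data.
# The additive combination rule is an assumption for illustration only.
import numpy as np

rng = np.random.default_rng(1)
d = 8

def shared_model(x, w_global):
    return x @ w_global                     # server-side shared (federated) model

def residual_model(x, w_local):
    return x @ w_local                      # client-side personalized correction

def predict(x, w_global, w_local):
    # Joint prediction: shared component plus the client's residual component.
    return shared_model(x, w_global) + residual_model(x, w_local)

w_global = rng.normal(size=d)               # learned jointly across clients
w_local = 0.1 * rng.normal(size=d)          # learned locally, never shared
x = rng.normal(size=(4, d))                 # a batch of one client's inputs
print(predict(x, w_global, w_local))
```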
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.