Federated Learning of Large Language Models with Parameter-Efficient
Prompt Tuning and Adaptive Optimization
- URL: http://arxiv.org/abs/2310.15080v3
- Date: Sun, 11 Feb 2024 11:59:52 GMT
- Title: Federated Learning of Large Language Models with Parameter-Efficient
Prompt Tuning and Adaptive Optimization
- Authors: Tianshi Che, Ji Liu, Yang Zhou, Jiaxiang Ren, Jiwen Zhou, Victor S.
Sheng, Huaiyu Dai, Dejing Dou
- Abstract summary: Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data.
Training Large Language Models (LLMs) generally requires updating a very large number of parameters.
This paper proposes an efficient partial prompt tuning approach to improve performance and efficiency simultaneously.
- Score: 71.87335804334616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is a promising paradigm to enable collaborative model
training with decentralized data. However, training Large Language Models
(LLMs) generally requires updating a very large number of parameters, which
limits the applicability of FL techniques to LLMs in real scenarios. Prompt
tuning can significantly reduce the number of parameters to update, but it
incurs either performance degradation or low training efficiency. The
straightforward use of prompt tuning in FL often raises non-trivial
communication costs and dramatically degrades performance. In addition, the
decentralized data is generally non-Independent and Identically Distributed
(non-IID), which causes client drift and thus poor performance. This paper
proposes a Parameter-efficient prompt Tuning
approach with Adaptive Optimization, i.e., FedPepTAO, to enable efficient and
effective FL of LLMs. First, an efficient partial prompt tuning approach is
proposed to improve performance and efficiency simultaneously. Second, a novel
adaptive optimization method is developed to address the client drift problems
on both the device and server sides to enhance performance further. Extensive
experiments based on 10 datasets demonstrate the superb performance (up to
60.8% in terms of accuracy) and efficiency (up to 97.59% in terms of training
time) of FedPepTAO compared with 9 baseline approaches. Our code is available
at https://github.com/llm-eff/FedPepTAO.
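The communication saving that motivates prompt tuning in FL can be illustrated with a minimal sketch. It assumes plain FedAvg-style weighted averaging over prompt vectors only, with the LLM backbone frozen and never transmitted. This is not the FedPepTAO algorithm: it omits the paper's partial prompt tuning and adaptive optimization, and the function names `local_update` and `aggregate_prompts` are hypothetical, chosen for illustration.

```python
from typing import List

Vector = List[float]


def local_update(prompt: Vector, gradient: Vector, lr: float) -> Vector:
    """One SGD step on the prompt parameters only (backbone assumed frozen)."""
    return [p - lr * g for p, g in zip(prompt, gradient)]


def aggregate_prompts(client_prompts: List[Vector], client_sizes: List[int]) -> Vector:
    """Server-side weighted average of client prompt vectors (plain FedAvg).

    Each client's weight is its share of the total number of local samples.
    """
    total = sum(client_sizes)
    averaged = [0.0] * len(client_prompts[0])
    for prompt, size in zip(client_prompts, client_sizes):
        weight = size / total
        for i, value in enumerate(prompt):
            averaged[i] += weight * value
    return averaged
```

In this sketch each round transmits only the prompt vector, so the per-round communication cost scales with the number of prompt parameters rather than with the size of the full model.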
Related papers
- SpaFL: Communication-Efficient Federated Learning with Sparse Models and Low computational Overhead [75.87007729801304]
SpaFL, a communication-efficient FL framework, is proposed to optimize sparse model structures with low computational overhead.
Experiments show that SpaFL improves accuracy while requiring much less communication and computing resources compared to sparse baselines.
(2024-06-01T13:10:35Z)
- Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes [53.4856038354195]
Pre-trained large language models (LLMs) need fine-tuning to improve their responsiveness to natural language instructions.
FedKSeed employs zeroth-order optimization with a finite set of random seeds.
It significantly reduces transmission requirements between the server and clients to just a few random seeds.
(2023-12-11T13:03:21Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
(2023-09-18T12:35:05Z)
- SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models [28.764782216513037]
Federated Learning (FL) can benefit from distributed and private data of the FL edge clients for fine-tuning.
We propose a method called SLoRA, which overcomes the key limitations of LoRA in highly heterogeneous data scenarios.
Our experimental results demonstrate that SLoRA achieves performance comparable to full fine-tuning.
(2023-08-12T10:33:57Z)
- E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning [55.50908600818483]
Fine-tuning large-scale pretrained vision models for new tasks has become increasingly parameter-intensive.
We propose an Effective and Efficient Visual Prompt Tuning (E2VPT) approach for large-scale transformer-based model adaptation.
Our approach outperforms several state-of-the-art baselines on two benchmarks.
(2023-07-25T19:03:21Z)
- FedAVO: Improving Communication Efficiency in Federated Learning with African Vultures Optimizer [0.0]
Federated Learning (FL) is a distributed machine learning technique.
In this paper, we introduce FedAVO, a novel FL algorithm that enhances communication effectiveness.
We show that FedAVO achieves significant improvement in terms of model accuracy and communication rounds.
(2023-05-02T02:04:19Z)
- Federated Hypergradient Descent [0.0]
We apply a principled approach to adaptively setting the client learning rate, number of local steps, and batch size.
In our federated learning applications, our primary motivations are minimizing communication budget as well as local computational resources in the training pipeline.
We show our numerical results through extensive empirical experiments with the Federated EMNIST-62 (FEMNIST) and Federated Stack Overflow (FSO) datasets.
(2022-11-03T19:22:00Z)
- FedDUAP: Federated Learning with Dynamic Update and Adaptive Pruning Using Shared Data on the Server [64.94942635929284]
Federated Learning (FL) suffers from two critical challenges, i.e., limited computational resources and low training efficiency.
We propose a novel FL framework, FedDUAP, to exploit the insensitive data on the server and the decentralized data in edge devices.
By integrating the two techniques, our proposed FL model, FedDUAP, significantly outperforms baseline approaches in terms of accuracy (up to 4.8% higher), efficiency (up to 2.8 times faster), and computational cost (up to 61.9% smaller).
(2022-04-25T10:00:00Z)
- Accelerating Federated Learning with a Global Biased Optimiser [16.69005478209394]
Federated Learning (FL) is a recent development in the field of machine learning that collaboratively trains models without the training data leaving client devices.
We propose a novel, generalised approach for applying adaptive optimisation techniques to FL with the Federated Global Biased Optimiser (FedGBO) algorithm.
FedGBO accelerates FL by applying a set of global biased optimiser values during the local training phase of FL, which helps to reduce 'client-drift' from non-IID data.
(2021-08-20T12:08:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.