Federated Learning of Large Language Models with Parameter-Efficient
Prompt Tuning and Adaptive Optimization
- URL: http://arxiv.org/abs/2310.15080v3
- Date: Sun, 11 Feb 2024 11:59:52 GMT
- Title: Federated Learning of Large Language Models with Parameter-Efficient
Prompt Tuning and Adaptive Optimization
- Authors: Tianshi Che, Ji Liu, Yang Zhou, Jiaxiang Ren, Jiwen Zhou, Victor S.
Sheng, Huaiyu Dai, Dejing Dou
- Abstract summary: Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data.
The training process of Large Language Models (LLMs) generally requires updating a significant number of parameters.
This paper proposes an efficient partial prompt tuning approach to improve performance and efficiency simultaneously.
- Score: 71.87335804334616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is a promising paradigm to enable collaborative model
training with decentralized data. However, the training process of Large
Language Models (LLMs) generally requires updating a significant number of
parameters, which limits the applicability of FL techniques to LLMs in
real-world scenarios. Prompt tuning can significantly reduce the number of parameters to
update, but it either incurs performance degradation or low training
efficiency. Straightforward use of prompt tuning in FL often
raises non-trivial communication costs and dramatically degrades performance.
In addition, the decentralized data is generally non-Independent and
Identically Distributed (non-IID), which causes client drift and thus
poor performance. This paper proposes a Parameter-efficient prompt Tuning
approach with Adaptive Optimization, i.e., FedPepTAO, to enable efficient and
effective FL of LLMs. First, an efficient partial prompt tuning approach is
proposed to improve performance and efficiency simultaneously. Second, a novel
adaptive optimization method is developed to address the client drift problems
on both the device and server sides to enhance performance further. Extensive
experiments based on 10 datasets demonstrate the superb performance (up to
60.8% in terms of accuracy) and efficiency (up to 97.59% in terms of training
time) of FedPepTAO compared with 9 baseline approaches. Our code is available
at https://github.com/llm-eff/FedPepTAO.
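To make the setting concrete, here is a minimal sketch of the general pattern the abstract describes: clients tune only a small soft-prompt tensor against a frozen backbone with an adaptive optimizer, and the server aggregates nothing but the prompt parameters, so communication stays small. This is a hypothetical toy illustration (the tiny model, class, and function names are assumptions), not the FedPepTAO implementation; the authors' partial prompt selection and adaptive optimization rules are in the linked repository.

```python
# Minimal sketch of federated prompt tuning (not the authors' implementation).
# Assumptions: a frozen backbone, trainable soft-prompt embeddings, and a server
# that aggregates only the prompt tensors, so communication stays small.
import copy
import torch
import torch.nn as nn

class PromptedClassifier(nn.Module):
    def __init__(self, vocab_size=1000, hidden=128, prompt_len=8, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)       # stands in for a frozen LLM
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)  # trainable soft prompt
        for p in list(self.embed.parameters()) + list(self.encoder.parameters()):
            p.requires_grad = False                          # backbone stays frozen

    def forward(self, input_ids):
        tok = self.embed(input_ids)
        prompt = self.prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
        _, h = self.encoder(torch.cat([prompt, tok], dim=1))
        return self.head(h[-1])

def local_prompt_update(model, batches, lr=1e-2, epochs=1):
    """One client round: tune only the prompt, return it for aggregation."""
    opt = torch.optim.AdamW([model.prompt], lr=lr)           # adaptive optimizer on the prompt only
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in batches:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.prompt.detach().clone()

def server_round(global_model, client_batches):
    """Broadcast the prompt, collect client prompts, average them."""
    prompts = []
    for batches in client_batches:
        local = copy.deepcopy(global_model)
        prompts.append(local_prompt_update(local, batches))
    with torch.no_grad():
        global_model.prompt.copy_(torch.stack(prompts).mean(dim=0))

if __name__ == "__main__":
    torch.manual_seed(0)
    model = PromptedClassifier()

    def fake_batches():
        # stand-in for a client's private data: (input_ids, labels)
        return [(torch.randint(0, 1000, (4, 16)), torch.randint(0, 2, (4,))) for _ in range(3)]

    for rnd in range(2):
        server_round(model, [fake_batches() for _ in range(3)])
        print(f"round {rnd}: prompt norm = {model.prompt.norm():.4f}")
```

The FedAvg-style averaging of the prompt here is only a placeholder for the paper's adaptive server-side optimization.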
Related papers
- Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System [75.25394449773052]
Large Language Model (LLM) based multi-agent systems (MAS) show remarkable potential in collaborative problem-solving.
Yet they still face critical challenges: low communication efficiency, poor scalability, and a lack of effective parameter-updating optimization methods.
We present Optima, a novel framework that addresses these issues by significantly enhancing both communication efficiency and task effectiveness.
arXiv Detail & Related papers (2024-10-10T17:00:06Z)
- Fisher Information-based Efficient Curriculum Federated Learning with Large Language Models [43.26028399395612]
We propose a Fisher Information-based Efficient Curriculum Federated Learning framework (FibecFed) with two novel methods.
First, we propose a Fisher information-based method to adaptively sample data within each device to improve the effectiveness of the FL fine-tuning process.
Second, we dynamically select the proper layers for global aggregation and sparse parameters for local update with LoRA.
arXiv Detail & Related papers (2024-09-30T18:12:18Z)
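The Fisher-information-based sampling mentioned in the FibecFed summary above can be pictured with a small sketch: score each local example by an empirical Fisher proxy (the squared gradient norm of its loss) and sample higher-information examples more often. This is an assumed simplification for illustration only, not FibecFed's actual curriculum, sampling schedule, or layer-selection method.

```python
# Rough illustration (not FibecFed itself): score each local example with an
# empirical Fisher proxy -- the squared gradient norm of its loss w.r.t. the
# trainable parameters -- then sample higher-information examples more often.
import torch
import torch.nn as nn

def fisher_scores(model, examples, loss_fn):
    scores = []
    params = [p for p in model.parameters() if p.requires_grad]
    for x, y in examples:
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        scores.append(sum((p.grad ** 2).sum() for p in params).item())
    return torch.tensor(scores)

def sample_curriculum(scores, k):
    """Pick k examples with probability proportional to their Fisher score."""
    probs = scores / scores.sum()
    return torch.multinomial(probs, k, replacement=False)

if __name__ == "__main__":
    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    data = [(torch.randn(10), torch.randint(0, 2, ())) for _ in range(20)]
    s = fisher_scores(model, data, nn.CrossEntropyLoss())
    print("selected indices:", sample_curriculum(s, 8).tolist())
```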
- Efficient Federated Learning Using Dynamic Update and Adaptive Pruning with Momentum on Shared Server Data [59.6985168241067]
Federated Learning (FL) encounters two important problems, i.e., low training efficiency and limited computational resources.
We propose a new FL framework, FedDUMAP, to leverage the shared insensitive data on the server and the distributed data in edge devices.
Our proposed FL model, FedDUMAP, combines three original techniques and achieves significantly better performance than baseline approaches.
arXiv Detail & Related papers (2024-08-11T02:59:11Z)
- SpaFL: Communication-Efficient Federated Learning with Sparse Models and Low Computational Overhead [75.87007729801304]
SpaFL, a communication-efficient FL framework, is proposed to optimize sparse model structures with low computational overhead.
Experiments show that SpaFL improves accuracy while requiring much less communication and computing resources compared to sparse baselines.
arXiv Detail & Related papers (2024-06-01T13:10:35Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
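The FedLALR entry above centers on clients that run a local AMSGrad variant with their own learning rates. The sketch below shows only that general pattern: each client performs local AMSGrad steps (via PyTorch's Adam with amsgrad=True) under a client-specific, hypothetical learning-rate schedule, and the server averages the resulting weights. FedLALR's actual auto-tuned scheduling rule is in the paper and is not reproduced here.

```python
# Illustrative pattern only (not FedLALR's exact rule): each client runs local
# AMSGrad steps with its own, client-specific learning-rate schedule; the server
# just averages the resulting model weights (FedAvg-style).
import torch
import torch.nn as nn

def make_model():
    return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

class Client:
    def __init__(self, data, base_lr=1e-2):
        self.data = data
        self.base_lr = base_lr
        self.steps = 0                      # drives the client-specific schedule

    def local_lr(self):
        # Hypothetical auto-tuned schedule: decay with the client's own step count.
        return self.base_lr / (1.0 + 0.1 * self.steps)

    def train(self, global_state, epochs=1):
        model = make_model()
        model.load_state_dict(global_state)
        opt = torch.optim.Adam(model.parameters(), lr=self.local_lr(), amsgrad=True)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            for x, y in self.data:
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()
                self.steps += 1
        return model.state_dict()

def average(states):
    return {k: torch.stack([s[k] for s in states]).mean(dim=0) for k in states[0]}

if __name__ == "__main__":
    torch.manual_seed(0)
    global_model = make_model()
    clients = [Client([(torch.randn(8, 10), torch.randint(0, 2, (8,))) for _ in range(4)],
                      base_lr=1e-2 * (i + 1)) for i in range(3)]   # heterogeneous base rates
    for rnd in range(3):
        states = [c.train(global_model.state_dict()) for c in clients]
        global_model.load_state_dict(average(states))
        print(f"round {rnd}: client lrs =", [round(c.local_lr(), 4) for c in clients])
```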
- SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models [28.764782216513037]
Federated Learning (FL) can benefit from distributed and private data of the FL edge clients for fine-tuning.
We propose a method called SLoRA, which overcomes the key limitations of LoRA in highly heterogeneous data scenarios.
Our experimental results demonstrate that SLoRA achieves performance comparable to full fine-tuning.
arXiv Detail & Related papers (2023-08-12T10:33:57Z)
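Since SLoRA builds on LoRA in a federated setting, a generic federated-LoRA skeleton may help fix ideas: only the low-rank adapter matrices A and B are trained and exchanged, while the base weights stay frozen. The sketch below is such a generic skeleton with assumed names and hyperparameters; it does not include SLoRA's specific strategy for handling heterogeneous data.

```python
# Generic federated-LoRA skeleton (not SLoRA's specific method): only the low-rank
# adapter matrices A and B are trained and exchanged; the base weights are frozen.
import copy
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_f, out_f, r=4, alpha=8.0):
        super().__init__()
        self.base = nn.Linear(in_f, out_f)
        self.base.weight.requires_grad_(False)
        self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, r))     # zero init: adapter contributes nothing at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

def client_update(model, batches, lr=1e-2):
    adapters = [p for n, p in model.named_parameters() if n.endswith(("A", "B"))]
    opt = torch.optim.AdamW(adapters, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for x, y in batches:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return {n: p.detach().clone() for n, p in model.named_parameters() if n.endswith(("A", "B"))}

if __name__ == "__main__":
    torch.manual_seed(0)
    global_model = nn.Sequential(LoRALinear(10, 32), nn.ReLU(), LoRALinear(32, 2))
    client_data = [[(torch.randn(8, 10), torch.randint(0, 2, (8,))) for _ in range(4)]
                   for _ in range(3)]
    updates = [client_update(copy.deepcopy(global_model), b) for b in client_data]
    with torch.no_grad():                                # server averages only A and B
        for n, p in global_model.named_parameters():
            if n.endswith(("A", "B")):
                p.copy_(torch.stack([u[n] for u in updates]).mean(dim=0))
    print("aggregated adapter norms:",
          {n: round(p.norm().item(), 4)
           for n, p in global_model.named_parameters() if n.endswith(("A", "B"))})
```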
- FedAVO: Improving Communication Efficiency in Federated Learning with African Vultures Optimizer [0.0]
Federated Learning (FL) is a distributed machine learning technique.
In this paper, we introduce FedAVO, a novel FL algorithm that enhances communication effectiveness.
We show that FedAVO achieves significant improvements in model accuracy and communication rounds.
arXiv Detail & Related papers (2023-05-02T02:04:19Z)
- FedDUAP: Federated Learning with Dynamic Update and Adaptive Pruning Using Shared Data on the Server [64.94942635929284]
Federated Learning (FL) suffers from two critical challenges, i.e., limited computational resources and low training efficiency.
We propose a novel FL framework, FedDUAP, to exploit the insensitive data on the server and the decentralized data in edge devices.
By integrating the two original techniques, our proposed FL model, FedDUAP, significantly outperforms baseline approaches in terms of accuracy (up to 4.8% higher), efficiency (up to 2.8 times faster), and computational cost (up to 61.9% smaller).
arXiv Detail & Related papers (2022-04-25T10:00:00Z)
- Accelerating Federated Learning with a Global Biased Optimiser [16.69005478209394]
Federated Learning (FL) is a recent development in the field of machine learning that collaboratively trains models without the training data leaving client devices.
We propose a novel, generalised approach for applying adaptive optimisation techniques to FL with the Federated Global Biased Optimiser (FedGBO) algorithm.
FedGBO accelerates FL by applying a set of global biased optimiser values during the local training phase, which helps to reduce 'client drift' arising from non-IID data.
arXiv Detail & Related papers (2021-08-20T12:08:44Z)
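The FedGBO entry above describes applying fixed global optimiser values during local training to reduce client drift. The sketch below illustrates that idea in its simplest form with SGD momentum: the server keeps one global momentum buffer, clients treat it as a fixed bias during their local steps, and the server refreshes it from the averaged round update. FedGBO's precise update rules differ; this is an assumed simplification.

```python
# Sketch of the general idea behind a global biased optimiser (not FedGBO's exact
# rules): the server keeps one set of momentum statistics, clients reuse those
# fixed global values as a bias during their local SGD steps, and the server
# refreshes the statistics from the averaged round update (pseudo-gradient).
import copy
import torch
import torch.nn as nn

def make_model():
    return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

def local_train(state, momentum, data, lr=0.05, beta=0.9):
    model = make_model()
    model.load_state_dict(state)
    loss_fn = nn.CrossEntropyLoss()
    for x, y in data:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        with torch.no_grad():
            for name, p in model.named_parameters():
                # the fixed global momentum biases every local step (drift reduction)
                p -= lr * (p.grad + beta * momentum[name])
    return model.state_dict()

if __name__ == "__main__":
    torch.manual_seed(0)
    lr = 0.05
    server = make_model()
    momentum = {n: torch.zeros_like(p) for n, p in server.named_parameters()}
    clients = [[(torch.randn(8, 10), torch.randint(0, 2, (8,))) for _ in range(4)]
               for _ in range(3)]
    for rnd in range(3):
        old = copy.deepcopy(server.state_dict())
        states = [local_train(copy.deepcopy(old), momentum, d, lr=lr) for d in clients]
        avg = {k: torch.stack([s[k] for s in states]).mean(dim=0) for k in old}
        server.load_state_dict(avg)
        # refresh global momentum from the round's average update direction
        momentum = {n: (old[n] - avg[n]) / lr for n in momentum}
        print(f"round {rnd}: update norm =",
              round(sum((old[k] - avg[k]).norm().item() for k in old), 4))
```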
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.