FwdLLM: Efficient FedLLM using Forward Gradient
- URL: http://arxiv.org/abs/2308.13894v2
- Date: Sat, 20 Jan 2024 09:24:33 GMT
- Title: FwdLLM: Efficient FedLLM using Forward Gradient
- Authors: Mengwei Xu, Dongqi Cai, Yaozong Wu, Xiang Li, Shangguang Wang
- Abstract summary: This work introduces FwdLLM, an innovative FL protocol designed to enhance FedLLM efficiency.
FwdLLM employs backpropagation (BP)-free training methods, requiring devices only to execute ``perturbed inferences''.
- Score: 8.520892692833293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) are transforming the landscape of mobile
intelligence. Federated Learning (FL), a method to preserve user data privacy,
is often employed in fine-tuning LLMs to downstream mobile tasks, an approach
known as FedLLM. Though recent efforts have addressed the network issue induced
by the vast model size, they have not practically mitigated vital challenges
concerning integration with mobile devices, such as significant memory
consumption and sluggish model convergence.
In response to these challenges, this work introduces FwdLLM, an innovative
FL protocol designed to enhance FedLLM efficiency. The key idea of FwdLLM is
to employ backpropagation (BP)-free training methods, requiring devices only to
execute ``perturbed inferences''. Consequently, FwdLLM delivers significantly better
memory efficiency and time efficiency (expedited by mobile NPUs and an expanded
array of participant devices). FwdLLM centers around three key designs: (1) it
combines BP-free training with parameter-efficient training methods, an
essential way to scale the approach to the LLM era; (2) it systematically and
adaptively allocates computational loads across devices, striking a careful
balance between convergence speed and accuracy; (3) it discriminatively samples
perturbed predictions that are more valuable to model convergence.
Comprehensive experiments with five LLMs and three NLP tasks illustrate
FwdLLM's significant advantages over conventional methods, including up to
three orders of magnitude faster convergence and a 14.6x reduction in memory
footprint. Uniquely, FwdLLM paves the way for federated learning of
billion-parameter LLMs such as LLaMA on COTS mobile devices -- a feat
previously unattained.
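As a rough illustration of the BP-free idea, the sketch below estimates gradients for LoRA-style low-rank adapter parameters using only forward passes: each update perturbs the adapters along random directions, measures the resulting loss change, and scatters that directional derivative back along the perturbation directions. This is a minimal finite-difference sketch against a toy linear layer and squared-error loss; all names, shapes, and hyperparameters are illustrative assumptions, not FwdLLM's actual implementation, which further adds adaptive load allocation and discriminative perturbation sampling.

```python
import numpy as np

# Minimal sketch: update a low-rank adapter using only forward passes
# ("perturbed inferences"), in the spirit of BP-free training.
# All names, shapes, and the toy loss are illustrative assumptions,
# not the paper's actual implementation.

rng = np.random.default_rng(0)

d_in, d_out, rank = 64, 32, 4                # frozen layer and adapter sizes
W = rng.normal(size=(d_out, d_in))           # frozen pretrained weight (never updated)
A = rng.normal(size=(rank, d_in)) * 0.01     # trainable low-rank adapter A
B = np.zeros((d_out, rank))                  # trainable low-rank adapter B

x = rng.normal(size=(d_in,))                 # one toy input
y = rng.normal(size=(d_out,))                # one toy target


def loss(A, B):
    """Forward pass only: frozen weight plus low-rank update, squared error."""
    pred = (W + B @ A) @ x
    return float(np.sum((pred - y) ** 2))


def forward_gradient_step(A, B, eps=1e-4, lr=1e-3, num_perturbations=8):
    """One BP-free update: average several perturbed-inference estimates.

    For each random direction v, the directional derivative is approximated
    by a finite difference of two forward passes, then scattered back along v.
    """
    gA, gB = np.zeros_like(A), np.zeros_like(B)
    base = loss(A, B)
    for _ in range(num_perturbations):
        vA = rng.normal(size=A.shape)
        vB = rng.normal(size=B.shape)
        perturbed = loss(A + eps * vA, B + eps * vB)
        coeff = (perturbed - base) / eps      # scalar directional derivative
        gA += coeff * vA
        gB += coeff * vB
    gA /= num_perturbations
    gB /= num_perturbations
    return A - lr * gA, B - lr * gB


for step in range(100):
    A, B = forward_gradient_step(A, B)
print("loss after 100 BP-free steps:", loss(A, B))
```

Averaging several perturbations per step trades extra forward passes for lower gradient variance; this is where the paper's discriminative sampling of more valuable perturbations and its adaptive allocation of perturbation workloads across participant devices come in.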
Related papers
- Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design [59.00758127310582]
We propose a novel framework Read-ME that transforms pre-trained dense LLMs into smaller MoE models.
Our approach employs activation sparsity to extract experts.
Read-ME outperforms other popular open-source dense models of similar scales.
arXiv Detail & Related papers (2024-10-24T19:48:51Z) - FedPT: Federated Proxy-Tuning of Large Language Models on Resource-Constrained Edge Devices [10.01451891927236]
Federated Proxy-Tuning (FedPT) is a novel framework for federated fine-tuning of black-box large LMs.
FedPT can significantly reduce computation, communication, and memory overhead while maintaining competitive performance.
arXiv Detail & Related papers (2024-10-01T03:20:39Z) - FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves performance comparable to the source model, retaining up to 85% of model performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z) - R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models [83.77114091471822]
Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML).
A challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming.
This is particularly pronounced for word embedding parameters in large language models (LLMs), which are crucial for language understanding.
A physical layer framework is developed for resilient SFL with LLMs (R-SFLLM) over wireless networks.
arXiv Detail & Related papers (2024-07-16T12:21:29Z) - Save It All: Enabling Full Parameter Tuning for Federated Large Language Models via Cycle Block Gradient Descent [15.463595798992621]
Large language models (LLMs) have revolutionized the deep learning paradigm, yielding impressive results across a wide array of tasks.
Existing solutions make the unrealistic assumption that the entire model is exchanged for training.
We introduce a novel method for the efficient training and fine-tuning of LLMs in FL, with minimal resource consumption.
arXiv Detail & Related papers (2024-06-17T03:49:44Z) - MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT [87.4910758026772]
"Bigger the better" has been the predominant trend in recent Large Language Models (LLMs) development.
This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient Small Language Models (SLMs) for resource constrained devices.
arXiv Detail & Related papers (2024-02-26T18:59:03Z) - EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism [70.07661254213181]
We present EE-LLM, a framework for large-scale training and inference of early-exit large language models (LLMs)
Built upon Megatron-LM, EE-LLM implements a variety of algorithmic innovations and performance optimizations tailored to early exiting.
Our analytical and empirical study shows that EE-LLM achieves great training efficiency with negligible computational overhead.
arXiv Detail & Related papers (2023-12-08T09:31:50Z) - Optimization-driven Machine Learning for Intelligent Reflecting Surfaces Assisted Wireless Networks [82.33619654835348]
Intelligent reflecting surface (IRS) has been employed to reshape the wireless channels by controlling individual scattering elements' phase shifts.
Due to the large size of scattering elements, the passive beamforming is typically challenged by the high computational complexity.
In this article, we focus on machine learning (ML) approaches for performance optimization in IRS-assisted wireless networks.
arXiv Detail & Related papers (2020-08-29T08:39:43Z)