FedPT: Federated Proxy-Tuning of Large Language Models on Resource-Constrained Edge Devices
- URL: http://arxiv.org/abs/2410.00362v1
- Date: Tue, 1 Oct 2024 03:20:39 GMT
- Title: FedPT: Federated Proxy-Tuning of Large Language Models on Resource-Constrained Edge Devices
- Authors: Zhidong Gao, Yu Zhang, Zhenxiao Zhang, Yanmin Gong, Yuanxiong Guo
- Abstract summary: Federated Proxy-Tuning (FedPT) is a novel framework for federated fine-tuning of black-box large LMs.
FedPT can significantly reduce computation, communication, and memory overhead while maintaining competitive performance.
- Score: 10.01451891927236
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Despite demonstrating superior performance across a variety of linguistic tasks, pre-trained large language models (LMs) often require fine-tuning on specific datasets to effectively address different downstream tasks. However, fine-tuning these LMs for downstream tasks necessitates collecting data from individuals, which raises significant privacy concerns. Federated learning (FL) has emerged as the de facto solution, enabling collaborative model training without sharing raw data. While promising, federated fine-tuning of large LMs faces significant challenges, including restricted access to model parameters and high computation, communication, and memory overhead. To address these challenges, this paper introduces Federated Proxy-Tuning (FedPT), a novel framework for federated fine-tuning of black-box large LMs, requiring access only to their predictions over the output vocabulary instead of their parameters. Specifically, devices in FedPT first collaboratively tune a smaller LM, and then the server combines the knowledge learned by the tuned small LM with the knowledge learned by the larger pre-trained LM to construct a large proxy-tuned LM that can reach the performance of directly tuned large LMs. The experimental results demonstrate that FedPT can significantly reduce computation, communication, and memory overhead while maintaining competitive performance compared to directly federated fine-tuning of large LMs. FedPT offers a promising solution for efficient, privacy-preserving fine-tuning of large LMs on resource-constrained devices, broadening the accessibility and applicability of state-of-the-art large LMs.
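The abstract specifies the interface (predictions over the output vocabulary) but not the combination step itself. A common way to realize proxy-tuning, which FedPT builds on, is to shift the large model's next-token logits by the difference between the tuned and untuned small model's logits. The sketch below illustrates only that arithmetic; the random placeholder logits stand in for real model outputs, and this is not the authors' implementation.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def proxy_tuned_probs(large_logits, small_tuned_logits, small_base_logits):
    # Shift the large model's logits by the small model's "tuning delta",
    # then renormalize over the shared output vocabulary.
    return softmax(large_logits + (small_tuned_logits - small_base_logits))

# Toy example over a 5-token vocabulary; the logits are random placeholders
# standing in for the three models' next-token predictions.
rng = np.random.default_rng(0)
vocab_size = 5
probs = proxy_tuned_probs(rng.normal(size=vocab_size),
                          rng.normal(size=vocab_size),
                          rng.normal(size=vocab_size))
print(probs, probs.sum())  # a valid distribution (sums to 1)
```

Because only vocabulary-level predictions enter this computation, the large model can remain a black box whose parameters are never exposed or updated.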
Related papers
- FedDTPT: Federated Discrete and Transferable Prompt Tuning for Black-Box Large Language Models [14.719919025265224]
Fine-tuning large language models (LLMs) with data from specific scenarios poses privacy leakage risks.
We propose FedDTPT, the first federated discrete and transferable prompt-tuning method for black-box large language models.
Our approach achieves higher accuracy, reduced communication overhead, and robustness to non-iid data in a black-box setting.
arXiv Detail & Related papers (2024-11-01T19:19:23Z)
- Communication-Efficient and Tensorized Federated Fine-Tuning of Large Language Models [24.07770417615704]
We introduce FedTT and FedTT+, tensorized methods for adapting Large Language Models in federated settings.
FedTT is versatile and can be applied to both cross-silo FL and large-scale cross-device FL.
Our proposed methods successfully address data heterogeneity challenges and perform on par with or even better than existing federated PEFT approaches.
arXiv Detail & Related papers (2024-10-16T23:50:39Z)
- R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models [83.77114091471822]
Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML).
A challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming.
This is particularly pronounced for word embedding parameters in large language models (LLMs), which are crucial for language understanding.
A physical layer framework is developed for resilient SFL with LLMs (R-SFLLM) over wireless networks.
arXiv Detail & Related papers (2024-07-16T12:21:29Z)
- Reliable, Adaptable, and Attributable Language Models with Retrieval [144.26890121729514]
Parametric language models (LMs) are trained on vast amounts of web data.
They face practical challenges such as hallucinations, difficulty in adapting to new data distributions, and a lack of verifiability.
We advocate for retrieval-augmented LMs to replace parametric LMs as the next generation of LMs.
arXiv Detail & Related papers (2024-03-05T18:22:33Z)
- MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT [87.4910758026772]
"Bigger the better" has been the predominant trend in recent Large Language Models (LLMs) development.
This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient Small Language Models (SLMs) for resource-constrained devices.
arXiv Detail & Related papers (2024-02-26T18:59:03Z)
- Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes [53.4856038354195]
Pre-trained large language models (LLMs) need fine-tuning to improve their responsiveness to natural language instructions.
FedKSeed employs zeroth-order optimization with a finite set of random seeds.
It significantly reduces transmission requirements between the server and clients to just a few random seeds (see the seed-based sketch after this list).
arXiv Detail & Related papers (2023-12-11T13:03:21Z)
- FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models [22.29061931122386]
Pre-trained language models (PLMs) have revolutionized the NLP landscape, achieving stellar performance across diverse tasks.
This paper introduces Federated Black-box Prompt Tuning (FedBPT), a framework that tunes prompts without requiring clients to access the model parameters.
arXiv Detail & Related papers (2023-10-02T16:43:14Z)
- Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to simultaneous machine translation (SimulMT) tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MuST-C dataset.
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z)
- FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning [70.38817963253034]
This paper first discusses the challenges of federated fine-tuning of LLMs, and introduces our package FS-LLM as a main contribution.
We provide comprehensive federated parameter-efficient fine-tuning algorithm implementations and versatile programming interfaces for future extension in FL scenarios.
We conduct extensive experiments to validate the effectiveness of FS-LLM and benchmark advanced LLMs with state-of-the-art parameter-efficient fine-tuning algorithms in FL settings.
arXiv Detail & Related papers (2023-09-01T09:40:36Z)
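For context on the FedKSeed entry above: the reason a few seeds suffice is that a shared random seed deterministically regenerates a perturbation direction, so a client can report one scalar per seed instead of a full gradient vector. The following is a hypothetical, minimal sketch of that idea using a two-point (SPSA-style) zeroth-order estimate on a toy objective; the function names, learning rate, and seed pool are illustrative assumptions, not the FedKSeed algorithm.

```python
import numpy as np

def direction(seed, dim):
    # The same seed deterministically regenerates the same random direction,
    # so only the seed (not the vector) needs to be communicated.
    return np.random.default_rng(seed).standard_normal(dim)

def client_zo_grad(w, loss_fn, seed, eps=1e-3):
    # Two-point zeroth-order estimate of the directional derivative along one
    # seeded direction; the client returns just (seed, scalar gradient).
    z = direction(seed, w.size)
    return seed, (loss_fn(w + eps * z) - loss_fn(w - eps * z)) / (2 * eps)

def apply_update(w, seed, g, lr=0.1):
    # The server (or any peer) rebuilds the direction from the seed alone.
    return w - lr * g * direction(seed, w.size)

# Toy quadratic loss as a placeholder for a model's training loss.
loss = lambda v: float(np.sum(v ** 2))
w = np.ones(4)
for seed in (3, 7, 11):          # a small, pre-agreed pool of seeds
    s, g = client_zo_grad(w, loss, seed)
    w = apply_update(w, s, g)
print(w, loss(w))                # the toy loss typically shrinks each step
```

The communication payload per step is a seed and one scalar, independent of the model's parameter count, which is what makes kilobyte-scale federated rounds plausible.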
This list is automatically generated from the titles and abstracts of the papers in this site.