A Fast, Performant, Secure Distributed Training Framework For Large
Language Model
- URL: http://arxiv.org/abs/2401.09796v2
- Date: Fri, 19 Jan 2024 15:09:45 GMT
- Title: A Fast, Performant, Secure Distributed Training Framework For Large
Language Model
- Authors: Wei Huang, Yinggui Wang, Anda Cheng, Aihui Zhou, Chaofan Yu, Lei Wang
- Abstract summary: We propose a secure distributed LLM based on model slicing.
We deploy the Trusted Execution Environment (TEE) on both the client and server side.
Secure communication is executed in the TEE and general environments through lightweight encryption.
- Score: 8.547104574876887
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The distributed (federated) LLM is an important approach to co-training a
domain-specific LLM on siloed data. However, malicious theft of model parameters
and data by the server or client side has become an urgent problem. In this
paper, we propose a secure distributed LLM based on model slicing. Specifically,
we deploy a Trusted Execution Environment (TEE) on both the client and server
sides and place the fine-tuned structure (LoRA or the embedding of P-tuning v2)
inside the TEE. Secure communication between the TEE and the general environment
is then carried out through lightweight encryption. To further reduce equipment
cost and improve model performance and accuracy, we propose a split fine-tuning
scheme: we split the LLM by layers and place the latter layers in a server-side
TEE (the client does not need a TEE). We then combine the proposed
Sparsification Parameter Fine-tuning (SPF) with the LoRA part to improve
downstream-task accuracy. Extensive experiments show that our method guarantees
accuracy while maintaining security.
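For intuition, here is a minimal, hedged sketch of the split fine-tuning scheme: the first layers run on the client, activations cross the boundary under a lightweight mask, and the latter layers plus the trainable LoRA adapters live inside the server-side TEE. All names (`light_encrypt`, `LoRALinear`, the toy layers) are illustrative assumptions, not the authors' code.

```python
# A minimal sketch of the split scheme described above, assuming a model split
# by layers. The "encryption" is a toy one-time-pad-style mask, NOT a real
# cipher, standing in for the paper's lightweight encryption.
import torch
import torch.nn as nn

def light_encrypt(x: torch.Tensor, key: torch.Tensor) -> torch.Tensor:
    return x + key   # placeholder masking, not cryptographically secure

def light_decrypt(x: torch.Tensor, key: torch.Tensor) -> torch.Tensor:
    return x - key

class LoRALinear(nn.Module):
    """Frozen base weight plus a trainable low-rank update (the structure the
    paper keeps inside the TEE)."""
    def __init__(self, d: int, r: int = 8):
        super().__init__()
        self.base = nn.Linear(d, d)
        self.base.weight.requires_grad_(False)
        self.A = nn.Linear(d, r, bias=False)   # trainable
        self.B = nn.Linear(r, d, bias=False)   # trainable
    def forward(self, x):
        return self.base(x) + self.B(self.A(x))

# Client side (no TEE needed in the split scheme): the first k layers.
client_layers = nn.Sequential(nn.Linear(64, 64), nn.Linear(64, 64))

# Server-side TEE: the latter layers, with LoRA adapters kept inside.
server_layers = nn.Sequential(LoRALinear(64), LoRALinear(64))

x = torch.randn(1, 64)            # client's private input
key = torch.randn_like(x)         # lightweight key shared with the TEE

h = client_layers(x)              # early layers run on the client
ct = light_encrypt(h, key)        # activations leave the client masked
h_tee = light_decrypt(ct, key)    # unmasking happens inside the TEE
out = server_layers(h_tee)        # latter layers + LoRA run in the TEE
print(out.shape)
```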
Related papers
- Mixture of Attentions For Speculative Decoding [17.344416130742232]
Speculative decoding (SD) leverages smaller models to efficiently propose future tokens, which are then verified in parallel by the large language model (LLM).
We identify several limitations of SD models including the lack of on-policyness during training and partial observability.
We propose a more grounded architecture for small models by introducing a Mixture of Attentions for SD.
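As background for this entry, below is a minimal sketch of the generic draft-and-verify loop that speculative decoding relies on, with greedy verification as a simplification of the usual rejection-sampling acceptance rule; `draft_next` and `target_logits` are hypothetical stand-ins for real models, not the paper's architecture.

```python
# Generic speculative-decoding loop: a cheap draft model proposes k tokens,
# the large model scores them all in one parallel forward pass.
import torch

def speculative_step(prefix, draft_next, target_logits, k=4):
    # 1) Draft model proposes k tokens autoregressively (cheap).
    proposal, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_next(ctx)
        proposal.append(t)
        ctx.append(t)
    # 2) Target model scores prefix + proposal in a single pass.
    logits = target_logits(prefix + proposal)   # [seq_len, vocab]
    accepted = []
    for i, t in enumerate(proposal):
        # Greedy verification: keep the drafted token only if the target
        # model agrees; otherwise take the target's token and stop.
        pos = len(prefix) + i - 1               # row predicting token i
        top = int(torch.argmax(logits[pos]))
        accepted.append(top if top != t else t)
        if top != t:
            break
    return prefix + accepted

# Toy usage with stand-in "models" over a 10-token vocabulary.
V = 10
draft = lambda ctx: (sum(ctx) + 1) % V
target = lambda ctx: torch.randn(len(ctx), V)
print(speculative_step([1, 2, 3], draft, target))
```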
arXiv Detail & Related papers (2024-10-04T10:25:52Z)
- SplitLoRA: A Split Parameter-Efficient Fine-Tuning Framework for Large Language Models [23.1321591734785]
SplitLoRA is built on the split federated learning (SFL) framework, amalgamating the advantages of parallel training from FL and model splitting from SL.
SplitLoRA is the inaugural open-source benchmark for SL LLM fine-tuning.
arXiv Detail & Related papers (2024-07-01T04:13:25Z)
- FedBiOT: LLM Local Fine-tuning in Federated Learning without Full Model [48.33280660752336]
Large language models (LLMs) show amazing performance on many domain-specific tasks after fine-tuning with some appropriate data.
However, much domain-specific data is privately distributed across multiple owners.
We introduce FedBiOT, a resource-efficient LLM fine-tuning approach for federated learning.
arXiv Detail & Related papers (2024-06-25T16:45:47Z)
- Safely Learning with Private Data: A Federated Learning Framework for Large Language Model [3.1077263218029105]
Federated learning (FL) is an ideal solution for training models with distributed private data.
Traditional frameworks like FedAvg are unsuitable for large language models (LLMs).
We propose FL-GLM, which prevents data leakage caused by both server-side and peer-client attacks.
arXiv Detail & Related papers (2024-06-21T06:43:15Z)
- Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models [51.20476412037321]
Fine-tuning large language models (LLMs) is necessary to enhance their performance for customized datasets, domain-specific tasks, or other private needs.
Safe LoRA is a one-liner patch to the original LoRA implementation by introducing the projection of LoRA weights from selected layers to the safety-aligned subspace.
Our experiments demonstrate that when fine-tuning on purely malicious data, Safe LoRA retains similar safety performance as the original aligned model.
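The projection step is concrete enough to sketch. In the hedged version below, the raw LoRA update dW = B @ A for a selected layer is projected onto a low-rank "safety-aligned" subspace before being merged; deriving that subspace from the top singular directions of (W_aligned - W_unaligned) is an illustrative choice, not necessarily the paper's exact construction.

```python
# Project a LoRA weight update onto a safety-aligned subspace (sketch).
import torch

def project_lora_update(B, A, W_aligned, W_unaligned, k=4):
    V = W_aligned - W_unaligned        # how alignment changed the weights
    U, S, Vh = torch.linalg.svd(V)     # SVD of the alignment shift
    Q = U[:, :k]                       # top-k directions span the subspace
    dW = B @ A                         # raw LoRA update for this layer
    return Q @ (Q.T @ dW)              # component inside the subspace

d_out, d_in, r = 16, 16, 4
B, A = torch.randn(d_out, r), torch.randn(r, d_in)
W_al, W_un = torch.randn(d_out, d_in), torch.randn(d_out, d_in)
dW_safe = project_lora_update(B, A, W_al, W_un)
print(dW_safe.shape)                   # merged as W_aligned + dW_safe
```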
arXiv Detail & Related papers (2024-05-27T05:04:05Z)
- Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration [70.09561665520043]
We propose a novel framework for multi-agent collaboration that introduces Reinforced Advantage feedback (ReAd) for efficient self-refinement of plans.
We provide theoretical analysis by extending advantage-weighted regression in reinforcement learning to multi-agent systems.
Experiments on Overcooked-AI and a difficult variant of RoCoBench show that ReAd surpasses baselines in success rate and significantly reduces the number of agent interaction steps.
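For context, the standard single-agent advantage-weighted regression objective that ReAd extends is, in common notation (the multi-agent extension itself is the paper's contribution and is not reproduced here):

```latex
\max_{\pi}\;
\mathbb{E}_{(s,a)\sim\mathcal{D}}
\Big[\log \pi(a \mid s)\,
\exp\!\big(\tfrac{1}{\beta}\,A(s,a)\big)\Big]
```

where A(s, a) is the advantage of action a in state s and beta is a temperature controlling how sharply high-advantage actions are upweighted.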
arXiv Detail & Related papers (2024-05-23T08:33:19Z)
- Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes [53.4856038354195]
Pre-trained large language models (LLMs) need fine-tuning to improve their responsiveness to natural language instructions.
FedKSeed employs zeroth-order optimization with a finite set of random seeds.
It significantly reduces transmission requirements between the server and clients to just a few random seeds.
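The communication trick is concrete enough to sketch: because every perturbation can be regenerated from a seed in a small shared pool, a client only transmits (seed, scalar) pairs instead of gradients or weights. This toy single-tensor version is an assumption-laden illustration (`zo_step`, `SEED_POOL` are made-up names), not the FedKSeed implementation.

```python
# Seed-replayable zeroth-order updates: each step is fully described by a
# seed index and one scalar, which is all that needs to be transmitted.
import torch

SEED_POOL = list(range(8))            # the finite, shared set of seeds

def perturbation(shape, seed):
    g = torch.Generator().manual_seed(seed)
    return torch.randn(shape, generator=g)

def zo_step(params, loss_fn, seed, eps=1e-3, lr=1e-2):
    z = perturbation(params.shape, seed)
    # Two-point zeroth-order gradient estimate along direction z.
    g_hat = (loss_fn(params + eps * z) - loss_fn(params - eps * z)) / (2 * eps)
    return seed, -lr * g_hat          # only this pair is transmitted

def apply_update(params, seed, scale):
    # Anyone holding the seed pool can replay the update from two numbers.
    return params + scale * perturbation(params.shape, seed)

# Toy usage: minimize ||w - 3||^2 with seed-replayable updates.
w = torch.zeros(4)
loss = lambda p: ((p - 3.0) ** 2).sum()
for t in range(100):
    seed, scale = zo_step(w, loss, SEED_POOL[t % len(SEED_POOL)])
    w = apply_update(w, seed, scale)
print(w)                              # drifts toward 3.0
```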
arXiv Detail & Related papers (2023-12-11T13:03:21Z)
- LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
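A minimal sketch of what removing "coupled structures" means on one MLP block: the up- and down-projections share hidden units, so a unit must be pruned from both at once. The weight-norm importance score below is a simple stand-in for LLM-Pruner's actual gradient-based criterion.

```python
# Structural pruning of coupled MLP projections (illustrative only).
import torch
import torch.nn as nn

def prune_mlp(up: nn.Linear, down: nn.Linear, keep_ratio=0.5):
    # Score each hidden unit by the norms of its coupled weights.
    score = up.weight.norm(dim=1) * down.weight.norm(dim=0)
    k = int(keep_ratio * score.numel())
    keep = torch.topk(score, k).indices.sort().values
    new_up = nn.Linear(up.in_features, k)
    new_down = nn.Linear(k, down.out_features)
    with torch.no_grad():
        new_up.weight.copy_(up.weight[keep])       # prune rows of up-proj
        new_up.bias.copy_(up.bias[keep])
        new_down.weight.copy_(down.weight[:, keep])  # matching columns
        new_down.bias.copy_(down.bias)
    return new_up, new_down

up, down = nn.Linear(64, 256), nn.Linear(256, 64)
up2, down2 = prune_mlp(up, down)
print(up2.weight.shape, down2.weight.shape)  # half the hidden units remain
```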
arXiv Detail & Related papers (2023-05-19T12:10:53Z)
- Subspace based Federated Unlearning [75.90552823500633]
Federated unlearning aims to remove a specified target client's contribution in federated learning (FL) to satisfy the user's right to be forgotten.
Most existing federated unlearning algorithms require the server to store the history of the parameter updates.
We propose a simple yet effective subspace-based federated unlearning method, dubbed SFU, that lets the global model perform gradient ascent.
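A minimal sketch of the subspace idea: the unlearning update is gradient ascent on the target client's data, projected away from a subspace representing the remaining clients so that their contributions survive. The construction of `R` below is purely illustrative; the paper's exact procedure may differ.

```python
# Subspace-constrained gradient ascent for unlearning (sketch).
import torch

def unlearning_step(w, grad_target, R, lr=0.1):
    # Orthonormal basis of the remaining clients' representation subspace.
    Q, _ = torch.linalg.qr(R)
    # Remove the component of the ascent direction lying in that subspace.
    g_orth = grad_target - Q @ (Q.T @ grad_target)
    return w + lr * g_orth            # ascent, not descent

d = 8
w = torch.randn(d)
g = torch.randn(d)                    # gradient on the target client's data
R = torch.randn(d, 3)                 # toy stand-in for other clients' reps
print(unlearning_step(w, g, R))
```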
arXiv Detail & Related papers (2023-02-24T04:29:44Z)