A Fast, Performant, Secure Distributed Training Framework For Large
Language Model
- URL: http://arxiv.org/abs/2401.09796v2
- Date: Fri, 19 Jan 2024 15:09:45 GMT
- Title: A Fast, Performant, Secure Distributed Training Framework For Large
Language Model
- Authors: Wei Huang, Yinggui Wang, Anda Cheng, Aihui Zhou, Chaofan Yu, Lei Wang
- Abstract summary: We propose a secure distributed LLM based on model slicing.
We deploy the Trusted Execution Environment (TEE) on both the client and server side.
Secure communication is executed in the TEE and general environments through lightweight encryption.
- Score: 8.547104574876887
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The distributed (federated) LLM is an important approach to co-training a
domain-specific LLM on siloed data. However, malicious theft of model parameters
and data by the server or client side has become an urgent problem. In this
paper, we propose a secure distributed LLM based on model slicing. Specifically,
we deploy a Trusted Execution Environment (TEE) on both the client and server
sides and place the fine-tuned structure (LoRA or the embedding of P-tuning v2)
inside the TEE. Secure communication between the TEE and the general environment
is then carried out through lightweight encryption. To further reduce equipment
cost and improve model performance and accuracy, we propose a split fine-tuning
scheme: we split the LLM by layers and place the latter layers in a server-side
TEE (the client does not need a TEE). We then combine the proposed
Sparsification Parameter Fine-tuning (SPF) with the LoRA part to improve
downstream-task accuracy. Extensive experiments show that our method guarantees
accuracy while maintaining security.
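For intuition, here is a minimal, hedged sketch of the split fine-tuning scheme: the first layers run on the client, activations cross the boundary under a lightweight mask, and the latter layers plus the trainable LoRA adapters live inside the server-side TEE. All names (`light_encrypt`, `LoRALinear`, the toy layers) are illustrative assumptions, not the authors' code.

```python
# A minimal sketch of the split scheme described above, assuming a model split
# by layers. The "encryption" is a toy one-time-pad-style mask, NOT a real
# cipher, standing in for the paper's lightweight encryption.
import torch
import torch.nn as nn

def light_encrypt(x: torch.Tensor, key: torch.Tensor) -> torch.Tensor:
    return x + key   # placeholder masking, not cryptographically secure

def light_decrypt(x: torch.Tensor, key: torch.Tensor) -> torch.Tensor:
    return x - key

class LoRALinear(nn.Module):
    """Frozen base weight plus a trainable low-rank update (the structure the
    paper keeps inside the TEE)."""
    def __init__(self, d: int, r: int = 8):
        super().__init__()
        self.base = nn.Linear(d, d)
        self.base.weight.requires_grad_(False)
        self.A = nn.Linear(d, r, bias=False)   # trainable
        self.B = nn.Linear(r, d, bias=False)   # trainable
    def forward(self, x):
        return self.base(x) + self.B(self.A(x))

# Client side (no TEE needed in the split scheme): the first k layers.
client_layers = nn.Sequential(nn.Linear(64, 64), nn.Linear(64, 64))

# Server-side TEE: the latter layers, with LoRA adapters kept inside.
server_layers = nn.Sequential(LoRALinear(64), LoRALinear(64))

x = torch.randn(1, 64)            # client's private input
key = torch.randn_like(x)         # lightweight key shared with the TEE

h = client_layers(x)              # early layers run on the client
ct = light_encrypt(h, key)        # activations leave the client masked
h_tee = light_decrypt(ct, key)    # unmasking happens inside the TEE
out = server_layers(h_tee)        # latter layers + LoRA run in the TEE
print(out.shape)
```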
Related papers
- Mixture of Attentions For Speculative Decoding [17.344416130742232]
Speculative decoding (SD) leverages smaller models to efficiently propose future tokens, which are then verified in parallel by the large language model (LLM).
We identify several limitations of SD models including the lack of on-policyness during training and partial observability.
We propose a more grounded architecture for small models by introducing a Mixture of Attentions for SD.
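As background for this entry, below is a minimal sketch of the generic draft-and-verify loop that speculative decoding relies on, with greedy verification as a simplification of the usual rejection-sampling acceptance rule; `draft_next` and `target_logits` are hypothetical stand-ins for real models, not the paper's architecture.

```python
# Generic speculative-decoding loop: a cheap draft model proposes k tokens,
# the large model scores them all in one parallel forward pass.
import torch

def speculative_step(prefix, draft_next, target_logits, k=4):
    # 1) Draft model proposes k tokens autoregressively (cheap).
    proposal, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_next(ctx)
        proposal.append(t)
        ctx.append(t)
    # 2) Target model scores prefix + proposal in a single pass.
    logits = target_logits(prefix + proposal)   # [seq_len, vocab]
    accepted = []
    for i, t in enumerate(proposal):
        # Greedy verification: keep the drafted token only if the target
        # model agrees; otherwise take the target's token and stop.
        pos = len(prefix) + i - 1               # row predicting token i
        top = int(torch.argmax(logits[pos]))
        accepted.append(top if top != t else t)
        if top != t:
            break
    return prefix + accepted

# Toy usage with stand-in "models" over a 10-token vocabulary.
V = 10
draft = lambda ctx: (sum(ctx) + 1) % V
target = lambda ctx: torch.randn(len(ctx), V)
print(speculative_step([1, 2, 3], draft, target))
```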
arXiv Detail & Related papers (2024-10-04T10:25:52Z)
- SplitLoRA: A Split Parameter-Efficient Fine-Tuning Framework for Large Language Models [23.1321591734785]
SplitLoRA is built on the split federated learning (SFL) framework, amalgamating the advantages of parallel training from FL and model splitting from SL.
SplitLoRA is the inaugural open-source benchmark for SL LLM fine-tuning.
arXiv Detail & Related papers (2024-07-01T04:13:25Z)
- FedBiOT: LLM Local Fine-tuning in Federated Learning without Full Model [48.33280660752336]
Large language models (LLMs) show amazing performance on many domain-specific tasks after fine-tuning with some appropriate data.
However, much domain-specific data is privately distributed across multiple owners.
We introduce FedBiOT, a resource-efficient LLM fine-tuning approach for federated learning.
arXiv Detail & Related papers (2024-06-25T16:45:47Z)
- Safely Learning with Private Data: A Federated Learning Framework for Large Language Model [3.1077263218029105]
Federated learning (FL) is an ideal solution for training models with distributed private data.
Traditional frameworks like FedAvg are unsuitable for large language models (LLMs).
We propose FL-GLM, which prevents data leakage caused by both server-side and peer-client attacks.
arXiv Detail & Related papers (2024-06-21T06:43:15Z)
- Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models [51.20476412037321]
Fine-tuning large language models (LLMs) is necessary to enhance their performance for customized datasets, domain-specific tasks, or other private needs.
Safe LoRA is a one-liner patch to the original LoRA implementation by introducing the projection of LoRA weights from selected layers to the safety-aligned subspace.
Our experiments demonstrate that when fine-tuning on purely malicious data, Safe LoRA retains similar safety performance as the original aligned model.
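The projection step is concrete enough to sketch. In the hedged version below, the raw LoRA update dW = B @ A for a selected layer is projected onto a low-rank "safety-aligned" subspace before being merged; deriving that subspace from the top singular directions of (W_aligned - W_unaligned) is an illustrative choice, not necessarily the paper's exact construction.

```python
# Project a LoRA weight update onto a safety-aligned subspace (sketch).
import torch

def project_lora_update(B, A, W_aligned, W_unaligned, k=4):
    V = W_aligned - W_unaligned        # how alignment changed the weights
    U, S, Vh = torch.linalg.svd(V)     # SVD of the alignment shift
    Q = U[:, :k]                       # top-k directions span the subspace
    dW = B @ A                         # raw LoRA update for this layer
    return Q @ (Q.T @ dW)              # component inside the subspace

d_out, d_in, r = 16, 16, 4
B, A = torch.randn(d_out, r), torch.randn(r, d_in)
W_al, W_un = torch.randn(d_out, d_in), torch.randn(d_out, d_in)
dW_safe = project_lora_update(B, A, W_al, W_un)
print(dW_safe.shape)                   # merged as W_aligned + dW_safe
```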
arXiv Detail & Related papers (2024-05-27T05:04:05Z)
- Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration [70.09561665520043]
We propose a novel framework for multi-agent collaboration that introduces Reinforced Advantage feedback (ReAd) for efficient self-refinement of plans.
We provide theoretical analysis by extending advantage-weighted regression in reinforcement learning to multi-agent systems.
Experiments on Overcooked-AI and a difficult variant of RoCoBench show that ReAd surpasses baselines in success rate and significantly reduces the number of agent interaction steps.
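For context, the standard single-agent advantage-weighted regression objective that ReAd extends is, in common notation (the multi-agent extension itself is the paper's contribution and is not reproduced here):

```latex
\max_{\pi}\;
\mathbb{E}_{(s,a)\sim\mathcal{D}}
\Big[\log \pi(a \mid s)\,
\exp\!\big(\tfrac{1}{\beta}\,A(s,a)\big)\Big]
```

where A(s, a) is the advantage of action a in state s and beta is a temperature controlling how sharply high-advantage actions are upweighted.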
arXiv Detail & Related papers (2024-05-23T08:33:19Z)
- Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes [53.4856038354195]
Pre-trained large language models (LLMs) need fine-tuning to improve their responsiveness to natural language instructions.
FedKSeed employs zeroth-order optimization with a finite set of random seeds.
It significantly reduces transmission requirements between the server and clients to just a few random seeds.
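The communication trick is concrete enough to sketch: because every perturbation can be regenerated from a seed in a small shared pool, a client only transmits (seed, scalar) pairs instead of gradients or weights. This toy single-tensor version is an assumption-laden illustration (`zo_step`, `SEED_POOL` are made-up names), not the FedKSeed implementation.

```python
# Seed-replayable zeroth-order updates: each step is fully described by a
# seed index and one scalar, which is all that needs to be transmitted.
import torch

SEED_POOL = list(range(8))            # the finite, shared set of seeds

def perturbation(shape, seed):
    g = torch.Generator().manual_seed(seed)
    return torch.randn(shape, generator=g)

def zo_step(params, loss_fn, seed, eps=1e-3, lr=1e-2):
    z = perturbation(params.shape, seed)
    # Two-point zeroth-order gradient estimate along direction z.
    g_hat = (loss_fn(params + eps * z) - loss_fn(params - eps * z)) / (2 * eps)
    return seed, -lr * g_hat          # only this pair is transmitted

def apply_update(params, seed, scale):
    # Anyone holding the seed pool can replay the update from two numbers.
    return params + scale * perturbation(params.shape, seed)

# Toy usage: minimize ||w - 3||^2 with seed-replayable updates.
w = torch.zeros(4)
loss = lambda p: ((p - 3.0) ** 2).sum()
for t in range(100):
    seed, scale = zo_step(w, loss, SEED_POOL[t % len(SEED_POOL)])
    w = apply_update(w, seed, scale)
print(w)                              # drifts toward 3.0
```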
arXiv Detail & Related papers (2023-12-11T13:03:21Z)
- LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
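A minimal sketch of what removing "coupled structures" means on one MLP block: the up- and down-projections share hidden units, so a unit must be pruned from both at once. The weight-norm importance score below is a simple stand-in for LLM-Pruner's actual gradient-based criterion.

```python
# Structural pruning of coupled MLP projections (illustrative only).
import torch
import torch.nn as nn

def prune_mlp(up: nn.Linear, down: nn.Linear, keep_ratio=0.5):
    # Score each hidden unit by the norms of its coupled weights.
    score = up.weight.norm(dim=1) * down.weight.norm(dim=0)
    k = int(keep_ratio * score.numel())
    keep = torch.topk(score, k).indices.sort().values
    new_up = nn.Linear(up.in_features, k)
    new_down = nn.Linear(k, down.out_features)
    with torch.no_grad():
        new_up.weight.copy_(up.weight[keep])       # prune rows of up-proj
        new_up.bias.copy_(up.bias[keep])
        new_down.weight.copy_(down.weight[:, keep])  # matching columns
        new_down.bias.copy_(down.bias)
    return new_up, new_down

up, down = nn.Linear(64, 256), nn.Linear(256, 64)
up2, down2 = prune_mlp(up, down)
print(up2.weight.shape, down2.weight.shape)  # half the hidden units remain
```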
arXiv Detail & Related papers (2023-05-19T12:10:53Z)
- Subspace based Federated Unlearning [75.90552823500633]
Federated unlearning aims to remove a specified target client's contribution in federated learning (FL) to satisfy the user's right to be forgotten.
Most existing federated unlearning algorithms require the server to store the history of the parameter updates.
We propose a simple yet effective subspace-based federated unlearning method, dubbed SFU, that lets the global model perform gradient ascent.
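A minimal sketch of the subspace idea: the unlearning update is gradient ascent on the target client's data, projected away from a subspace representing the remaining clients so that their contributions survive. The construction of `R` below is purely illustrative; the paper's exact procedure may differ.

```python
# Subspace-constrained gradient ascent for unlearning (sketch).
import torch

def unlearning_step(w, grad_target, R, lr=0.1):
    # Orthonormal basis of the remaining clients' representation subspace.
    Q, _ = torch.linalg.qr(R)
    # Remove the component of the ascent direction lying in that subspace.
    g_orth = grad_target - Q @ (Q.T @ grad_target)
    return w + lr * g_orth            # ascent, not descent

d = 8
w = torch.randn(d)
g = torch.randn(d)                    # gradient on the target client's data
R = torch.randn(d, 3)                 # toy stand-in for other clients' reps
print(unlearning_step(w, g, R))
```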
arXiv Detail & Related papers (2023-02-24T04:29:44Z)