Related papers: PriFFT: Privacy-preserving Federated Fine-tuning of Large Language Models via Function Secret Sharing

URL: http://arxiv.org/abs/2503.03146v1
Date: Wed, 05 Mar 2025 03:41:57 GMT
Title: PriFFT: Privacy-preserving Federated Fine-tuning of Large Language Models via Function Secret Sharing
Authors: Zhichao You, Xuewen Dong, Ke Cheng, Xutong Mu, Jiaxuan Fu, Shiyang Ma, Qiang Qu, Yulong Shen
Abstract summary: Fine-tuning large language models (LLMs) raises privacy concerns due to the risk of exposing sensitive training data. Recent studies show that adversaries can still infer private information from model updates in FL. We propose PriFFT, a privacy-preserving federated fine-tuning mechanism.
Score: 20.148411915688175
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Fine-tuning large language models (LLMs) raises privacy concerns due to the risk of exposing sensitive training data. Federated learning (FL) mitigates this risk by keeping training samples on local devices, but recent studies show that adversaries can still infer private information from model updates in FL. Additionally, LLM parameters are typically shared publicly during federated fine-tuning, while developers are often reluctant to disclose these parameters, posing further security challenges. Motivated by these problems, we propose PriFFT, a privacy-preserving federated fine-tuning mechanism, to protect both the model updates and parameters. In PriFFT, clients and the server share model inputs and parameters by secret sharing, performing secure fine-tuning on shared values without accessing plaintext data. Because LLMs have a very large number of parameters, privacy-preserving federated fine-tuning invokes complex secure calculations and requires substantial communication and computation resources. To optimize the efficiency of privacy-preserving federated fine-tuning of LLMs, we introduce function secret-sharing protocols for various operations, including reciprocal calculation, tensor products, natural exponentiation, softmax, hyperbolic tangent, and dropout. The proposed protocols achieve up to a 4.02X speed improvement and a 7.19X reduction in communication overhead compared to the implementation based on existing secret sharing methods. In addition, PriFFT achieves a 2.23X speed improvement and a 4.08X reduction in communication overhead in privacy-preserving fine-tuning, without an accuracy drop, compared to the existing secret sharing methods.
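Illustrative note (not taken from the paper): the abstract describes computing on secret-shared values without accessing plaintext. The sketch below shows plain two-party additive secret sharing over a 64-bit ring with a fixed-point encoding, which is a common building block in this area. It is not PriFFT's function secret sharing protocol. The ring size, the number of fractional bits, and all function names are assumptions chosen for illustration.

```python
import secrets
from typing import Tuple

# Assumed parameters for this sketch (not from the paper).
RING_BITS = 64
MODULUS = 1 << RING_BITS
FRAC_BITS = 16  # fixed-point scale: a real x is stored as round(x * 2**16)


def encode(x: float) -> int:
    """Map a real number to a ring element using fixed-point encoding."""
    return round(x * (1 << FRAC_BITS)) % MODULUS


def decode(v: int) -> float:
    """Map a ring element back to a real number (upper half is negative)."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / (1 << FRAC_BITS)


def share(value: int) -> Tuple[int, int]:
    """Split a ring element into two additive shares.

    Each share on its own is uniformly random and reveals nothing
    about the value; only their sum modulo MODULUS does.
    """
    r = secrets.randbelow(MODULUS)
    return r, (value - r) % MODULUS


def reconstruct(s0: int, s1: int) -> int:
    """Recombine two additive shares into the original ring element."""
    return (s0 + s1) % MODULUS


def add_shares(a: int, b: int) -> int:
    """Add two shares held by the same party; needs no communication."""
    return (a + b) % MODULUS


if __name__ == "__main__":
    # Party 0 holds (x0, y0) and party 1 holds (x1, y1).
    x0, x1 = share(encode(1.5))
    y0, y1 = share(encode(-0.25))
    # Each party adds its own shares locally.
    z0, z1 = add_shares(x0, y0), add_shares(x1, y1)
    print(decode(reconstruct(z0, z1)))
```

Addition is local in this scheme, which is why linear layers are comparatively cheap. Non-linear operations such as the softmax and hyperbolic tangent named in the abstract require dedicated protocols, and that is the cost the paper targets.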

Related papers

Subgraph Federated Learning via Spectral Methods [52.40322201034717]
FedLap is a novel framework that captures inter-node dependencies while ensuring privacy and scalability. We provide a formal analysis of the privacy of FedLap, demonstrating that it preserves privacy.
arXiv Detail & Related papers (2025-10-29T16:22:32Z)
Federated Learning-Enabled Hybrid Language Models for Communication-Efficient Token Transmission [87.68447072141402]
Hybrid Language Models (HLMs) combine the low-latency efficiency of Small Language Models (SLMs) on edge devices with the high accuracy of Large Language Models (LLMs) on centralized servers. We propose FedHLM, a communication-efficient HLM framework that integrates uncertainty-aware inference with Federated Learning (FL).
arXiv Detail & Related papers (2025-06-30T02:56:11Z)
Efficient Full-Stack Private Federated Deep Learning with Post-Quantum Security [17.45950557331482]
Federated learning (FL) enables collaborative model training while preserving user data privacy by keeping data local. Despite these advantages, FL remains vulnerable to privacy attacks on user updates and model parameters during training and deployment. We introduce Beskar, a novel framework that provides post-quantum secure aggregation.
arXiv Detail & Related papers (2025-05-09T03:20:48Z)
FedRE: Robust and Effective Federated Learning with Privacy Preference [20.969342596181246]
Federated Learning (FL) employs gradient aggregation at the server for distributed training to prevent the privacy leakage of raw data. Private information can still be divulged through the analysis of uploaded gradients from clients. Existing methods fail to take practical issues into account by merely perturbing each sample with the same mechanism.
arXiv Detail & Related papers (2025-05-08T01:50:27Z)
FedRand: Enhancing Privacy in Federated Learning with Randomized LoRA Subparameter Updates [58.18162789618869]
Federated Learning (FL) is a widely used framework for training models in a decentralized manner. We propose the FedRand framework, which avoids disclosing the full set of client parameters. We empirically validate that FedRand improves robustness against MIAs compared to relevant baselines.
arXiv Detail & Related papers (2025-03-10T11:55:50Z)
FedEM: A Privacy-Preserving Framework for Concurrent Utility Preservation in Federated Learning [17.853502904387376]
Federated Learning (FL) enables collaborative training of models across distributed clients without sharing local data, addressing privacy concerns in decentralized systems. We propose Federated Error Minimization (FedEM), a novel algorithm that incorporates controlled perturbations through adaptive noise injection. Experimental results on benchmark datasets demonstrate that FedEM significantly reduces privacy risks and preserves model accuracy, achieving a robust balance between privacy protection and utility preservation.
arXiv Detail & Related papers (2025-03-08T02:48:00Z)
A New Federated Learning Framework Against Gradient Inversion Attacks [17.3044168511991]
Federated Learning (FL) aims to protect data privacy by enabling clients to collectively train machine learning models without sharing their raw data. Recent studies demonstrate that information exchanged during FL is subject to Gradient Inversion Attacks (GIA).
arXiv Detail & Related papers (2024-12-10T04:53:42Z)
Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy [55.357715095623554]
Local Differential Privacy (LDP) offers strong privacy guarantees without requiring users to trust external parties. We propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that enables feature-specific privacy quantification.
arXiv Detail & Related papers (2024-10-24T03:39:55Z)
Camel: Communication-Efficient and Maliciously Secure Federated Learning in the Shuffle Model of Differential Privacy [9.100955087185811]
Federated learning (FL) has rapidly become a compelling paradigm that enables multiple clients to jointly train a model by sharing only gradient updates for aggregation. In order to protect the gradient updates which could also be privacy-sensitive, there has been a line of work studying local differential privacy mechanisms. We present Camel, a new communication-efficient and maliciously secure FL framework in the shuffle model of DP.
arXiv Detail & Related papers (2024-10-04T13:13:44Z)
ACCESS-FL: Agile Communication and Computation for Efficient Secure Aggregation in Stable Federated Learning Networks [26.002975401820887]
Federated Learning (FL) is a distributed learning framework designed for privacy-aware applications. Traditional FL approaches risk exposing sensitive client data when plain model updates are transmitted to the server. Google's Secure Aggregation (SecAgg) protocol addresses this threat by employing a double-masking technique. We propose ACCESS-FL, a communication-and-computation-efficient secure aggregation method.
arXiv Detail & Related papers (2024-09-03T09:03:38Z)
PriRoAgg: Achieving Robust Model Aggregation with Minimum Privacy Leakage for Federated Learning [49.916365792036636]
Federated learning (FL) has recently gained significant momentum due to its potential to leverage large-scale distributed user data. The transmitted model updates can potentially leak sensitive user information, and the lack of central control of the local training process leaves the global model susceptible to malicious manipulations on model updates. We develop a general framework PriRoAgg, utilizing Lagrange coded computing and distributed zero-knowledge proof, to execute a wide range of robust aggregation algorithms while satisfying aggregated privacy.
arXiv Detail & Related papers (2024-07-12T03:18:08Z)
FewFedPIT: Towards Privacy-preserving and Few-shot Federated Instruction Tuning [54.26614091429253]
Federated instruction tuning (FedIT) is a promising solution, by consolidating collaborative training across multiple data owners. FedIT encounters limitations such as scarcity of instructional data and risk of exposure to training data extraction attacks. We propose FewFedPIT, designed to simultaneously enhance privacy protection and model performance of federated few-shot learning.
arXiv Detail & Related papers (2024-03-10T08:41:22Z)
Binary Federated Learning with Client-Level Differential Privacy [7.854806519515342]
Federated learning (FL) is a privacy-preserving collaborative learning framework. Existing FL systems typically adopt Federated Average (FedAvg) as the training algorithm. We propose a communication-efficient FL training algorithm with differential privacy guarantee.
arXiv Detail & Related papers (2023-08-07T06:07:04Z)
Theoretically Principled Federated Learning for Balancing Privacy and Utility [61.03993520243198]
We propose a general learning framework for the protection mechanisms that protects privacy via distorting model parameters. It can achieve personalized utility-privacy trade-off for each model parameter, on each client, at each communication round in federated learning.
arXiv Detail & Related papers (2023-05-24T13:44:02Z)
FedPDD: A Privacy-preserving Double Distillation Framework for Cross-silo Federated Recommendation [4.467445574103374]
Cross-platform recommendation aims to improve recommendation accuracy by gathering heterogeneous features from different platforms. Such cross-silo collaborations between platforms are restricted by increasingly stringent privacy protection regulations. We propose a novel privacy-preserving double distillation framework named FedPDD for cross-silo federated recommendation.
arXiv Detail & Related papers (2023-05-09T16:17:04Z)
Federated Nearest Neighbor Machine Translation [66.8765098651988]
In this paper, we propose a novel federated nearest neighbor (FedNN) machine translation framework. FedNN leverages one-round memorization-based interaction to share knowledge across different clients. Experiments show that FedNN significantly reduces computational and communication costs compared with FedAvg.
arXiv Detail & Related papers (2023-02-23T18:04:07Z)
FedLAP-DP: Federated Learning by Sharing Differentially Private Loss Approximations [53.268801169075836]
We propose FedLAP-DP, a novel privacy-preserving approach for federated learning. A formal privacy analysis demonstrates that FedLAP-DP incurs the same privacy costs as typical gradient-sharing schemes. Our approach presents a faster convergence speed compared to typical gradient-sharing methods.
arXiv Detail & Related papers (2023-02-02T12:56:46Z)
Over-the-Air Federated Learning with Privacy Protection via Correlated Additive Perturbations [57.20885629270732]
We consider privacy aspects of wireless federated learning with Over-the-Air (OtA) transmission of gradient updates from multiple users/agents to an edge server. Traditional perturbation-based methods provide privacy protection while sacrificing the training accuracy. In this work, we aim at minimizing privacy leakage to the adversary and the degradation of model accuracy at the edge server.
arXiv Detail & Related papers (2022-10-05T13:13:35Z)
Stochastic Coded Federated Learning with Convergence and Privacy Guarantees [8.2189389638822]
Federated learning (FL) has attracted much attention as a privacy-preserving distributed machine learning framework. This paper proposes a coded federated learning framework, namely stochastic coded federated learning (SCFL), to mitigate the straggler issue. We characterize the privacy guarantee by the mutual information differential privacy (MI-DP) and analyze the convergence performance in federated learning.
arXiv Detail & Related papers (2022-01-25T04:43:29Z)
Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users. An adversary may still be able to infer the private training data by attacking the released model. Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.