SecureGate: Learning When to Reveal PII Safely via Token-Gated Dual-Adapters for Federated LLMs
- URL: http://arxiv.org/abs/2602.13529v1
- Date: Fri, 13 Feb 2026 23:53:32 GMT
- Title: SecureGate: Learning When to Reveal PII Safely via Token-Gated Dual-Adapters for Federated LLMs
- Authors: Mohamed Shaaban, Mohamed Elmahallawy
- Abstract summary: Federated learning (FL) enables collaborative training across organizational silos without sharing raw data, making it attractive for privacy-sensitive applications. With the rapid adoption of large language models (LLMs), federated fine-tuning of generative LLMs has gained attention as a way to leverage distributed data while preserving confidentiality. We propose SecureGate, a privacy-aware federated fine-tuning framework for LLMs that provides fine-grained privacy control without sacrificing utility.
- Score: 0.4720681957139135
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Federated learning (FL) enables collaborative training across organizational silos without sharing raw data, making it attractive for privacy-sensitive applications. With the rapid adoption of large language models (LLMs), federated fine-tuning of generative LLMs has gained attention as a way to leverage distributed data while preserving confidentiality. However, this setting introduces fundamental challenges: (i) privacy leakage of personally identifiable information (PII) due to LLM memorization, and (ii) a persistent tension between global generalization and local utility under heterogeneous data. Existing defenses, such as data sanitization and differential privacy, reduce leakage but often degrade downstream performance. We propose SecureGate, a privacy-aware federated fine-tuning framework for LLMs that provides fine-grained privacy control without sacrificing utility. SecureGate employs a dual-adapter LoRA architecture: a secure adapter that learns sanitized, globally shareable representations, and a revealing adapter that captures sensitive, organization-specific knowledge. A token-controlled gating module selectively activates these adapters at inference time, enabling controlled information disclosure without retraining. Extensive experiments across multiple LLMs and real-world datasets show that SecureGate improves task utility while substantially reducing PII leakage, achieving up to a 31.66X reduction in inference attack accuracy and a 17.07X reduction in extraction recall for unauthorized requests. Additionally, it maintains 100% routing reliability to the correct adapter and incurs only minimal computational and communication overhead.
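The dual-adapter routing described in the abstract can be illustrated with a small sketch: a frozen linear layer receives exactly one of two low-rank (LoRA-style) updates, chosen by whether the request carries a control token. This is a minimal sketch under stated assumptions, not the authors' implementation. The token string `<REVEAL>`, all function names, the hard 0/1 gate, and the toy weights are hypothetical; the abstract does not specify the gate's internals or the token format.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]
Adapter = Tuple[Matrix, Matrix]  # (A, B) with A: r x d_in, B: d_out x r

# Hypothetical control token; the abstract does not name the actual token.
REVEAL_TOKEN = "<REVEAL>"


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lora_delta(a: Matrix, b: Matrix, x: Vector, scale: float = 1.0) -> Vector:
    """Low-rank update scale * B @ (A @ x)."""
    return [scale * y for y in matvec(b, matvec(a, x))]


def gate(tokens: List[str]) -> float:
    """Return 1.0 to select the revealing adapter, 0.0 for the secure one.

    Assumption: a hard rule on the presence of the control token stands in
    for the gating module described in the abstract.
    """
    return 1.0 if REVEAL_TOKEN in tokens else 0.0


def gated_forward(w0: Matrix, secure: Adapter, revealing: Adapter,
                  x: Vector, tokens: List[str]) -> Vector:
    """Frozen base output plus the update of the single selected adapter."""
    g = gate(tokens)
    base = matvec(w0, x)
    d_secure = lora_delta(*secure, x)
    d_reveal = lora_delta(*revealing, x)
    return [h + (1.0 - g) * s + g * r
            for h, s, r in zip(base, d_secure, d_reveal)]


# Toy weights, chosen only so the two routes give visibly different outputs.
w0 = [[1.0, 0.0], [0.0, 1.0]]            # frozen base layer
secure = ([[1.0, 0.0]], [[0.5], [0.0]])  # adapter meant to be shared globally
revealing = ([[0.0, 1.0]], [[0.0], [0.5]])  # adapter kept by one organization
x = [1.0, 2.0]

print(gated_forward(w0, secure, revealing, x, ["what", "is", "the", "balance"]))
# [1.5, 2.0]
print(gated_forward(w0, secure, revealing, x, [REVEAL_TOKEN, "what", "is", "the", "balance"]))
# [1.0, 3.0]
```

Because the base weights and both adapters stay fixed, switching between the two behaviors only changes the gate value, which matches the abstract's claim that disclosure can be controlled at inference time without retraining.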
Related papers
- Subgraph Federated Learning via Spectral Methods [52.40322201034717]
FedLap is a novel framework that captures inter-node dependencies while ensuring privacy and scalability. We provide a formal analysis of the privacy of FedLap, demonstrating that it preserves privacy.
arXiv Detail & Related papers (2025-10-29T16:22:32Z)
- Enterprise AI Must Enforce Participant-Aware Access Control [9.68210477539956]
Large language models (LLMs) are increasingly deployed in enterprise settings where they interact with multiple users and are trained or fine-tuned on sensitive internal data. We show that adversaries can exploit current fine-tuning and RAG architectures to leak sensitive information by leveraging the lack of access control enforcement. We introduce a framework centered on the principle that any content used in training, retrieval, or generation by an LLM is explicitly authorized for all users involved in the interaction.
arXiv Detail & Related papers (2025-09-18T04:30:49Z)
- Closer to Reality: Practical Semi-Supervised Federated Learning for Foundation Model Adaptation [56.36237936346563]
Foundation models (FMs) exhibit remarkable generalization but require adaptation to downstream tasks. Due to data privacy regulations, cloud-based FMs cannot directly access private edge data. We introduce Practical Semi-Supervised Federated Learning (PSSFL), where edge devices hold only unlabeled, low-resolution data. Our work paves the way for scalable and privacy-preserving FM adaptation in federated scenarios.
arXiv Detail & Related papers (2025-08-22T17:47:02Z)
- Differentially Private Federated Low Rank Adaptation Beyond Fixed-Matrix [15.815684304898575]
Large language models (LLMs) typically require fine-tuning for domain-specific tasks, and LoRA offers a computationally efficient approach by training low-rank adapters. Applying differential privacy (DP) to federated LoRA encounters a dilemma: adding noise to both adapters amplifies synthetic noise on the model, while fixing one adapter impairs the learnability of fine-tuning. We propose FedASK, a novel federated LoRA framework to enable effective updating of both low-rank adapters with robust differential privacy.
arXiv Detail & Related papers (2025-07-14T07:17:24Z)
- Federated Learning-Enabled Hybrid Language Models for Communication-Efficient Token Transmission [87.68447072141402]
Hybrid Language Models (HLMs) combine the low-latency efficiency of Small Language Models (SLMs) on edge devices with the high accuracy of Large Language Models (LLMs) on centralized servers. We propose FedHLM, a communication-efficient HLM framework that integrates uncertainty-aware inference with Federated Learning (FL).
arXiv Detail & Related papers (2025-06-30T02:56:11Z)
- FedShield-LLM: A Secure and Scalable Federated Fine-Tuned Large Language Model [0.48342038441006796]
Federated Learning (FL) offers a decentralized framework for training and fine-tuning Large Language Models (LLMs). FL addresses privacy and security concerns while navigating challenges associated with the substantial computational demands of LLMs. We propose a novel method, FedShield-LLM, that uses pruning with Fully Homomorphic Encryption (FHE) for Low-Rank Adaptation (LoRA) parameters.
arXiv Detail & Related papers (2025-06-06T00:05:05Z)
- FedSEA-LLaMA: A Secure, Efficient and Adaptive Federated Splitting Framework for Large Language Models [13.304846508027588]
We introduce FedSEA-LLaMA, a secure, efficient, and adaptive federated splitting framework based on LLaMA2. We employ attention-mask compression and KV cache collaboration to reduce communication costs, accelerating training and inference. Experiments on natural language understanding, summarization, and conversational QA tasks show that FedSEA-LLaMA maintains performance comparable to centralized LLaMA2.
arXiv Detail & Related papers (2025-05-21T15:58:08Z)
- Privacy-Preserving Federated Embedding Learning for Localized Retrieval-Augmented Generation [60.81109086640437]
We propose a novel framework called Federated Retrieval-Augmented Generation (FedE4RAG). FedE4RAG facilitates collaborative training of client-side RAG retrieval models. We apply homomorphic encryption within federated learning to safeguard model parameters.
arXiv Detail & Related papers (2025-04-27T04:26:02Z)
- Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization [61.02719787737867]
Large language models (LLMs) are increasingly deployed and democratized on edge devices. One promising solution is uncertainty-based SLM routing, which offloads high-stakes queries to stronger LLMs when the SLM produces low-confidence responses. We conduct a comprehensive investigation into benchmarking and generalization of uncertainty-driven routing strategies from SLMs to LLMs over 1500+ settings.
arXiv Detail & Related papers (2025-02-06T18:59:11Z)
- Personalized Wireless Federated Learning for Large Language Models [75.22457544349668]
Large language models (LLMs) have driven profound transformations in wireless networks. Within wireless environments, the training of LLMs faces significant challenges related to security and privacy. This paper presents a systematic analysis of the training stages of LLMs in wireless networks, including pre-training, instruction tuning, and alignment tuning.
arXiv Detail & Related papers (2024-04-20T02:30:21Z)
- Differentially Private Low-Rank Adaptation of Large Language Model Using Federated Learning [32.52811740662061]
This article introduces DP-LoRA, a novel federated learning algorithm tailored for large language models (LLMs). DP-LoRA preserves data privacy by employing a Gaussian mechanism that adds noise in weight updates, maintaining individual data privacy while facilitating collaborative model training.
arXiv Detail & Related papers (2023-12-29T06:50:38Z)