Oblivionis: A Lightweight Learning and Unlearning Framework for Federated Large Language Models
- URL: http://arxiv.org/abs/2508.08875v2
- Date: Sat, 08 Nov 2025 06:01:37 GMT
- Title: Oblivionis: A Lightweight Learning and Unlearning Framework for Federated Large Language Models
- Authors: Fuyao Zhang, Xinyu Yan, Tiantong Wu, Wenjie Li, Tianxiang Chen, Yang Cao, Ran Yan, Longtao Huang, Wei Yang Bryan Lim, Qiang Yang,
- Abstract summary: Large Language Models (LLMs) increasingly leverage Federated Learning (FL) to utilize private, task-specific datasets for fine-tuning.<n>We introduce Oblivionis, a lightweight learning and unlearning framework that enables clients to selectively remove specific private data.
- Score: 32.059157903736754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) increasingly leverage Federated Learning (FL) to utilize private, task-specific datasets for fine-tuning while preserving data privacy. However, while federated LLM frameworks effectively enable collaborative training without raw data sharing, they critically lack built-in mechanisms for regulatory compliance like GDPR's right to be forgotten. Integrating private data heightens concerns over data quality and long-term governance, yet existing distributed training frameworks offer no principled way to selectively remove specific client contributions post-training. Due to distributed data silos, stringent privacy constraints, and the intricacies of interdependent model aggregation, federated LLM unlearning is significantly more complex than centralized LLM unlearning. To address this gap, we introduce Oblivionis, a lightweight learning and unlearning framework that enables clients to selectively remove specific private data during federated LLM training, enhancing trustworthiness and regulatory compliance. By unifying FL and unlearning as a dual optimization objective, we incorporate 6 FL and 5 unlearning algorithms for comprehensive evaluation and comparative analysis, establishing a robust pipeline for federated LLM unlearning. Extensive experiments demonstrate that Oblivionis outperforms local training, achieving a robust balance between forgetting efficacy and model utility, with cross-algorithm comparisons providing clear directions for future LLM development.
Related papers
- Federated Attention: A Distributed Paradigm for Collaborative LLM Inference over Edge Networks [63.541114376141735]
Large language models (LLMs) are proliferating rapidly at the edge, delivering intelligent capabilities across diverse application scenarios.<n>However, their practical deployment in collaborative scenarios confronts fundamental challenges: privacy vulnerabilities, communication overhead, and computational bottlenecks.<n>We propose Federated Attention (FedAttn), which integrates the federated paradigm into the self-attention mechanism.
arXiv Detail & Related papers (2025-11-04T15:14:58Z) - Federated Learning-Enabled Hybrid Language Models for Communication-Efficient Token Transmission [87.68447072141402]
Hybrid Language Models (HLMs) combine the low-latency efficiency of Small Language Models (SLMs) on edge devices with the high accuracy of Large Language Models (LLMs) on centralized servers.<n>We propose FedHLM, a communication-efficient HLM framework that integrates uncertainty-aware inference with Federated Learning (FL)
arXiv Detail & Related papers (2025-06-30T02:56:11Z) - Federated Fine-Tuning of LLMs: Framework Comparison and Research Directions [59.5243730853157]
Federated learning (FL) provides a privacy-preserving solution for fine-tuning pre-trained large language models (LLMs) using distributed private datasets.<n>This article conducts a comparative analysis of three advanced federated LLM (FedLLM) frameworks that integrate knowledge distillation (KD) and split learning (SL) to mitigate these issues.
arXiv Detail & Related papers (2025-01-08T11:37:06Z) - Federated In-Context LLM Agent Learning [3.4757641432843487]
Large Language Models (LLMs) have revolutionized intelligent services by enabling logical reasoning, tool use, and interaction with external systems as agents.<n>In this paper, we propose a novel privacy-preserving Federated In-context LLM Agent Learning (FICAL) algorithm.<n>The results show that FICAL has competitive performance compared to other SOTA baselines with a significant communication cost decrease of $mathbf3.33times105$ times.
arXiv Detail & Related papers (2024-12-11T03:00:24Z) - eFedLLM: Efficient LLM Inference Based on Federated Learning [1.6179784294541053]
Large Language Models (LLMs) herald a transformative era in artificial intelligence (AI)
This paper introduces an effective approach that enhances the operational efficiency and affordability of LLM inference.
arXiv Detail & Related papers (2024-11-24T22:50:02Z) - A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs [74.35290684163718]
A primary challenge in large language model (LLM) development is their onerous pre-training cost.
This paper explores a promising paradigm to improve LLM pre-training efficiency and quality by leveraging a small language model (SLM)
arXiv Detail & Related papers (2024-10-24T14:31:52Z) - Personalized Wireless Federated Learning for Large Language Models [75.22457544349668]
Large language models (LLMs) have driven profound transformations in wireless networks.<n>Within wireless environments, the training of LLMs faces significant challenges related to security and privacy.<n>This paper presents a systematic analysis of the training stages of LLMs in wireless networks, including pre-training, instruction tuning, and alignment tuning.
arXiv Detail & Related papers (2024-04-20T02:30:21Z) - Differentially Private Low-Rank Adaptation of Large Language Model Using Federated Learning [32.52811740662061]
This article introduces DP-LoRA, a novel federated learning algorithm tailored for large language models (LLMs)
DP-LoRA preserves data privacy by employing a Gaussian mechanism that adds noise in weight updates, maintaining individual data privacy while facilitating collaborative model training.
arXiv Detail & Related papers (2023-12-29T06:50:38Z) - FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large
Language Models in Federated Learning [70.38817963253034]
This paper first discusses these challenges of federated fine-tuning LLMs, and introduces our package FS-LLM as a main contribution.
We provide comprehensive federated parameter-efficient fine-tuning algorithm implementations and versatile programming interfaces for future extension in FL scenarios.
We conduct extensive experiments to validate the effectiveness of FS-LLM and benchmark advanced LLMs with state-of-the-art parameter-efficient fine-tuning algorithms in FL settings.
arXiv Detail & Related papers (2023-09-01T09:40:36Z) - From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning [52.257422715393574]
We introduce a self-guided methodology for Large Language Models (LLMs) to autonomously discern and select cherry samples from open-source datasets.
Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal metric to identify discrepancies between a model's expected responses and its intrinsic generation capability.
arXiv Detail & Related papers (2023-08-23T09:45:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.