A Survey on Unlearning in Large Language Models
- URL: http://arxiv.org/abs/2510.25117v1
- Date: Wed, 29 Oct 2025 02:34:17 GMT
- Title: A Survey on Unlearning in Large Language Models
- Authors: Ruichen Qiu, Jiajun Tan, Jiayue Pu, Honglin Wang, Xiao-Shan Gao, Fei Sun,
- Abstract summary: Large Language Models (LLMs) have revolutionized natural language processing, yet their training on massive corpora poses significant risks. To mitigate these issues and align with legal and ethical standards such as the "right to be forgotten", machine unlearning has emerged as a critical technique. This survey provides a systematic review of over 180 papers on LLM unlearning published since 2021, focusing exclusively on large-scale generative models.
- Score: 18.262778815699345
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advancement of Large Language Models (LLMs) has revolutionized natural language processing, yet their training on massive corpora poses significant risks, including the memorization of sensitive personal data, copyrighted material, and knowledge that could facilitate malicious activities. To mitigate these issues and align with legal and ethical standards such as the "right to be forgotten", machine unlearning has emerged as a critical technique to selectively erase specific knowledge from LLMs without compromising their overall performance. This survey provides a systematic review of over 180 papers on LLM unlearning published since 2021, focusing exclusively on large-scale generative models. Distinct from prior surveys, we introduce novel taxonomies for both unlearning methods and evaluations. We clearly categorize methods into training-time, post-training, and inference-time based on the training stage at which unlearning is applied. For evaluations, we not only systematically compile existing datasets and metrics but also critically analyze their advantages, disadvantages, and applicability, providing practical guidance to the research community. In addition, we discuss key challenges and promising future research directions. Our comprehensive overview aims to inform and guide the ongoing development of secure and reliable LLMs.
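The survey's taxonomy groups methods by the stage at which unlearning is applied (training-time, post-training, inference-time). As a purely illustrative sketch of the post-training family, the toy example below trains a tiny logistic model and then applies gradient ascent on a forget set, a common baseline in the unlearning literature, to raise the model's loss on the data to be forgotten. The model, data, and hyperparameters are hypothetical stand-ins for illustration only, not the survey's method.

```python
import numpy as np

# Toy logistic-regression "model" illustrating post-training unlearning via
# gradient ascent on a forget set. Real LLM unlearning operates on transformer
# weights; everything here is a deliberately simplified stand-in.

rng = np.random.default_rng(0)

def sigmoid(z):
    # Clip to avoid overflow warnings in np.exp for extreme logits.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def loss(w, X, y):
    # Binary cross-entropy with a small epsilon for numerical stability.
    p = sigmoid(X @ w)
    eps = 1e-9
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def grad(w, X, y):
    # Gradient of the cross-entropy loss w.r.t. the weights.
    return X.T @ (sigmoid(X @ w) - y) / len(y)

# Hypothetical data: a retain set and a forget set from two separable clusters.
X_retain = rng.normal(loc=1.0, size=(100, 2)); y_retain = np.ones(100)
X_forget = rng.normal(loc=-1.0, size=(20, 2)); y_forget = np.zeros(20)
X = np.vstack([X_retain, X_forget]); y = np.concatenate([y_retain, y_forget])

# 1) Standard training on all data (gradient descent).
w = np.zeros(2)
for _ in range(200):
    w -= 0.5 * grad(w, X, y)

loss_before = loss(w, X_forget, y_forget)

# 2) Post-training unlearning: gradient *ascent* on the forget set only,
#    deliberately increasing the model's loss on the data to be forgotten.
for _ in range(20):
    w += 0.1 * grad(w, X_forget, y_forget)

loss_after = loss(w, X_forget, y_forget)
print(f"forget-set loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

In practice, gradient ascent alone degrades overall utility, which is why the methods surveyed typically pair it with a retain-set regularizer; this sketch omits that for brevity.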
Related papers
- Unlearning in LLMs: Methods, Evaluation, and Open Challenges [7.530890774798437]
Machine unlearning has emerged as a promising paradigm for selectively removing knowledge or data from trained models without full retraining. This paper aims to serve as a roadmap for developing reliable and responsible unlearning techniques in large language models.
arXiv Detail & Related papers (2026-01-19T17:58:26Z)
- Does Machine Unlearning Truly Remove Knowledge? [80.83986295685128]
We introduce a comprehensive auditing framework for unlearning evaluation comprising three benchmark datasets, six unlearning algorithms, and five prompt-based auditing methods. We evaluate the effectiveness and robustness of different unlearning strategies.
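At its simplest, a prompt-based audit of the kind this paper describes probes an unlearned model with paraphrased prompts and flags any completion that still reveals the supposedly forgotten knowledge. The sketch below uses a mock model, a hypothetical target fact, and made-up probes; none of the names reflect the paper's actual framework.

```python
# Minimal prompt-based unlearning audit (all names and data are hypothetical).
FORGOTTEN_FACT = "Paris"  # target the model should no longer be able to emit

def mock_model(prompt: str) -> str:
    # Stand-in for a real LLM; here it simulates a model that failed to unlearn.
    return "The capital of France is Paris."

# Paraphrased probes: robustness checks against simple prompt rewording.
probes = [
    "What is the capital of France?",
    "Complete: The capital of France is",
    "Name the French capital city.",
]

# A probe "leaks" if the forgotten fact appears in the model's completion.
leaks = [p for p in probes if FORGOTTEN_FACT in mock_model(p)]
audit_passed = len(leaks) == 0
print(f"audit passed: {audit_passed}, leaking probes: {len(leaks)}")
```

Real audits replace the substring check with stronger extraction tests (e.g., likelihood-based or adversarial probing), since surface-level string matching misses paraphrased leakage.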
arXiv Detail & Related papers (2025-05-29T09:19:07Z)
- LLM Post-Training: A Deep Dive into Reasoning Large Language Models [131.10969986056]
Large Language Models (LLMs) have transformed the natural language processing landscape and brought to life diverse applications. Post-training methods enable LLMs to refine their knowledge, improve reasoning, enhance factual accuracy, and align more effectively with user intents and ethical considerations.
arXiv Detail & Related papers (2025-02-28T18:59:54Z)
- A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models [35.893819613585315]
This study investigates machine unlearning techniques within the context of large language models (LLMs). LLM unlearning offers a principled approach to removing the influence of undesirable data from LLMs. Despite growing research interest, there is no comprehensive survey that systematically organizes existing work and distills key insights.
arXiv Detail & Related papers (2025-02-22T12:46:14Z)
- Recent Advances in Federated Learning Driven Large Language Models: A Survey on Architecture, Performance, and Security [24.969739515876515]
Federated Learning (FL) offers a promising paradigm for training Large Language Models (LLMs) in a decentralized manner while preserving data privacy and minimizing communication overhead. We review a range of strategies enabling unlearning in federated LLMs, including perturbation-based methods, model decomposition, and incremental retraining. This survey identifies critical research directions toward developing secure, adaptable, and high-performing federated LLM systems for real-world deployment.
arXiv Detail & Related papers (2024-06-14T08:40:58Z)
- Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
Our investigation exposes a critical oversight in this belief.
By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z)
- Machine Unlearning for Traditional Models and Large Language Models: A Short Survey [11.539080008361662]
Machine unlearning aims to delete data and reduce its impact on models according to user requests.
This paper categorizes and investigates unlearning on both traditional models and Large Language Models (LLMs).
arXiv Detail & Related papers (2024-04-01T16:08:18Z)
- The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z)
- Rethinking Machine Unlearning for Large Language Models [85.92660644100582]
We explore machine unlearning in the domain of large language models (LLMs). This initiative aims to eliminate undesirable data influence (e.g., sensitive or illegal information) and the associated model capabilities.
arXiv Detail & Related papers (2024-02-13T20:51:58Z)
- Continual Learning for Large Language Models: A Survey [95.79977915131145]
Large language models (LLMs) are not amenable to frequent re-training, due to high training costs arising from their massive scale.
This paper surveys recent works on continual learning for LLMs.
arXiv Detail & Related papers (2024-02-02T12:34:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.