LLM Unlearning Without an Expert Curated Dataset
- URL: http://arxiv.org/abs/2508.06595v3
- Date: Tue, 07 Oct 2025 03:52:31 GMT
- Title: LLM Unlearning Without an Expert Curated Dataset
- Authors: Xiaoyuan Zhu, Muru Zhang, Ollie Liu, Robin Jia, Willie Neiswanger,
- Abstract summary: We introduce a scalable, automated approach to generate high-quality forget sets using language models themselves.<n>Our method synthesizes textbook-style data through a structured prompting pipeline, requiring only a domain name as input.<n>We show that our synthetic datasets consistently outperform the baseline synthetic alternatives and are comparable to the expert-curated ones.
- Score: 37.23853289691218
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern large language models often encode sensitive, harmful, or copyrighted knowledge, raising the need for post-hoc unlearning-the ability to remove specific domains of knowledge from a model without full retraining. A major bottleneck in current unlearning pipelines is constructing effective forget sets-datasets that approximate the target domain and guide the model to forget it. In this work, we introduce a scalable, automated approach to generate high-quality forget sets using language models themselves. Our method synthesizes textbook-style data through a structured prompting pipeline, requiring only a domain name as input. Through experiments on unlearning biosecurity, cybersecurity, and Harry Potter novels, we show that our synthetic datasets consistently outperform the baseline synthetic alternatives and are comparable to the expert-curated ones. Additionally, ablation studies reveal that the multi-step generation pipeline significantly boosts data diversity, which in turn improves unlearning utility. Overall, our findings suggest that synthetic datasets offer a promising path toward practical, scalable unlearning for a wide range of emerging domains without the need for manual intervention. We release our code and dataset at https://github.com/xyzhu123/Synthetic_Textbook.
Related papers
- Meta-Learning and Synthetic Data for Automated Pretraining and Finetuning [2.657867981416885]
Growing number of pretrained models in Machine Learning (ML) presents significant challenges for practitioners.<n>As models grow in scale, the increasing reliance on real-world data poses a bottleneck for training and requires leveraging data more effectively.<n>This dissertation adopts meta-learning to extend automated machine learning to the deep learning domain.
arXiv Detail & Related papers (2025-06-11T12:48:45Z) - RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs [3.41612427812159]
In digital content creation tools, users express their needs through natural language queries that must be mapped to API calls.<n>Existing approaches to synthetic data generation fail to replicate real-world data distributions.<n>We present a novel router-based architecture that generates high-quality synthetic training data.
arXiv Detail & Related papers (2025-05-15T16:53:45Z) - Teaching Dense Retrieval Models to Specialize with Listwise Distillation and LLM Data Augmentation [43.81779293196647]
We show that standard fine-tuning methods can unexpectedly degrade effectiveness rather than improve it, even for domain-specific scenarios.<n>We explore a training strategy that uses listwise distillation from a teacher cross-encoder, leveraging rich relevance signals to fine-tune the retriever.<n>Our results also reveal that synthetic queries can rival human-written queries in training utility.
arXiv Detail & Related papers (2025-02-27T03:07:49Z) - Generate to Discriminate: Expert Routing for Continual Learning [59.71853576559306]
Generate to Discriminate (G2D) is a continual learning method that leverages synthetic data to train a domain-discriminator.<n>We observe that G2D outperforms competitive domain-incremental learning methods on tasks in both vision and language modalities.
arXiv Detail & Related papers (2024-12-22T13:16:28Z) - CodeUnlearn: Amortized Zero-Shot Machine Unlearning in Language Models Using Discrete Concept [5.345828824625758]
We propose a novel amortized unlearning approach using codebook features and Sparse Autoencoders (SAEs)
By leveraging a bottleneck to decompose the activation space and regulate information flow, our method efficiently unlearns targeted information while preserving the model's performance on unrelated data.
arXiv Detail & Related papers (2024-10-08T10:26:22Z) - Synthetic continued pretraining [29.6872772403251]
We propose synthetic continued pretraining on a small corpus of domain-specific documents.
We instantiate this proposal with EntiGraph, a synthetic data augmentation algorithm.
We show how synthetic data augmentation can "rearrange" knowledge to enable more data-efficient learning.
arXiv Detail & Related papers (2024-09-11T17:21:59Z) - EPIC: Effective Prompting for Imbalanced-Class Data Synthesis in Tabular Data Classification via Large Language Models [39.347666307218006]
Large language models (LLMs) have demonstrated remarkable in-context learning capabilities across diverse applications.<n>We introduce EPIC, a novel approach that leverages balanced, grouped data samples and consistent formatting with unique variable mapping to guide LLMs in generating accurate synthetic data across all classes, even for imbalanced datasets.
arXiv Detail & Related papers (2024-04-15T17:49:16Z) - TOFU: A Task of Fictitious Unlearning for LLMs [99.92305790945507]
Large language models trained on massive corpora of data from the web can reproduce sensitive or private data raising both legal and ethical concerns.
Unlearning, or tuning models to forget information present in their training data, provides us with a way to protect private data after training.
We present TOFU, a benchmark aimed at helping deepen our understanding of unlearning.
arXiv Detail & Related papers (2024-01-11T18:57:12Z) - Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning
Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
Challenge is to discard information about the forget'' data without altering knowledge about remaining dataset.
We adopt a projected-gradient based learning method, named as Projected-Gradient Unlearning (PGU)
We provide empirically evidence to demonstrate that our unlearning method can produce models that behave similar to models retrained from scratch across various metrics even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z) - Unsupervised Domain Adaptive Learning via Synthetic Data for Person
Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z) - DomainMix: Learning Generalizable Person Re-Identification Without Human
Annotations [89.78473564527688]
This paper shows how to use labeled synthetic dataset and unlabeled real-world dataset to train a universal model.
In this way, human annotations are no longer required, and it is scalable to large and diverse real-world datasets.
Experimental results show that the proposed annotation-free method is more or less comparable to the counterpart trained with full human annotations.
arXiv Detail & Related papers (2020-11-24T08:15:53Z) - Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework well preserves the relations between samples.
By seeking to embed samples into subspace, we show that our method can address the large-scale and out-of-sample problem.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.