Related papers: UNLEARN Efficient Removal of Knowledge in Large Language Models

Related papers

Step-by-Step Reasoning Attack: Revealing 'Erased' Knowledge in Large Language Models [9.719371187651591]
Unlearning techniques suppress and leave the knowledge beneath the surface, thus making it retrievable with the right prompts.<n>We introduce a step-by-step reasoning-based black-box attack, Sleek, that systematically exposes unlearning failures.<n>Of the generated adversarial prompts, 62.5% successfully retrieved forgotten Harry Potter facts from WHP-unlearned Llama, while 50% exposed unfair suppression of retained knowledge.
arXiv Detail & Related papers (2025-06-14T04:22:17Z)
UniErase: Unlearning Token as a Universal Erasure Primitive for Language Models [54.75551043657238]
We introduce UniErase, a novel unlearning paradigm that employs learnable parametric suffix (unlearning token) to steer language models toward targeted forgetting behaviors.<n>UniErase achieves state-of-the-art (SOTA) performance across batch, sequential, and precise unlearning under fictitious and real-world knowledge settings.
arXiv Detail & Related papers (2025-05-21T15:53:28Z)
UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets [41.0340052199534]
Large Language Models (LLMs) inevitably acquire harmful information during training on massive datasets. Existing unlearning methods focus on forgetting target data while overlooking the crucial impact of logically related knowledge on the effectiveness of unlearning. We propose Unlearning Improvement via Extrapolation (UIPE), a method that removes knowledge highly correlated with the forgetting targets.
arXiv Detail & Related papers (2025-03-06T18:40:00Z)
KaLM: Knowledge-aligned Autoregressive Language Modeling via Dual-view Knowledge Graph Contrastive Learning [74.21524111840652]
This paper proposes textbfKaLM, a textitKnowledge-aligned Language Modeling approach. It fine-tunes autoregressive large language models to align with KG knowledge via the joint objective of explicit knowledge alignment and implicit knowledge alignment. Notably, our method achieves a significant performance boost in evaluations of knowledge-driven tasks.
arXiv Detail & Related papers (2024-12-06T11:08:24Z)
KBAlign: Efficient Self Adaptation on Specific Knowledge Bases [75.78948575957081]
Large language models (LLMs) usually rely on retrieval-augmented generation to exploit knowledge materials in an instant manner. We propose KBAlign, an approach designed for efficient adaptation to downstream tasks involving knowledge bases. Our method utilizes iterative training with self-annotated data such as Q&A pairs and revision suggestions, enabling the model to grasp the knowledge content efficiently.
arXiv Detail & Related papers (2024-11-22T08:21:03Z)
RESTOR: Knowledge Recovery in Machine Unlearning [71.75834077528305]
Large language models trained on web-scale corpora can contain private or sensitive information.<n>Several machine unlearning algorithms have been proposed to eliminate the effect of such datapoints.<n>We propose the RESTOR framework for machine unlearning evaluation.
arXiv Detail & Related papers (2024-10-31T20:54:35Z)
Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models [51.20499954955646]
Large language models (LLMs) acquire vast amounts of knowledge from extensive text corpora during the pretraining phase. In later stages such as fine-tuning and inference, the model may encounter knowledge not covered in the initial training. We propose a two-stage fine-tuning strategy to improve the model's overall test accuracy and knowledge retention.
arXiv Detail & Related papers (2024-10-08T08:35:16Z)
Towards Robust Knowledge Unlearning: An Adversarial Framework for Assessing and Improving Unlearning Robustness in Large Language Models [19.015202590038996]
We design Dynamic Unlearning Attack (DUA), a dynamic and automated framework to attack unlearned models. We propose Latent Adrial Unlearning (LAU), a universal framework that effectively enhances the robustness of the unlearned process. We demonstrate that LAU improves unlearning effectiveness by over $53.5%$, cause only less than a $11.6%$ reduction in neighboring knowledge, and have almost no impact on the model's general capabilities.
arXiv Detail & Related papers (2024-08-20T09:36:04Z)
Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning [18.283963879468466]
Large language models (LLMs) have demonstrated remarkable capabilities but still face challenges such as hallucinations. We propose a novel approach called uncertainty-sensitive tuning to improve models' capability to recognize the boundaries of their knowledge. Our experimental results demonstrate that our proposed uncertainty-sensitive tuning method enhance the model's ability to identify areas of uncertainty.
arXiv Detail & Related papers (2024-06-14T14:56:04Z)
Towards Effective Evaluations and Comparisons for LLM Unlearning Methods [97.2995389188179]
This paper seeks to refine the evaluation of machine unlearning for large language models. It addresses two key challenges -- the robustness of evaluation metrics and the trade-offs between competing goals.
arXiv Detail & Related papers (2024-06-13T14:41:00Z)
Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning. This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which is to combine multiple knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z)
InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration [61.554209059971576]
Large Language Models (LLMs) have shown remarkable open-generation capabilities across diverse domains. Injecting new knowledge poses the risk of forgetting previously acquired knowledge. We propose a novel Infuser-Guided Knowledge Integration framework.
arXiv Detail & Related papers (2024-02-18T03:36:26Z)
Knowledge Distillation for Road Detection based on cross-model Semi-Supervised Learning [17.690698736544626]
We propose an integrated approach that combines knowledge distillation and semi-supervised learning methods. This hybrid approach leverages the robust capabilities of large models to effectively utilise large unlabelled data. The proposed semi-supervised learning-based knowledge distillation (SSLKD) approach demonstrates a notable improvement in the performance of the student model.
arXiv Detail & Related papers (2024-02-07T22:50:47Z)
Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models [53.52344131257681]
We propose a new paradigm for fine-tuning called F-Learning, which employs parametric arithmetic to facilitate the forgetting of old knowledge and learning of new knowledge. Experimental results on two publicly available datasets demonstrate that our proposed F-Learning can obviously improve the knowledge updating performance of both full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2023-11-14T09:12:40Z)
Preserving Commonsense Knowledge from Pre-trained Language Models via Causal Inference [20.5696436171006]
Most existing studies attribute it to catastrophic forgetting, and they retain the pre-trained knowledge indiscriminately. We frame fine-tuning into a causal graph and discover that the crux of catastrophic forgetting lies in the missing causal effects from the pretrained data. In the experiments, our method outperforms state-of-the-art fine-tuning methods on all six commonsense QA datasets.
arXiv Detail & Related papers (2023-06-19T09:06:44Z)
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks [90.11273439036455]
Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks. We propose Knowledge-Augmented Reasoning Distillation (KARD), a novel method that fine-tunes small LMs to generate rationales from LLMs with augmented knowledge retrieved from an external knowledge base. We empirically show that KARD significantly improves the performance of small T5 and GPT models on the challenging knowledge-intensive reasoning datasets.
arXiv Detail & Related papers (2023-05-28T13:00:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.