Preserving Commonsense Knowledge from Pre-trained Language Models via
Causal Inference
- URL: http://arxiv.org/abs/2306.10790v1
- Date: Mon, 19 Jun 2023 09:06:44 GMT
- Title: Preserving Commonsense Knowledge from Pre-trained Language Models via
Causal Inference
- Authors: Junhao Zheng, Qianli Ma, Shengjie Qiu, Yue Wu, Peitian Ma, Junlong
Liu, Huawen Feng, Xichen Shang and Haibin Chen
- Abstract summary: Most existing studies attribute the degraded generalization of fine-tuned models to catastrophic forgetting and retain the pre-trained knowledge indiscriminately.
We frame fine-tuning into a causal graph and discover that the crux of catastrophic forgetting lies in the missing causal effects from the pre-trained data.
In the experiments, our method outperforms state-of-the-art fine-tuning methods on all six commonsense QA datasets.
- Score: 20.5696436171006
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fine-tuning has been proven to be a simple and effective technique to
transfer the learned knowledge of Pre-trained Language Models (PLMs) to
downstream tasks. However, vanilla fine-tuning easily overfits the target data
and degrades the generalization ability. Most existing studies attribute it to
catastrophic forgetting, and they retain the pre-trained knowledge
indiscriminately without identifying what knowledge is transferable. Motivated
by this, we frame fine-tuning into a causal graph and discover that the crux of
catastrophic forgetting lies in the missing causal effects from the pretrained
data. Based on the causal view, we propose a unified objective for fine-tuning
to retrieve the causality back. Intriguingly, the unified objective can be seen
as the sum of the vanilla fine-tuning objective, which learns new knowledge
from target data, and the causal objective, which preserves old knowledge from
PLMs. Therefore, our method is flexible and can mitigate negative transfer
while preserving knowledge. Since endowing models with commonsense is a
long-standing challenge, we implement our method on commonsense QA with a
proposed heuristic estimation to verify its effectiveness. In the experiments,
our method outperforms state-of-the-art fine-tuning methods on all six
commonsense QA datasets and can be implemented as a plug-in module to inflate
the performance of existing QA models.
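The abstract frames the unified objective as the sum of the vanilla fine-tuning loss, which learns new knowledge from the target data, and a causal term that preserves knowledge from the PLM. A minimal sketch of that decomposition is given below; the KL-divergence preservation term, the frozen-teacher setup, and the `lam` weight are illustrative assumptions, since the paper's actual causal objective relies on its proposed heuristic estimation rather than a plain KL penalty.

```python
# Minimal sketch of a unified fine-tuning objective: vanilla task loss plus a
# term that keeps the fine-tuned model close to the pre-trained one.
# NOTE: the preservation term is a KL-divergence stand-in, not the paper's
# heuristic causal estimation.
import torch
import torch.nn.functional as F

def unified_loss(student_logits: torch.Tensor,
                 teacher_logits: torch.Tensor,
                 labels: torch.Tensor,
                 lam: float = 0.5) -> torch.Tensor:
    # Vanilla fine-tuning objective: learn new knowledge from the target data.
    task_loss = F.cross_entropy(student_logits, labels)
    # Preservation term (assumption): keep the fine-tuned distribution close
    # to the frozen pre-trained model's distribution on the same batch.
    preserve_loss = F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits.detach(), dim=-1),
        reduction="batchmean",
    )
    return task_loss + lam * preserve_loss
```

Setting lam to zero recovers vanilla fine-tuning, while larger values trade plasticity on the target task for retention of the pre-trained behaviour.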
Related papers
- KBAlign: Efficient Self Adaptation on Specific Knowledge Bases [75.78948575957081]
Large language models (LLMs) usually rely on retrieval-augmented generation to exploit knowledge materials on the fly.
We propose KBAlign, an approach designed for efficient adaptation to downstream tasks involving knowledge bases.
Our method utilizes iterative training with self-annotated data such as Q&A pairs and revision suggestions, enabling the model to grasp the knowledge content efficiently.
arXiv Detail & Related papers (2024-11-22T08:21:03Z)
- Dissecting Fine-Tuning Unlearning in Large Language Models [12.749301272512222]
Fine-tuning-based unlearning methods are widely used to remove harmful, sensitive, or copyrighted information from large language models.
However, their true effectiveness remains unclear.
In this work, we delve into the limitations of fine-tuning-based unlearning through activation patching and restoration experiments.
arXiv Detail & Related papers (2024-10-09T06:58:09Z)
- Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models [51.20499954955646]
Large language models (LLMs) acquire vast amounts of knowledge from extensive text corpora during the pretraining phase.
In later stages such as fine-tuning and inference, the model may encounter knowledge not covered in the initial training.
We propose a two-stage fine-tuning strategy to improve the model's overall test accuracy and knowledge retention.
arXiv Detail & Related papers (2024-10-08T08:35:16Z)
- UNLEARN Efficient Removal of Knowledge in Large Language Models [1.9797215742507548]
This paper proposes a novel method, called UNLEARN, for removing targeted knowledge from large language models.
The approach builds upon subspace methods to identify and specifically target the removal of knowledge without adversely affecting other knowledge in the LLM.
Results demonstrate 96% of targeted knowledge can be forgotten while maintaining performance on other knowledge within 2.5% of the original model.
arXiv Detail & Related papers (2024-08-08T00:53:31Z)
- Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has conventionally been believed to be a challenging property to encode into neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z)
- Decoupling the Class Label and the Target Concept in Machine Unlearning [81.69857244976123]
Machine unlearning aims to adjust a trained model to approximate a retrained one that excludes a portion of training data.
Previous studies showed that class-wise unlearning is successful in forgetting the knowledge of a target class.
We propose a general framework, namely TARget-aware Forgetting (TARF).
arXiv Detail & Related papers (2024-06-12T14:53:30Z)
- R-Tuning: Instructing Large Language Models to Say 'I Don't Know' [66.11375475253007]
Large language models (LLMs) have revolutionized numerous domains with their impressive performance but still face challenges.
Previous instruction tuning methods force the model to complete a sentence regardless of whether it actually possesses the relevant knowledge.
We present a new approach called Refusal-Aware Instruction Tuning (R-Tuning).
Experimental results demonstrate R-Tuning effectively improves a model's ability to answer known questions and refrain from answering unknown questions.
arXiv Detail & Related papers (2023-11-16T08:45:44Z)
- Towards Causal Foundation Model: on Duality between Causal Inference and Attention [18.046388712804042]
We take a first step towards building causally-aware foundation models for treatment effect estimation.
We propose a novel, theoretically justified method called Causal Inference with Attention (CInA).
arXiv Detail & Related papers (2023-10-01T22:28:34Z)
- Principled Knowledge Extrapolation with GANs [92.62635018136476]
We study counterfactual synthesis from a new perspective of knowledge extrapolation.
We show that an adversarial game with a closed-form discriminator can be used to address the knowledge extrapolation problem.
Our method enjoys both elegant theoretical guarantees and superior performance in many scenarios.
arXiv Detail & Related papers (2022-05-21T08:39:42Z)
- Self-Supervised Learning Aided Class-Incremental Lifelong Learning [17.151579393716958]
We study the issue of catastrophic forgetting in class-incremental learning (Class-IL).
During Class-IL training, the model has no knowledge of future tasks and therefore extracts only the features needed for the tasks learned so far, which are insufficient for joint classification.
We propose to combine self-supervised learning, which can provide effective representations without requiring labels, with Class-IL to partly get around this problem.
arXiv Detail & Related papers (2020-06-10T15:15:27Z)