KGA: A General Machine Unlearning Framework Based on Knowledge Gap Alignment
- URL: http://arxiv.org/abs/2305.06535v1
- Date: Thu, 11 May 2023 02:44:29 GMT
- Title: KGA: A General Machine Unlearning Framework Based on Knowledge Gap Alignment
- Authors: Lingzhi Wang, Tong Chen, Wei Yuan, Xingshan Zeng, Kam-Fai Wong, Hongzhi Yin
- Abstract summary: We propose a general unlearning framework called KGA to induce forgetfulness.
Experiments on large-scale datasets show that KGA yields comprehensive improvements over baselines.
- Score: 51.15802100354848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent legislation on the "right to be forgotten" has led to interest in
machine unlearning, where learned models are endowed with the ability to forget
information about specific training instances as if they had never existed in the
training set. Previous work mainly focuses on computer vision scenarios and largely
ignores the essentials of unlearning in the NLP field, where text data contains more
explicit and sensitive personal information than images. In this paper, we propose a
general unlearning framework called KGA to induce forgetfulness. Different from
previous work that tries to recover gradients or forces models to perform close to one
specific distribution, KGA maintains distribution differences (i.e., the knowledge
gap), which relaxes the distribution assumption. Furthermore, we are the first to
apply the unlearning method to various NLP tasks (i.e., classification, translation,
response generation) and propose several pertinent unlearning evaluation metrics.
Experiments on large-scale datasets show that KGA yields comprehensive improvements
over baselines, and extensive analyses further validate the effectiveness of KGA and
provide insight into unlearning for NLP tasks.
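One plausible reading of "maintaining the knowledge gap" is: the difference in behaviour between the original model and a model trained on genuinely unseen external data serves as a reference for how a model should treat data it has never seen, and unlearning pushes the gap between the unlearned model and a small model trained only on the forget set toward that reference. The snippet below is a minimal, hedged PyTorch sketch of such an alignment loss, not the authors' implementation; the model and batch names, the assumption that models return logits, and the alpha weight on the retention term are all illustrative.

```python
# Hedged sketch of a knowledge-gap-alignment objective (illustrative names only).
import torch
import torch.nn.functional as F

def output_gap(model_a, model_b, inputs):
    """Average KL divergence between the two models' output distributions on a batch."""
    log_p = F.log_softmax(model_a(inputs), dim=-1)
    log_q = F.log_softmax(model_b(inputs), dim=-1)
    return F.kl_div(log_q, log_p, log_target=True, reduction="batchmean")

def kga_loss(unlearned, model_d, model_f, model_n,
             forget_batch, new_batch, retain_batch, alpha=0.1):
    # Reference gap: how the original model model_d (which never saw the external
    # set D_n) differs from model_n trained on D_n, measured on D_n.
    # Reference models are assumed frozen.
    with torch.no_grad():
        gap_new = output_gap(model_d, model_n, new_batch)
    # Gap to align: how the unlearned model differs from model_f, trained only on
    # the forget set D_f, measured on D_f. Matching the two gaps makes the
    # unlearned model treat D_f as if it were unseen data.
    gap_forget = output_gap(unlearned, model_f, forget_batch)
    align = torch.abs(gap_forget - gap_new)
    # Keep behaviour on retained data close to the original model.
    retain = output_gap(unlearned, model_d, retain_batch)
    return align + alpha * retain
```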
Related papers
- Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle this by tuning VLMs with knowledge distillation on extra datasets, which incurs heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, which retains the pre-trained knowledge of VLMs while adapting to new tasks.
arXiv Detail & Related papers (2024-07-07T12:19:37Z)
- TOFU: A Task of Fictitious Unlearning for LLMs [99.92305790945507]
Large language models trained on massive corpora of data from the web can reproduce sensitive or private data, raising both legal and ethical concerns.
Unlearning, or tuning models to forget information present in their training data, provides us with a way to protect private data after training.
We present TOFU, a benchmark aimed at helping deepen our understanding of unlearning.
arXiv Detail & Related papers (2024-01-11T18:57:12Z)
- Independent Distribution Regularization for Private Graph Embedding [55.24441467292359]
Graph embeddings are susceptible to attribute inference attacks, which allow attackers to infer private node attributes from the learned graph embeddings.
To address these concerns, privacy-preserving graph embedding methods have emerged.
We propose a novel approach called Private Variational Graph AutoEncoders (PVGAE), which uses an independent distribution penalty as a regularization term.
arXiv Detail & Related papers (2023-08-16T13:32:43Z)
- Learning the Finer Things: Bayesian Structure Learning at the Instantiation Level [0.0]
Successful machine learning methods require a trade-off between memorization and generalization.
We present a novel probabilistic graphical model structure learning approach that can learn, generalize and explain in elusive domains.
arXiv Detail & Related papers (2023-03-08T02:31:49Z)
- Self-supervised on Graphs: Contrastive, Generative, or Predictive [25.679620842010422]
Self-supervised learning (SSL) is emerging as a new paradigm for extracting informative knowledge through well-designed pretext tasks.
We divide existing graph SSL methods into three categories: contrastive, generative, and predictive.
We also summarize the commonly used datasets, evaluation metrics, downstream tasks, and open-source implementations of various algorithms.
arXiv Detail & Related papers (2021-05-16T03:30:03Z)
- Few-Shot Incremental Learning with Continually Evolved Classifiers [46.278573301326276]
Few-shot class-incremental learning (FSCIL) aims to design machine learning algorithms that can continually learn new concepts from a few data points.
The difficulty lies in that limited data from new classes not only lead to significant overfitting issues but also exacerbate the notorious catastrophic forgetting problems.
We propose a Continually Evolved Classifier (CEC) that employs a graph model to propagate context information between classifiers for adaptation.
arXiv Detail & Related papers (2021-04-07T10:54:51Z)
- Privileged Knowledge Distillation for Online Action Detection [114.5213840651675]
Online Action Detection (OAD) in videos is formulated as a per-frame labeling task to address real-time prediction.
This paper presents a novel learning-with-privileged-information framework for online action detection, where future frames, observable only at training time, are treated as a form of privileged information.
arXiv Detail & Related papers (2020-11-18T08:52:15Z)
- Active Learning for Skewed Data Sets [25.866341631677688]
We focus on problems with two distinguishing characteristics: severe class imbalance (skew) and small amounts of initial training data.
We propose a hybrid active learning algorithm (HAL) that balances exploiting the knowledge available through the currently labeled training examples with exploring the large amount of unlabeled data.
arXiv Detail & Related papers (2020-05-23T01:50:19Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)