Class Machine Unlearning for Complex Data via Concepts Inference and Data Poisoning
- URL: http://arxiv.org/abs/2405.15662v1
- Date: Fri, 24 May 2024 15:59:17 GMT
- Title: Class Machine Unlearning for Complex Data via Concepts Inference and Data Poisoning
- Authors: Wenhan Chang, Tianqing Zhu, Heng Xu, Wenjian Liu, Wanlei Zhou
- Abstract summary: In the current AI era, users may ask AI companies to delete their data from the training dataset due to privacy concerns.
Machine unlearning is a newly emerged technology that allows a model owner to delete requested training data, or an entire class, with little effect on model performance.
In this paper, we apply the definition of a Concept, rather than an image feature or a text token, to represent the semantic information of the unlearning class.
- Score: 15.364125432212711
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the current AI era, users may request that AI companies delete their data from the training dataset due to privacy concerns. For a model owner, retraining a model consumes significant computational resources, so machine unlearning has emerged as a technology that allows the owner to delete requested training data, or an entire class, with little effect on model performance. However, for large-scale complex data such as images or text, unlearning a class from a model leads to inferior performance because the link between classes and the model is difficult to identify; inaccurate class deletion may cause over- or under-unlearning. In this paper, to accurately define the unlearning class of complex data, we apply the definition of a Concept, rather than an image feature or a text token, to represent the semantic information of the unlearning class. This new representation can cut the link between the model and the class, leading to complete erasure of the class's impact. To analyze the impact of concepts in complex data, we adopt a Post-hoc Concept Bottleneck Model and Integrated Gradients to precisely identify concepts across different classes. Next, we take advantage of data poisoning with random and targeted labels to propose unlearning methods. We test our methods on both image classification models and large language models (LLMs). The results consistently show that the proposed methods can accurately erase targeted information from models while largely maintaining model performance.
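The abstract describes two steps: attributing a class's prediction to concept activations (via a post-hoc concept bottleneck head and Integrated Gradients), and then erasing the class by fine-tuning on poisoned labels. The PyTorch sketch below illustrates both steps under stated assumptions; the `concept_head` interface, function names, and hyper-parameters are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): concept attribution with Integrated
# Gradients, plus label-poisoning fine-tuning for class unlearning.
import torch
import torch.nn.functional as F


def integrated_gradients_on_concepts(concept_head, concepts, target_class,
                                     baseline=None, steps=32):
    """Attribute a class logit to concept activations with Integrated Gradients.

    concept_head: module mapping concept activations -> class logits
                  (e.g. the linear head of a post-hoc concept bottleneck model).
    concepts:     tensor of shape (batch, n_concepts) with concept activations.
    Returns per-concept attributions of shape (batch, n_concepts).
    """
    if baseline is None:
        baseline = torch.zeros_like(concepts)
    avg_grads = torch.zeros_like(concepts)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (concepts - baseline)).detach().requires_grad_(True)
        logit = concept_head(point)[:, target_class].sum()
        avg_grads += torch.autograd.grad(logit, point)[0] / steps
    # Riemann approximation of the Integrated Gradients path integral
    return (concepts - baseline) * avg_grads


def poison_labels(labels, forget_class, num_classes, targeted_to=None):
    """Relabel forget-class samples, either to random classes or a fixed target."""
    poisoned = labels.clone()
    mask = labels == forget_class
    if targeted_to is not None:
        poisoned[mask] = targeted_to
    else:
        rand = torch.randint(0, num_classes - 1, (int(mask.sum()),),
                             device=labels.device)
        rand = rand + (rand >= forget_class).long()  # never draw the forget class
        poisoned[mask] = rand
    return poisoned


def unlearn_by_poisoning(model, loader, forget_class, num_classes,
                         epochs=1, lr=1e-4, device="cpu"):
    """Fine-tune on poisoned labels so the forget class is no longer predictable."""
    model = model.to(device).train()
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            loss = F.cross_entropy(model(x), poison_labels(y, forget_class, num_classes))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```

In this reading, a caller would first rank concepts with `integrated_gradients_on_concepts` to see which semantics the forget class relies on, then run `unlearn_by_poisoning` with either random labels or a fixed target label, mirroring the random and targeted poisoning variants mentioned in the abstract.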
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z) - Update Selective Parameters: Federated Machine Unlearning Based on Model Explanation [46.86767774669831]
We propose a more effective and efficient federated unlearning scheme based on the concept of model explanation.
We select the most influential channels within an already-trained model for the data that need to be unlearned.
arXiv Detail & Related papers (2024-06-18T11:43:20Z) - An Information Theoretic Approach to Machine Unlearning [45.600917449314444]
A key challenge in unlearning is forgetting the necessary data in a timely manner while preserving model performance.
In this work, we address the zero-shot unlearning scenario, whereby an unlearning algorithm must be able to remove data given only a trained model and the data to be forgotten.
We derive a simple but principled zero-shot unlearning method based on the geometry of the model.
arXiv Detail & Related papers (2024-02-02T13:33:30Z) - Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning
Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU).
We provide empirical evidence demonstrating that our unlearning method can produce models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z) - LLM2Loss: Leveraging Language Models for Explainable Model Diagnostics [5.33024001730262]
We propose an approach that can provide semantic insights into a model's patterns of failures and biases.
We show that an ensemble of such lightweight models can be used to generate insights on the performance of the black-box model.
arXiv Detail & Related papers (2023-05-04T23:54:37Z) - Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z) - Interpretable ML for Imbalanced Data [22.355966235617014]
Imbalanced data compounds the black-box nature of deep networks because the relationships between classes may be skewed and unclear.
Existing methods that investigate imbalanced data complexity are geared toward binary classification, shallow learning models and low dimensional data.
We propose a set of techniques that deep learning model users can use to identify, visualize, and understand class prototypes, sub-concepts, and outlier instances.
arXiv Detail & Related papers (2022-12-15T11:50:31Z) - Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first method for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters (a generic update of this form is sketched after this list).
arXiv Detail & Related papers (2021-08-26T04:42:24Z) - Certified Data Removal from Machine Learning Models [79.91502073022602]
Good data stewardship requires removal of data at the request of the data's owner.
This raises the question if and how a trained machine-learning model, which implicitly stores information about its training data, should be affected by such a removal request.
We study this problem by defining certified removal: a very strong theoretical guarantee that a model from which data is removed cannot be distinguished from a model that never observed the data to begin with.
arXiv Detail & Related papers (2019-11-08T03:57:41Z)
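The certified-removal guarantee cited in the last entry can be stated compactly. As a hedged sketch in the usual $\epsilon$-certified-removal notation (not quoting the paper verbatim): for a learning algorithm $A$, a removal mechanism $M$ achieves $\epsilon$-certified removal if, for every measurable set of models $\mathcal{T}$, every dataset $D$, and every point $z \in D$,

\[
e^{-\epsilon} \;\le\; \frac{\Pr\big(M(A(D), D, z) \in \mathcal{T}\big)}{\Pr\big(A(D \setminus \{z\}) \in \mathcal{T}\big)} \;\le\; e^{\epsilon},
\]

i.e. the unlearned model is statistically indistinguishable, up to a factor of $e^{\pm\epsilon}$, from a model retrained without $z$.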
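For the influence-function entry above ("Machine Unlearning of Features and Labels"), the closed-form update it mentions has a generic shape worth spelling out. As a hedged sketch, not necessarily the paper's exact formulation: given parameters $\theta^*$ that minimize the training loss, affected points $Z$, and their corrected or removed versions $\tilde{Z}$, a second-order influence-function update is

\[
\theta_{\text{new}} \;\approx\; \theta^* \;+\; H_{\theta^*}^{-1}\Big(\sum_{z \in Z} \nabla_\theta \ell(z, \theta^*) \;-\; \sum_{\tilde z \in \tilde Z} \nabla_\theta \ell(\tilde z, \theta^*)\Big),
\]

where $H_{\theta^*}$ is the Hessian of the training loss at $\theta^*$; in practice the inverse-Hessian-vector product is typically approximated rather than computed exactly.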