Detection and Defense of Unlearnable Examples
- URL: http://arxiv.org/abs/2312.08898v1
- Date: Thu, 14 Dec 2023 12:59:20 GMT
- Title: Detection and Defense of Unlearnable Examples
- Authors: Yifan Zhu and Lijia Yu and Xiao-Shan Gao
- Abstract summary: We provide theoretical results on linear separability of certain unlearnable poisoned dataset and simple network based detection methods.
We propose using stronger data augmentations coupled with adversarial noises generated by simple networks, to degrade the detectability.
- Score: 13.381207783432428
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Privacy preservation has become increasingly critical with the emergence of
social media. Unlearnable examples have been proposed to prevent personal
information from leaking on the Internet by degrading the generalization ability of
deep learning models. However, our study reveals that unlearnable examples are
easily detectable. We provide theoretical results on the linear separability of
certain unlearnable poisoned datasets, together with simple-network-based detection
methods that can identify all existing unlearnable examples, as demonstrated by
extensive experiments. The detectability of unlearnable examples with simple
networks motivates us to design a novel defense method. We propose using
stronger data augmentations coupled with adversarial noise generated by simple
networks to degrade detectability and thus provide an effective defense
against unlearnable examples at a lower cost. Adversarial training with large
budgets is a widely used defense against unlearnable examples. We establish
quantitative criteria relating the poison and adversarial budgets that
determine whether robust unlearnable examples exist or the adversarial defense fails.
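To make the detection idea concrete, here is a minimal, hedged sketch (not the authors' code): it builds synthetic data with class-wise additive noise, the structure for which linear separability is argued, and shows that a plain logistic-regression probe (standing in for the paper's "simple networks") fits the poisoned set almost perfectly while struggling on a clean counterpart. The dataset sizes, noise scale, and detection threshold below are illustrative assumptions.

```python
# A minimal, self-contained sketch (not the authors' code) of the detection idea:
# class-wise unlearnable perturbations add a fixed offset per class, which tends to
# make the poisoned dataset (nearly) linearly separable, so even a linear probe
# fits it suspiciously well.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_per_class, dim, n_classes = 500, 128, 10

# "Clean" stand-in data: heavily overlapping Gaussians, hard for a linear model.
X_clean = rng.normal(size=(n_per_class * n_classes, dim))
y = np.repeat(np.arange(n_classes), n_per_class)

# Class-wise unlearnable noise: one small fixed perturbation per class
# (the additive, class-wise structure that the linear-separability argument exploits).
deltas = 0.5 * rng.normal(size=(n_classes, dim))
X_poisoned = X_clean + deltas[y]

def linear_probe_train_acc(X, labels):
    """Training accuracy of a simple linear classifier fit on the suspect data."""
    clf = LogisticRegression(max_iter=2000).fit(X, labels)
    return clf.score(X, labels)

acc_clean = linear_probe_train_acc(X_clean, y)
acc_poisoned = linear_probe_train_acc(X_poisoned, y)
print(f"linear-probe train accuracy, clean-like data:     {acc_clean:.3f}")
print(f"linear-probe train accuracy, class-wise poisoned: {acc_poisoned:.3f}")

# Detection heuristic (an assumption, not the paper's exact criterion): a near-perfect
# fit by such a weak model on real image data is a red flag for unlearnable noise.
if acc_poisoned > 0.95:
    print("Suspiciously easy for a linear model -> dataset may contain unlearnable noise.")
```

In practice the clean counterpart is unavailable; the contrast above is only meant to illustrate that an implausibly easy fit by a very weak model is the kind of detection signal the paper builds on.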
Related papers
- Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
However, backdoor attacks subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z) - Inexact Unlearning Needs More Careful Evaluations to Avoid a False Sense of Privacy [45.413801663923564]
We discuss adaptations of Membership Inference Attacks (MIAs) to the setting of unlearning.
We show that the commonly used U-MIAs in the unlearning literature overestimate the privacy protection afforded by existing unlearning techniques on both vision and language models.
arXiv Detail & Related papers (2024-03-02T14:22:40Z) - What Can We Learn from Unlearnable Datasets? [107.12337511216228]
Unlearnable datasets have the potential to protect data privacy by preventing deep neural networks from generalizing.
It is widely believed that neural networks trained on unlearnable datasets only learn shortcuts, simpler rules that are not useful for generalization.
In contrast, we find that networks actually can learn useful features that can be reweighted for high test performance, suggesting that image protection is not assured.
arXiv Detail & Related papers (2023-05-30T17:41:35Z) - Adversarial Examples Detection with Enhanced Image Difference Features based on Local Histogram Equalization [20.132066800052712]
We propose an adversarial example detection framework based on a high-frequency information enhancement strategy.
This framework can effectively extract and amplify the feature differences between adversarial examples and normal examples.
arXiv Detail & Related papers (2023-05-08T03:14:01Z) - Identifying Adversarially Attackable and Robust Samples [1.4213973379473654]
Adversarial attacks insert small, imperceptible perturbations into input samples that cause large, undesired changes to the output of deep learning models.
This work introduces the notion of sample attackability, where we aim to identify samples that are most susceptible to adversarial attacks.
We propose a deep-learning-based detector to identify the adversarially attackable and robust samples in an unseen dataset for an unseen target model.
arXiv Detail & Related papers (2023-01-30T13:58:14Z) - TREATED:Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z) - Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge limiting the widespread adoption of deep learning has been its fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z) - Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z) - Adversarial Examples for Unsupervised Machine Learning Models [71.81480647638529]
Adversarial examples causing evasive predictions are widely used to evaluate and improve the robustness of machine learning models.
We propose a framework of generating adversarial examples for unsupervised models and demonstrate novel applications to data augmentation.
arXiv Detail & Related papers (2021-03-02T17:47:58Z) - Are Adversarial Examples Created Equal? A Learnable Weighted Minimax Risk for Robustness under Non-uniform Attacks [70.11599738647963]
Adversarial Training is one of the few defenses that withstand strong attacks.
Traditional defense mechanisms assume a uniform attack over the examples according to the underlying data distribution.
We present a weighted minimax risk optimization that defends against non-uniform attacks; a generic form of such an objective is sketched after this list.
arXiv Detail & Related papers (2020-10-24T21:20:35Z)
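As a hedged aside on the last entry above, one common way to write a weighted minimax adversarial risk (a generic form, not necessarily that paper's exact formulation) lets an inner maximization choose the worst-case weighting of per-example adversarial losses over the probability simplex; the symbols $f_\theta$, $\ell$, $\Delta_n$, $\delta_i$, and $\epsilon$ are assumed notation.

```latex
% Illustrative weighted minimax adversarial risk (generic form, not the paper's exact objective):
% theta = model parameters, w = example weights on the simplex Delta_n,
% delta_i = per-example perturbation bounded by epsilon, ell = loss, f_theta = model.
\min_{\theta}\;\max_{w\in\Delta_n}\;\sum_{i=1}^{n} w_i
  \max_{\|\delta_i\|\le\epsilon}\ell\!\left(f_{\theta}(x_i+\delta_i),\,y_i\right),
\qquad
\Delta_n=\Bigl\{\,w\in\mathbb{R}^n_{\ge 0}\;:\;\textstyle\sum_{i=1}^{n}w_i=1\,\Bigr\}.
```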