Game-Theoretic Unlearnable Example Generator
- URL: http://arxiv.org/abs/2401.17523v1
- Date: Wed, 31 Jan 2024 00:43:30 GMT
- Title: Game-Theoretic Unlearnable Example Generator
- Authors: Shuang Liu and Yihan Wang and Xiao-Shan Gao
- Abstract summary: Unlearnable example attacks aim to degrade the clean test accuracy of deep learning by adding imperceptible perturbations to the training samples.
In this paper, we investigate unlearnable example attacks from a game-theoretic perspective, by formulating the attack as a nonzero-sum Stackelberg game.
We propose a novel attack method, called the Game Unlearnable Example (GUE), which has three main ingredients.
- Score: 18.686469222136854
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Unlearnable example attacks are data poisoning attacks aiming to degrade the
clean test accuracy of deep learning by adding imperceptible perturbations to
the training samples, which can be formulated as a bi-level optimization
problem. However, directly solving this optimization problem is intractable for
deep neural networks. In this paper, we investigate unlearnable example attacks
from a game-theoretic perspective, by formulating the attack as a nonzero-sum
Stackelberg game. First, the existence of game equilibria is proved under the
normal setting and the adversarial training setting. It is shown that the game
equilibrium gives the most powerful poison attack in that the victim has the
lowest test accuracy among all networks within the same hypothesis space, when
certain loss functions are used. Second, we propose a novel attack method,
called the Game Unlearnable Example (GUE), which has three main ingredients. (1)
The poisons are obtained by directly solving the equilibrium of the Stackelberg
game with a first-order algorithm. (2) We employ an autoencoder-like generative
network model as the poison attacker. (3) A novel payoff function is introduced
to evaluate the performance of the poison. Comprehensive experiments
demonstrate that GUE can effectively poison the model in various scenarios.
Furthermore, the GUE still works by using a relatively small percentage of the
training data to train the generator, and the poison generator can generalize
to unseen data well. Our implementation code can be found at
https://github.com/hong-xian/gue.
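A rough illustration of the formulation above: the bi-level problem can be written in standard notation, with the outer level degrading clean test accuracy and the inner level being the victim's ordinary training on poisoned data. This is a reconstruction from the abstract, not the paper's exact objective; all symbols are illustrative.

```latex
\max_{\|\delta_i\|_\infty \le \epsilon}\;
  \mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell\big(f_{\theta^*(\delta)}(x),\, y\big)\big]
\quad \text{s.t.} \quad
\theta^*(\delta) \in \arg\min_{\theta} \frac{1}{n}\sum_{i=1}^{n}
  \ell\big(f_{\theta}(x_i+\delta_i),\, y_i\big)
```

A minimal PyTorch sketch of the alternating first-order scheme, with an autoencoder-like generator as the leader and the victim classifier as the follower. All names, the payoff choice, the architecture, and the budget EPS are assumptions for illustration and are not taken from the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

EPS = 8 / 255  # assumed L_inf budget for the imperceptible perturbation

class PerturbationGenerator(nn.Module):
    """Autoencoder-like network mapping a clean image to a bounded poison."""
    def __init__(self, channels=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Conv2d(32, channels, 3, padding=1)

    def forward(self, x):
        # tanh keeps the perturbation inside the L_inf ball of radius EPS
        return EPS * torch.tanh(self.decoder(self.encoder(x)))

def stackelberg_step(generator, victim, g_opt, v_opt, x, y):
    """One round of alternating first-order updates (illustrative).

    The follower (victim classifier) minimizes its training loss on the
    poisoned batch; the leader (generator) then takes a gradient step on a
    payoff that rewards poisons the victim fits easily while clean data
    stays hard to fit.
    """
    # Follower update: train the victim on the current poisoned batch.
    poisoned = (x + generator(x).detach()).clamp(0, 1)
    v_opt.zero_grad()
    F.cross_entropy(victim(poisoned), y).backward()
    v_opt.step()

    # Leader update with one illustrative payoff choice (not the paper's):
    # low loss on poisoned inputs, high loss on clean inputs.
    poisoned = (x + generator(x)).clamp(0, 1)
    payoff = F.cross_entropy(victim(poisoned), y) - F.cross_entropy(victim(x), y)
    g_opt.zero_grad()
    payoff.backward()
    g_opt.step()
    return payoff.item()
```

In such a sketch the victim could be any image classifier (e.g., a small ResNet), `stackelberg_step` would be called over mini-batches until the payoff stabilizes, and the poisons handed to the victim would be `x + generator(x)` for each training image.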
Related papers
- EvA: Evolutionary Attacks on Graphs [50.13398588415462]
Even a slight perturbation of the graph structure can cause a significant drop in the accuracy of graph neural networks (GNNs).
We introduce a few simple yet effective enhancements of an evolutionary-based algorithm to solve the discrete optimization problem directly.
In our experiments, EvA shows an additional ~11% drop in accuracy on average compared to the best previous attack.
arXiv Detail & Related papers (2025-07-10T22:50:58Z) - Krait: A Backdoor Attack Against Graph Prompt Tuning [11.541661336266703]
Graph prompt tuning has emerged as a promising paradigm to effectively transfer general graph knowledge from pre-trained models to downstream tasks.
We conduct the first study to investigate such vulnerability, revealing that backdoors can disguise benign graph prompts, thus evading detection.
We introduce Krait, a novel graph prompt backdoor. Specifically, we propose a simple yet effective model-agnostic metric called label non-uniformity homophily to select poisoned candidates.
arXiv Detail & Related papers (2024-07-18T00:25:49Z) - Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks [11.390175856652856]
Clean-label attacks are a more stealthy form of backdoor attacks that can perform the attack without changing the labels of poisoned data.
We study different strategies for selectively poisoning a small set of training samples in the target class to boost the attack success rate.
This threat model poses a serious risk when training machine learning models on third-party datasets.
arXiv Detail & Related papers (2024-07-15T15:38:21Z) - Few-shot Backdoor Attacks via Neural Tangent Kernels [31.85706783674533]
In a backdoor attack, an attacker injects corrupted examples into the training set.
Central to these attacks is the trade-off between the success rate of the attack and the number of corrupted training examples injected.
We use neural tangent kernels to approximate the training dynamics of the model being attacked and automatically learn strong poison examples.
arXiv Detail & Related papers (2022-10-12T05:30:00Z) - Detection and Mitigation of Byzantine Attacks in Distributed Training [24.951227624475443]
Abnormal Byzantine behavior of the worker nodes can derail the training and compromise the quality of inference.
Recent work considers a wide range of attack models and has explored robust aggregation and/or computational redundancy to correct the distorted gradients.
In this work, we consider attack models ranging from strong ones, $q$ omniscient adversaries with full knowledge of the defense protocol that can change from iteration to iteration, to weak ones, $q$ randomly chosen adversaries with limited collusion abilities.
arXiv Detail & Related papers (2022-08-17T05:49:52Z) - PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning [69.70602220716718]
We propose PoisonedEncoder, a data poisoning attack to contrastive learning.
In particular, an attacker injects carefully crafted poisoning inputs into the unlabeled pre-training data.
We evaluate five defenses against PoisonedEncoder, including one pre-processing defense, three in-processing defenses, and one post-processing defense.
arXiv Detail & Related papers (2022-05-13T00:15:44Z) - Poisoning Attack against Estimating from Pairwise Comparisons [140.9033911097995]
Attackers have strong motivation and incentives to manipulate the ranking list.
Data poisoning attacks on pairwise ranking algorithms can be formalized as the dynamic and static games between the ranker and the attacker.
We propose two efficient poisoning attack algorithms and establish the associated theoretical guarantees.
arXiv Detail & Related papers (2021-07-05T08:16:01Z) - Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data Poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label" (a minimal sketch of the gradient-matching idea appears after this list).
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
arXiv Detail & Related papers (2020-09-04T16:17:54Z) - Anomaly Detection-Based Unknown Face Presentation Attack Detection [74.4918294453537]
Anomaly detection-based spoof attack detection is a recent development in face Presentation Attack Detection.
In this paper, we present a deep-learning solution for anomaly detection-based spoof attack detection.
The proposed approach benefits from the representation learning power of CNNs and learns better features for the face presentation attack detection (fPAD) task.
arXiv Detail & Related papers (2020-07-11T21:20:55Z) - Depth-2 Neural Networks Under a Data-Poisoning Attack [2.105564340986074]
We study the possibility of defending against data-poisoning attacks while training a shallow neural network in a regression setup.
In this work, we focus on doing supervised learning for a class of depth-2 finite-width neural networks.
arXiv Detail & Related papers (2020-05-04T17:56:15Z) - Weight Poisoning Attacks on Pre-trained Models [103.19413805873585]
We show that it is possible to construct "weight poisoning" attacks where pre-trained weights are injected with vulnerabilities that expose "backdoors" after fine-tuning.
Our experiments on sentiment classification, toxicity detection, and spam detection show that this attack is widely applicable and poses a serious threat.
arXiv Detail & Related papers (2020-04-14T16:51:42Z) - MetaPoison: Practical General-purpose Clean-label Data Poisoning [58.13959698513719]
Data poisoning is an emerging threat in the context of neural networks.
We propose MetaPoison, a first-order method that approximates the bilevel problem via meta-learning and crafts poisons that fool neural networks.
We demonstrate for the first time successful data poisoning of models trained on the black-box Google Cloud AutoML API.
arXiv Detail & Related papers (2020-04-01T04:23:20Z)
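For the gradient-matching entry above ("Witches' Brew"), the core objective can be sketched as follows. This is a hypothetical reconstruction from the abstract, not the authors' code; the function and argument names are assumptions, and in practice the poisoned inputs would be leaf tensors updated against this loss and projected back into a small L_inf budget.

```python
import torch
import torch.nn.functional as F

def gradient_matching_loss(model, poisoned_x, poisoned_y, target_x, adv_y):
    """Negative cosine similarity between the gradient the victim would see
    when training on the poisoned batch and the attacker's goal gradient
    (misclassifying the target sample as the adversarial label)."""
    params = [p for p in model.parameters() if p.requires_grad]

    # Gradient induced by the (clean-label) poisoned batch.
    poison_grad = torch.autograd.grad(
        F.cross_entropy(model(poisoned_x), poisoned_y), params, create_graph=True)

    # Gradient of the attacker's goal, treated as a constant direction.
    target_grad = torch.autograd.grad(
        F.cross_entropy(model(target_x), adv_y), params)

    sims = [F.cosine_similarity(pg.flatten(), tg.flatten(), dim=0)
            for pg, tg in zip(poison_grad, target_grad)]
    return -torch.stack(sims).mean()
```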
This list is automatically generated from the titles and abstracts of the papers on this site.