Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks
- URL: http://arxiv.org/abs/2112.14232v1
- Date: Tue, 28 Dec 2021 17:36:58 GMT
- Title: Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks
- Authors: Weiran Lin, Keane Lucas, Lujo Bauer, Michael K. Reiter and Mahmood Sharif
- Abstract summary: We introduce several innovations that make white-box targeted attacks follow the intuition of the attacker's goal.
First, we propose a new loss function that explicitly captures the goal of targeted attacks.
Second, we propose a new attack method that uses a further developed version of our loss function capturing both the misclassification objective and the $L_{\infty}$ distance limit.
- Score: 19.443306494201334
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Minimal adversarial perturbations added to inputs have been shown to be
effective at fooling deep neural networks. In this paper, we introduce several
innovations that make white-box targeted attacks follow the intuition of the
attacker's goal: to trick the model to assign a higher probability to the
target class than to any other, while staying within a specified distance from
the original input. First, we propose a new loss function that explicitly
captures the goal of targeted attacks, in particular, by using the logits of
all classes instead of just a subset, as is common. We show that Auto-PGD with
this loss function finds more adversarial examples than it does with other
commonly used loss functions. Second, we propose a new attack method that uses
a further developed version of our loss function capturing both the
misclassification objective and the $L_{\infty}$ distance limit $\epsilon$.
This new attack method is relatively 1.5--4.2% more successful on the CIFAR10
dataset and relatively 8.2--14.9% more successful on the ImageNet dataset than
the next best state-of-the-art attack. We confirm using statistical tests that
our attack outperforms state-of-the-art attacks on different datasets and
values of $\epsilon$ and against different defenses.
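The abstract describes two ingredients: a targeted loss that uses the logits of all classes, and an attack that enforces the $L_{\infty}$ limit $\epsilon$. The sketch below only illustrates those two ideas under stated assumptions; it is not the paper's exact constrained-gradient-descent method or loss. It aggregates hinge margins over every non-target logit and applies an iterative sign-gradient step projected into the $\epsilon$-ball. The helper names (`all_logits_targeted_loss`, `linf_targeted_attack`), the PyTorch framework, and the step size, budget, and iteration count are all illustrative assumptions.

```python
# Minimal PGD-style sketch of an L_inf-bounded targeted attack with a loss
# aggregated over all logits.  This is NOT the paper's exact CGD method or
# loss; it only illustrates the two ideas named in the abstract.
import torch


def all_logits_targeted_loss(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Hinge margins summed over every non-target class: the loss is zero exactly
    when the target logit exceeds all other logits (target class is predicted)."""
    target_logit = logits.gather(1, target.unsqueeze(1))       # shape (B, 1)
    margins = torch.clamp(logits - target_logit, min=0)        # zero at the target column
    return margins.sum(dim=1).mean()


def linf_targeted_attack(model, x, target, eps=8 / 255, alpha=2 / 255, steps=40):
    """Iterative sign-gradient descent on the loss, kept inside the eps-ball around x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = all_logits_targeted_loss(model(x_adv), target)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() - alpha * grad.sign()                   # descend on the loss
        x_adv = torch.clamp(x_adv, x - eps, x + eps).clamp(0.0, 1.0)   # L_inf and pixel-range projection
    return x_adv
```

A real evaluation would additionally batch over inputs, randomize the starting point, and count success only when the target class receives the highest probability within the $\epsilon$-ball, which is the criterion stated in the abstract.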
Related papers
- Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection [83.72430401516674]
GAKer is able to construct adversarial examples to any target class.
Our method achieves an approximately $14.13\%$ higher attack success rate for unknown classes.
arXiv Detail & Related papers (2024-07-17T03:24:09Z)
- Adversarial Attacks Neutralization via Data Set Randomization [3.655021726150369]
Adversarial attacks on deep learning models pose a serious threat to their reliability and security.
We propose a new defense mechanism that is rooted in hyperspace projection.
We show that our solution increases the robustness of deep learning models against adversarial attacks.
arXiv Detail & Related papers (2023-06-21T10:17:55Z)
- Detection and Mitigation of Byzantine Attacks in Distributed Training [24.951227624475443]
Abnormal Byzantine behavior of the worker nodes can derail the training and compromise the quality of the inference.
Recent work considers a wide range of attack models and has explored robust aggregation and/or computational redundancy to correct the distorted gradients.
In this work, we consider attack models ranging from strong ones ($q$ omniscient adversaries with full knowledge of the defense protocol, which can change from iteration to iteration) to weak ones ($q$ randomly chosen adversaries with limited collusion abilities). A generic sketch of one standard robust-aggregation rule follows this entry.
arXiv Detail & Related papers (2022-08-17T05:49:52Z)
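The Byzantine-attacks entry above mentions robust aggregation as a way to correct distorted gradients. The snippet below is a generic sketch of one standard robust-aggregation rule, coordinate-wise median; it is not necessarily the defense studied in that paper, and the worker counts and gradient shapes are illustrative assumptions.

```python
# Illustrative coordinate-wise median aggregation for distributed training.
# This is a generic robust-aggregation rule, not the specific defense
# studied in the paper summarized above.
import torch


def coordinate_wise_median(worker_grads: list) -> torch.Tensor:
    """Aggregate one gradient tensor per worker by taking the per-coordinate median."""
    stacked = torch.stack(worker_grads, dim=0)   # shape: (num_workers, *param_shape)
    return stacked.median(dim=0).values          # robust to a minority of outliers


# Example: 5 honest workers plus 2 Byzantine workers sending huge gradients.
honest = [torch.randn(10) * 0.1 for _ in range(5)]
byzantine = [torch.full((10,), 1e6) for _ in range(2)]
aggregated = coordinate_wise_median(honest + byzantine)
print(aggregated.abs().max())  # stays near the honest gradient scale
```

Because each coordinate's median is determined by the honest majority, fewer than half of the workers colluding cannot drag any coordinate arbitrarily far, which is the intuition behind this family of defenses.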
- Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., the single sample attack (SSA) and the triggered samples attack (TSA).
arXiv Detail & Related papers (2022-07-25T03:24:58Z)
- Projective Ranking-based GNN Evasion Attacks [52.85890533994233]
Graph neural networks (GNNs) offer promising learning methods for graph-related tasks.
GNNs are at risk of adversarial attacks.
arXiv Detail & Related papers (2022-02-25T21:52:09Z)
- Unreasonable Effectiveness of Last Hidden Layer Activations [0.5156484100374058]
We show that using some widely known activation functions in the output layer of the model with high temperature values has the effect of zeroing out the gradients for both targeted and untargeted attack cases.
We experimentally verified the efficacy of our approach on the MNIST (Digit) and CIFAR10 datasets. A toy illustration of this gradient-masking effect follows this entry.
arXiv Detail & Related papers (2022-02-15T12:02:59Z)
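The preceding entry claims that a high softmax temperature in the output layer drives gradients toward zero for both targeted and untargeted attacks. The toy snippet below is only a sanity check of that general effect, not the paper's model or experimental setup; the small network, input size, and temperature values are arbitrary assumptions.

```python
# Toy check of gradient masking via softmax temperature: dividing logits by a
# large temperature T shrinks the input gradient of the cross-entropy loss,
# which is the signal gradient-based attacks rely on.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(784, 128), torch.nn.ReLU(),
                            torch.nn.Linear(128, 10))
x = torch.rand(1, 784, requires_grad=True)
label = torch.tensor([3])


def input_grad_norm(temperature: float) -> float:
    logits = model(x) / temperature          # temperature-scaled output layer
    loss = F.cross_entropy(logits, label)
    grad, = torch.autograd.grad(loss, x)
    return grad.norm().item()


print(input_grad_norm(1.0))    # ordinary gradient magnitude
print(input_grad_norm(100.0))  # much smaller (roughly by the temperature factor here)
```

The paper's reported MNIST (Digit) and CIFAR10 results refer to its own architectures and settings, not to this toy example.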
- Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where we treat the target label at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
By utilizing the latest technique in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
- GreedyFool: Distortion-Aware Sparse Adversarial Attack [138.55076781355206]
Modern deep neural networks (DNNs) are vulnerable to adversarial samples.
Sparse adversarial samples can fool the target model by only perturbing a few pixels.
We propose a novel two-stage, distortion-aware greedy method dubbed "GreedyFool".
arXiv Detail & Related papers (2020-10-26T17:59:07Z)
- Minimax Defense against Gradient-based Adversarial Attacks [2.4403071643841243]
We introduce a novel approach that uses minimax optimization to foil gradient-based adversarial attacks.
Our minimax defense achieves 98.07% (MNIST-default 98.93%), 73.90% (CIFAR-10-default 83.14%) and 94.54% (TRAFFIC-default 96.97%)
Our Minimax adversarial approach presents a significant shift in defense strategy for neural network classifiers.
arXiv Detail & Related papers (2020-02-04T12:33:13Z)