Wasserstein distributional robustness of neural networks
- URL: http://arxiv.org/abs/2306.09844v1
- Date: Fri, 16 Jun 2023 13:41:24 GMT
- Title: Wasserstein distributional robustness of neural networks
- Authors: Xingjian Bai, Guangyi He, Yifan Jiang, Jan Obloj
- Abstract summary: Deep neural networks are known to be vulnerable to adversarial attacks (AA).
For an image recognition task, this means that a small perturbation of the original can result in the image being misclassified.
We re-cast the problem using techniques of Wasserstein distributionally robust optimization (DRO) and obtain novel contributions.
- Score: 9.79503506460041
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks are known to be vulnerable to adversarial attacks (AA).
For an image recognition task, this means that a small perturbation of the
original can result in the image being misclassified. Design of such attacks as
well as methods of adversarial training against them are the subject of intense
research. We re-cast the problem using techniques of Wasserstein
distributionally robust optimization (DRO) and obtain novel contributions
leveraging recent insights from DRO sensitivity analysis. We consider a set of
distributional threat models. Unlike the traditional pointwise attacks, which
assume a uniform bound on perturbation of each input data point, distributional
threat models allow attackers to perturb inputs in a non-uniform way. We link
these more general attacks with questions of out-of-sample performance and
Knightian uncertainty. To evaluate the distributional robustness of neural
networks, we propose a first-order AA algorithm and its multi-step version. Our
attack algorithms include Fast Gradient Sign Method (FGSM) and Projected
Gradient Descent (PGD) as special cases. Furthermore, we provide a new
asymptotic estimate of the adversarial accuracy against distributional threat
models. The bound is fast to compute and first-order accurate, offering new
insights even for the pointwise AA. It also naturally yields out-of-sample
performance guarantees. We conduct numerical experiments on the CIFAR-10
dataset using DNNs on RobustBench to illustrate our theoretical results. Our
code is available at https://github.com/JanObloj/W-DRO-Adversarial-Methods.
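As a point of reference for the special cases mentioned above, here is a minimal PyTorch sketch of the FGSM and PGD pointwise baselines. This is not the authors' W-DRO attack (see the repository above for that); `model`, `eps`, `alpha`, and `steps` are illustrative names.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """Fast Gradient Sign Method: a single l_inf step of size eps."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def pgd(model, x, y, eps, alpha, steps):
    """Projected Gradient Descent: iterated signed gradient steps of size
    alpha, projected back onto the l_inf ball of radius eps around x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # projection onto the l_inf ball
        x_adv = x_adv.clamp(0, 1)                 # keep pixels in a valid range
    return x_adv.detach()
```

The distributional attacks in the paper generalize these baselines by letting the adversary reallocate the perturbation budget across inputs instead of bounding every input uniformly.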
Related papers
- Adversarial Attacks Neutralization via Data Set Randomization [3.655021726150369]
Adversarial attacks on deep learning models pose a serious threat to their reliability and security.
We propose a new defense mechanism rooted in hyperspace projection.
We show that our solution increases the robustness of deep learning models against adversarial attacks.
arXiv Detail & Related papers (2023-06-21T10:17:55Z)
- Detection and Mitigation of Byzantine Attacks in Distributed Training [24.951227624475443]
Abnormal Byzantine behavior of worker nodes can derail training and compromise the quality of inference.
Recent work considers a wide range of attack models and has explored robust aggregation and/or computational redundancy to correct the distorted gradients.
In this work, we consider attack models ranging from strong ones ($q$ omniscient adversaries with full knowledge of the defense protocol, which can change from iteration to iteration) to weak ones ($q$ randomly chosen adversaries with limited collusion abilities); a sketch of the robust-aggregation idea follows below.
arXiv Detail & Related papers (2022-08-17T05:49:52Z)
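Coordinate-wise median is one classical robust-aggregation rule of the kind this line of work studies; the following is a generic sketch of that idea, not the specific defense of the paper.

```python
import torch

def median_aggregate(worker_grads):
    """Coordinate-wise median of the workers' gradient vectors: arbitrary
    vectors from a minority of Byzantine workers cannot drag a coordinate
    outside the range spanned by the honest majority."""
    return torch.stack(worker_grads).median(dim=0).values

# Four honest workers and one Byzantine worker sending garbage:
honest = [torch.tensor([1.0, 2.0, 3.0]) + 0.01 * i for i in range(4)]
byzantine = [torch.tensor([1e6, -1e6, 1e6])]
print(median_aggregate(honest + byzantine))  # stays near [1., 2., 3.]
```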
- Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes: single sample attack (SSA) and triggered samples attack (TSA); a toy bit-flip illustration follows below.
arXiv Detail & Related papers (2022-07-25T03:24:58Z)
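As a toy illustration of the threat model only (the SSA/TSA optimization that selects which bits to flip is the paper's contribution and is not reproduced), flipping a single bit of a deployed int8 weight can change its value drastically; `flip_bit` is a hypothetical helper.

```python
import numpy as np

def flip_bit(weights, index, bit):
    """Flip one bit of an int8 weight in place, mimicking a memory
    bit-flip on a deployed, quantized model."""
    raw = weights.view(np.uint8)        # reinterpret the same bytes
    raw[index] ^= np.uint8(1 << bit)

w = np.array([23, -42], dtype=np.int8)
flip_bit(w, 0, 6)   # 23  -> 87  (bit 6 set: +64)
flip_bit(w, 1, 7)   # -42 -> 86  (sign bit flipped)
print(w)
```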
- Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by perturbing only a few pixels.
Recent efforts combine this sparsity constraint with an additional $l_\infty$ bound on the perturbation magnitudes.
We propose a homotopy algorithm to jointly tackle the sparsity constraint and the perturbation bound in one unified framework; a sketch of the combined constraint set follows below.
arXiv Detail & Related papers (2021-06-10T20:11:36Z)
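To make the combined constraint concrete, here is a sketch of projecting a dense perturbation onto the set of perturbations with at most $k$ nonzero entries, each bounded by eps; the homotopy algorithm itself is not reproduced.

```python
import torch

def project_sparse_linf(delta, k, eps):
    """Keep the k largest-magnitude coordinates of delta (l_0 sparsity),
    zero out the rest, and clip the survivors to [-eps, eps] (l_inf)."""
    flat = delta.flatten()
    keep = flat.abs().topk(k).indices
    out = torch.zeros_like(flat)
    out[keep] = flat[keep].clamp(-eps, eps)
    return out.view_as(delta)

delta = torch.randn(3, 32, 32)                    # dense candidate perturbation
sparse = project_sparse_linf(delta, k=50, eps=8 / 255)
print((sparse != 0).sum().item())                 # at most 50 nonzero entries
```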
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA), which is trained to automatically align features of arbitrary attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU, making it compute-efficient and deployable without requiring specialized accelerators; a generic sketch of the monitoring idea follows below.
arXiv Detail & Related papers (2021-05-30T22:07:13Z)
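The monitoring idea can be sketched generically: record a layer's activations with a forward hook and score new inputs under a density model fitted on clean data. The diagonal Gaussian below is a simple stand-in for DAAIN's normalizing flow, and `ActivationMonitor` is a hypothetical name.

```python
import torch

class ActivationMonitor:
    """Fit a diagonal Gaussian to a layer's activations on clean data;
    low log-density on a new input flags it as OOD or adversarial."""

    def __init__(self, layer):
        self.buf = []
        layer.register_forward_hook(
            lambda mod, inp, out: self.buf.append(out.detach().flatten(1)))

    def fit(self):
        acts = torch.cat(self.buf)          # clean activations collected so far
        self.mean, self.std = acts.mean(0), acts.std(0) + 1e-6
        self.buf.clear()

    def score(self):
        z = (self.buf.pop() - self.mean) / self.std
        return (-0.5 * z ** 2 - self.std.log()).mean(1)  # higher = more normal
```

Usage: run clean batches through the model, call `fit()`, then after each test forward pass call `score()` and threshold it.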
- Towards Adversarial Patch Analysis and Certified Defense against Crowd Counting [61.99564267735242]
Crowd counting has drawn much attention due to its importance in safety-critical surveillance systems.
Recent studies have demonstrated that deep neural network (DNN) methods are vulnerable to adversarial attacks.
We propose a robust attack strategy called Adversarial Patch Attack with Momentum to evaluate the robustness of crowd counting models; a generic momentum-patch sketch follows below.
arXiv Detail & Related papers (2021-04-22T05:10:55Z)
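The momentum ingredient is the familiar accumulated-gradient update (as in MI-FGSM), here restricted to a patch region by a binary mask; a generic sketch, with cross-entropy standing in for a crowd-counting loss.

```python
import torch
import torch.nn.functional as F

def momentum_patch_attack(model, x, y, mask, steps=40, alpha=0.01, mu=0.9):
    """Optimize only the masked patch region, accumulating normalized
    gradients with momentum factor mu (untargeted: maximize the loss)."""
    patch = torch.rand_like(x) * mask
    g = torch.zeros_like(x)
    for _ in range(steps):
        x_adv = (x * (1 - mask) + patch).detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        g = mu * g + grad / (grad.abs().mean() + 1e-12)  # momentum accumulation
        patch = (patch + alpha * g.sign() * mask).clamp(0, 1)
    return x * (1 - mask) + patch
```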
- Improving Transformation-based Defenses against Adversarial Examples with First-order Perturbations [16.346349209014182]
Studies show that neural networks are susceptible to adversarial attacks.
This exposes a potential threat to neural network-based intelligent systems.
We propose a method for counteracting adversarial perturbations to improve adversarial robustness.
arXiv Detail & Related papers (2021-03-08T06:27:24Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
We formulate this as a binary integer programming (BIP) problem and, utilizing the latest techniques in integer programming, equivalently reformulate it as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
- Depth-2 Neural Networks Under a Data-Poisoning Attack [2.105564340986074]
We study the possibility of defending against data-poisoning attacks while training a shallow neural network in a regression setup.
We focus on supervised learning for a class of depth-2, finite-width neural networks.
arXiv Detail & Related papers (2020-05-04T17:56:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.