On Neural Network approximation of ideal adversarial attack and
convergence of adversarial training
- URL: http://arxiv.org/abs/2307.16099v1
- Date: Sun, 30 Jul 2023 01:04:36 GMT
- Title: On Neural Network approximation of ideal adversarial attack and
convergence of adversarial training
- Authors: Rajdeep Haldar and Qifan Song
- Abstract summary: Adversarial attacks are usually expressed in terms of a gradient-based operation on the input data and model.
In this work, we solidify the idea of representing adversarial attacks as a trainable function, without further gradient computation.
- Score: 3.553493344868414
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial attacks are usually expressed in terms of a gradient-based
operation on the input data and model, which results in heavy computation every
time an attack is generated. In this work, we solidify the idea of representing
adversarial attacks as a trainable function, without further gradient
computation. We first motivate that the theoretical best attacks, under proper
conditions, can be represented as smooth piece-wise functions (piece-wise
H\"older functions). Then we obtain an approximation result of such functions
by a neural network. Subsequently, we emulate the ideal attack process by a
neural network and reduce the adversarial training to a mathematical game
between an attack network and a training model (a defense network). We also
obtain convergence rates of adversarial loss in terms of the sample size $n$
for adversarial training in such a setting.
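For reference, a function is $\beta$-H\"older with exponent $\beta \in (0,1]$ when $\|f(x)-f(y)\| \le L\|x-y\|^{\beta}$; the paper's piece-wise smoothness class may impose further conditions on higher-order derivatives. The attack-network/defense-network game described in the abstract can be illustrated with a minimal, hedged sketch: an attack network proposes bounded perturbations and a defense network is trained on the perturbed inputs, with the two objectives alternated. Everything below (architectures, the $\ell_\infty$ budget `eps`, the alternating schedule) is an illustrative assumption, not the authors' construction.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
eps = 0.1  # assumed l_inf perturbation budget

# Defense network f_theta (the model being robustified) and attack network g_phi.
f_theta = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
g_phi = nn.Sequential(nn.Linear(784, 256), nn.ReLU(),
                      nn.Linear(256, 784), nn.Tanh())  # Tanh keeps outputs in [-1, 1]

opt_f = torch.optim.SGD(f_theta.parameters(), lr=1e-2)
opt_g = torch.optim.SGD(g_phi.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

def train_step(x, y):
    # Attack step: g_phi tries to maximize the classification loss of f_theta
    # using a perturbation bounded by eps in the l_inf norm.
    x_adv = x + eps * g_phi(x)
    attack_loss = -loss_fn(f_theta(x_adv), y)  # gradient ascent via negated loss
    opt_g.zero_grad()
    attack_loss.backward()
    opt_g.step()

    # Defense step: f_theta minimizes the loss on the (re-generated, detached) attack.
    with torch.no_grad():
        x_adv = x + eps * g_phi(x)
    defense_loss = loss_fn(f_theta(x_adv), y)
    opt_f.zero_grad()
    defense_loss.backward()
    opt_f.step()
    return defense_loss.item()

# Toy usage with random MNIST-shaped data.
x = torch.rand(32, 784)
y = torch.randint(0, 10, (32,))
print(train_step(x, y))
```

The point of the amortized setup, as stated in the abstract, is that once the attack network is trained no per-example gradient computation is needed to generate an attack; the sketch above only mirrors that structure.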
Related papers
- Fast Propagation is Better: Accelerating Single-Step Adversarial
Training via Sampling Subnetworks [69.54774045493227]
A drawback of adversarial training is the computational overhead introduced by the generation of adversarial examples.
We propose to exploit the interior building blocks of the model to improve efficiency.
Compared with previous methods, our method not only reduces the training cost but also achieves better model robustness.
arXiv Detail & Related papers (2023-10-24T01:36:20Z) - Distributed Adversarial Training to Robustify Deep Neural Networks at
Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z) - Thundernna: a white box adversarial attack [0.0]
We develop a first-order method for attacking neural networks.
Compared with other first-order attacks, our method has a much higher success rate.
arXiv Detail & Related papers (2021-11-24T07:06:21Z) - Defensive Tensorization [113.96183766922393]
We propose defensive tensorization, an adversarial defence technique that leverages a latent high-order factorization of the network.
We empirically demonstrate the effectiveness of our approach on standard image classification benchmarks.
We validate the versatility of our approach across domains and low-precision architectures by considering an audio task and binary networks.
arXiv Detail & Related papers (2021-10-26T17:00:16Z) - FooBaR: Fault Fooling Backdoor Attack on Neural Network Training [5.639451539396458]
We explore a novel attack paradigm by injecting faults during the training phase of a neural network in a way that the resulting network can be attacked during deployment without the necessity of further faulting.
We call such attacks fooling backdoors as the fault attacks at the training phase inject backdoors into the network that allow an attacker to produce fooling inputs.
arXiv Detail & Related papers (2021-09-23T09:43:19Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA), which is trained to automatically align features across arbitrary attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - REGroup: Rank-aggregating Ensemble of Generative Classifiers for Robust
Predictions [6.0162772063289784]
Defense strategies that adopt adversarial training or random input transformations typically require retraining or fine-tuning the model to achieve reasonable performance.
We find that we can learn a generative classifier by statistically characterizing the neural response of an intermediate layer to clean training samples.
Our proposed approach uses a subset of the clean training data and a pre-trained model, and yet is agnostic to network architectures or the adversarial attack generation method.
arXiv Detail & Related papers (2020-06-18T17:07:19Z) - Feature Purification: How Adversarial Training Performs Robust Deep
Learning [66.05472746340142]
We present a principle we call Feature Purification: one cause of the existence of adversarial examples is the accumulation of certain small dense mixtures in the hidden weights during the training process of a neural network.
We present both experiments on the CIFAR-10 dataset to illustrate this principle, and a theoretical result proving that, for certain natural classification tasks, training a two-layer neural network with ReLU activation using randomly initialized gradient descent indeed satisfies this principle.
arXiv Detail & Related papers (2020-05-20T16:56:08Z) - Depth-2 Neural Networks Under a Data-Poisoning Attack [2.105564340986074]
We study the possibility of defending against data-poisoning attacks while training a shallow neural network in a regression setup.
In this work, we focus on supervised learning for a class of depth-2 finite-width neural networks.
arXiv Detail & Related papers (2020-05-04T17:56:15Z) - Towards Achieving Adversarial Robustness by Enforcing Feature
Consistency Across Bit Planes [51.31334977346847]
We train networks to form coarse impressions based on the information in higher bit planes, and use the lower bit planes only to refine their prediction.
We demonstrate that, by imposing consistency on the representations learned across differently quantized images, the adversarial robustness of networks improves significantly (a short bit-plane sketch follows after this list).
arXiv Detail & Related papers (2020-04-01T09:31:10Z)
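As referenced in the bit-plane entry above, here is a minimal, hedged sketch of extracting the higher bit planes of an 8-bit image, i.e. the "coarse impression" that entry mentions; the consistency loss and training procedure of that paper are not reproduced. The function name and the choice of keeping four planes are illustrative assumptions.

```python
import numpy as np

def keep_high_bit_planes(img_uint8: np.ndarray, n_planes: int = 4) -> np.ndarray:
    """Zero out the (8 - n_planes) least significant bits of an 8-bit image,
    leaving only the coarse information carried by the higher bit planes."""
    mask = np.uint8((0xFF << (8 - n_planes)) & 0xFF)  # n_planes=4 -> 0b11110000
    return img_uint8 & mask

# Toy usage: a random 8-bit "image" and its coarse, high-bit-plane version.
img = np.random.randint(0, 256, size=(32, 32), dtype=np.uint8)
coarse = keep_high_bit_planes(img, n_planes=4)
print(img[0, :4], coarse[0, :4])
```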
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.