Towards Understanding the Dynamics of the First-Order Adversaries
- URL: http://arxiv.org/abs/2010.10650v1
- Date: Tue, 20 Oct 2020 22:20:53 GMT
- Title: Towards Understanding the Dynamics of the First-Order Adversaries
- Authors: Zhun Deng, Hangfeng He, Jiaoyang Huang, Weijie J. Su
- Abstract summary: An acknowledged weakness of neural networks is their vulnerability to adversarial perturbations to the inputs.
One of the most popular defense mechanisms is to alternately maximize the loss over constrained perturbations of the inputs using projected gradient ascent and minimize over the weights.
We investigate the non-concave landscape of the adversaries for a two-layer neural network with a quadratic loss.
- Score: 40.54670072901657
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An acknowledged weakness of neural networks is their vulnerability to
adversarial perturbations to the inputs. To improve the robustness of these
models, one of the most popular defense mechanisms is to alternately maximize
the loss over the constrained perturbations (also called adversaries) on the
inputs using projected gradient ascent and minimize over the weights. In this
paper, we analyze the dynamics of the maximization step towards understanding
the experimentally observed effectiveness of this defense mechanism.
Specifically, we investigate the non-concave landscape of the adversaries for a
two-layer neural network with a quadratic loss. Our main result proves that
projected gradient ascent finds a local maximum of this non-concave problem in
a polynomial number of iterations with high probability. To our knowledge, this
is the first work that provides a convergence analysis of the first-order
adversaries. Moreover, our analysis demonstrates that, in the initial phase of
adversarial training, the scale of the inputs matters in the sense that a
smaller input scale leads to faster convergence of adversarial training and a
"more regular" landscape. Finally, we show that these theoretical findings are
in excellent agreement with a series of experiments.
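As a concrete illustration of the maximization step analyzed in the paper, below is a minimal NumPy sketch of projected gradient ascent on the adversarial loss of a two-layer ReLU network with a quadratic loss. The l_inf constraint set, step size, iteration count, and weight scaling here are illustrative assumptions, not the paper's exact setting.
```python
import numpy as np

def two_layer_net(x, W, a):
    """f(x) = a^T ReLU(W x): a two-layer ReLU network with fixed weights."""
    return a @ np.maximum(W @ x, 0.0)

def quadratic_loss(x, y, W, a):
    """Quadratic loss 0.5 * (f(x) - y)^2 for a scalar target y."""
    return 0.5 * (two_layer_net(x, W, a) - y) ** 2

def loss_grad_x(x, y, W, a):
    """Gradient of the quadratic loss with respect to the input x."""
    z = W @ x
    residual = a @ np.maximum(z, 0.0) - y          # f(x) - y
    return residual * (W.T @ (a * (z > 0.0)))      # chain rule through the ReLU

def pgd_ascent(x0, y, W, a, eps=0.3, step=0.05, iters=100):
    """Projected gradient ascent on the adversarial loss over an l_inf ball
    of radius eps around the clean input x0 (the ball, step size, and
    iteration budget are illustrative choices)."""
    x = x0.copy()
    for _ in range(iters):
        x = x + step * loss_grad_x(x, y, W, a)     # ascent step on the loss
        x = np.clip(x, x0 - eps, x0 + eps)         # project back onto the ball
    return x

# Toy usage: the adversarial input should attain a (locally) higher loss.
rng = np.random.default_rng(0)
d, m = 10, 50
W = rng.normal(size=(m, d)) / np.sqrt(d)
a = rng.normal(size=m) / np.sqrt(m)
x0, y = rng.normal(size=d), 1.0
x_adv = pgd_ascent(x0, y, W, a)
print(quadratic_loss(x0, y, W, a), quadratic_loss(x_adv, y, W, a))
```
In full adversarial training, this inner ascent would alternate with a gradient descent step on the weights W and a; only the inner maximization studied in the paper is sketched here.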
Related papers
- Outliers with Opposing Signals Have an Outsized Effect on Neural Network
Optimization [36.72245290832128]
We identify a new phenomenon in neural network optimization which arises from the interaction of depth and a heavy-tailed structure in natural data.
In particular, it implies a conceptually new cause for progressive sharpening and the edge of stability.
We demonstrate the significant influence of paired groups of outliers in the training data with strong opposing signals.
arXiv Detail & Related papers (2023-11-07T17:43:50Z) - A Survey on Transferability of Adversarial Examples across Deep Neural Networks [53.04734042366312]
Adversarial examples can manipulate machine learning models into making erroneous predictions.
The transferability of adversarial examples enables black-box attacks which circumvent the need for detailed knowledge of the target model.
This survey explores the landscape of the transferability of adversarial examples.
arXiv Detail & Related papers (2023-10-26T17:45:26Z) - Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce a new concept, the adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) boosts intra-class feature similarity and increases inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
arXiv Detail & Related papers (2022-05-02T04:04:23Z) - Second Order Optimization for Adversarial Robustness and
Interpretability [6.700873164609009]
We propose a novel regularizer which incorporates first and second order information via a quadratic approximation to the adversarial loss.
It is shown that using only a single iteration in our regularizer achieves stronger robustness than prior gradient and curvature regularization schemes.
It retains the interesting facet of adversarial training (AT) that networks learn features which are well-aligned with human perception.
arXiv Detail & Related papers (2020-09-10T15:05:14Z) - Vulnerability Under Adversarial Machine Learning: Bias or Variance? [77.30759061082085]
We investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network.
Our analysis sheds light on why deep neural networks have poor performance under adversarial perturbation.
We introduce a new adversarial machine learning algorithm with lower computational complexity than well-known adversarial machine learning strategies.
arXiv Detail & Related papers (2020-08-01T00:58:54Z) - Improving Adversarial Robustness by Enforcing Local and Global
Compactness [19.8818435601131]
Adversarial training is the most successful method that consistently resists a wide range of attacks.
We propose the Adversary Divergence Reduction Network which enforces local/global compactness and the clustering assumption.
The experimental results demonstrate that augmenting adversarial training with our proposed components can further improve the robustness of the network.
arXiv Detail & Related papers (2020-07-10T00:43:06Z) - On the Loss Landscape of Adversarial Training: Identifying Challenges
and How to Overcome Them [57.957466608543676]
We analyze the influence of adversarial training on the loss landscape of machine learning models.
We show that the adversarial loss landscape is less favorable to optimization, due to increased curvature and more scattered gradients.
arXiv Detail & Related papers (2020-06-15T13:50:23Z) - Perturbation Analysis of Gradient-based Adversarial Attacks [2.3016608994689274]
We investigate the objective functions of three popular methods for adversarial example generation: the L-BFGS attack, the Iterative Fast Gradient Sign attack, and Carlini & Wagner's (CW) attack.
Specifically, we perform a comparative and formal analysis of the loss functions underlying these attacks, alongside large-scale experimental results on the ImageNet dataset.
Our experiments reveal that the Iterative Fast Gradient Sign attack, often regarded as a fast way to generate adversarial examples, is in fact the worst attack in terms of the number of iterations required to create them.
arXiv Detail & Related papers (2020-06-02T08:51:37Z) - Feature Purification: How Adversarial Training Performs Robust Deep
Learning [66.05472746340142]
We present a principle we call Feature Purification: one of the causes of the existence of adversarial examples is the accumulation of certain small, dense mixtures in the hidden weights during the training of a neural network.
We present experiments on the CIFAR-10 dataset illustrating this principle, as well as a theoretical result proving that, for certain natural classification tasks, training a two-layer neural network with ReLU activation via randomly initialized gradient descent indeed satisfies this principle.
arXiv Detail & Related papers (2020-05-20T16:56:08Z) - Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks
Trained with the Logistic Loss [0.0]
Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks.
We analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations.
arXiv Detail & Related papers (2020-02-11T15:42:09Z)