Theoretical Understanding of Learning from Adversarial Perturbations
- URL: http://arxiv.org/abs/2402.10470v1
- Date: Fri, 16 Feb 2024 06:22:44 GMT
- Title: Theoretical Understanding of Learning from Adversarial Perturbations
- Authors: Soichiro Kumano, Hiroshi Kera, Toshihiko Yamasaki
- Abstract summary: It is not fully understood why adversarial examples can deceive neural networks and transfer between different networks.
We provide a theoretical framework for understanding learning from perturbations using a one-hidden-layer network.
Our results highlight that various adversarial perturbations, even perturbations of a few pixels, contain sufficient class features for generalization.
- Score: 30.759348459463467
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: It is not fully understood why adversarial examples can deceive neural
networks and transfer between different networks. To elucidate this, several
studies have hypothesized that adversarial perturbations, while appearing as
noises, contain class features. This is supported by empirical evidence showing
that networks trained on mislabeled adversarial examples can still generalize
well to correctly labeled test samples. However, a theoretical understanding of
how perturbations include class features and contribute to generalization is
limited. In this study, we provide a theoretical framework for understanding
learning from perturbations using a one-hidden-layer network trained on
mutually orthogonal samples. Our results highlight that various adversarial
perturbations, even perturbations of a few pixels, contain sufficient class
features for generalization. Moreover, we reveal that the decision boundary
when learning from perturbations matches that from standard samples except for
specific regions under mild conditions. The code is available at
https://github.com/s-kumano/learning-from-adversarial-perturbations.
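The core experiment behind this line of work can be sketched in a few lines of code: craft targeted adversarial examples against a trained source network, label them with the (incorrect) attack target, and use only these mislabeled examples to train a fresh one-hidden-layer network, which is then evaluated on clean, correctly labeled test data. The sketch below is a minimal PyTorch illustration under assumed architecture, attack, and hyperparameters; it is not the authors' exact implementation (see the repository above for that).
```python
# Minimal sketch (assumed settings): build a training set of adversarial
# examples labeled with their wrong target classes, to be used for training
# a fresh one-hidden-layer network.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import TensorDataset

class OneHiddenLayerNet(nn.Module):
    """Illustrative one-hidden-layer ReLU network; the width is an assumption."""
    def __init__(self, dim, width=512, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(dim, width)
        self.fc2 = nn.Linear(width, num_classes)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x.flatten(1))))

def fgsm_targeted(model, x, target, eps=0.1):
    """One-step targeted perturbation, a simple stand-in for stronger attacks."""
    x = x.clone().requires_grad_(True)
    grad = torch.autograd.grad(F.cross_entropy(model(x), target), x)[0]
    # Step against the gradient to push the prediction toward `target`.
    return (x - eps * grad.sign()).clamp(0.0, 1.0).detach()

def make_perturbation_dataset(source_model, loader, num_classes=10):
    """Adversarial examples paired with their (incorrect) attack targets."""
    xs, ys = [], []
    for x, y in loader:
        target = (y + torch.randint(1, num_classes, y.shape)) % num_classes
        xs.append(fgsm_targeted(source_model, x, target))
        ys.append(target)
    return TensorDataset(torch.cat(xs), torch.cat(ys))
```
Training a fresh OneHiddenLayerNet on the resulting dataset and evaluating it on the clean, correctly labeled test split is the phenomenon the abstract refers to: accuracy well above chance even though every training label is wrong.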
Related papers
- Wide Two-Layer Networks can Learn from Adversarial Perturbations [27.368408524000778]
We theoretically explain the counterintuitive success of perturbation learning.
We prove that adversarial perturbations contain sufficient class-specific features for networks to generalize from them.
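For context, a hedged sketch of the standard setup such analyses work with; the notation is illustrative and not necessarily the paper's own: a width-m two-layer ReLU network attacked with a one-step targeted perturbation, whose output is paired with the attack's incorrect label.
```latex
% Illustrative two-layer (one-hidden-layer) setup; notation is assumed.
f(x) = \frac{1}{\sqrt{m}} \sum_{k=1}^{m} a_k \,\sigma\!\big(\langle w_k, x \rangle\big),
\qquad \sigma(u) = \max(u, 0).

% A targeted perturbation toward a class y' \neq y and the mislabeled pair it yields:
\eta = -\epsilon \,\operatorname{sign}\!\big(\nabla_x\, \ell\big(f(x),\, y'\big)\big),
\qquad (x + \eta,\; y').

% The claim being formalized: \eta is correlated with features of class y',
% so training only on such pairs can still yield a generalizing classifier.
```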
arXiv Detail & Related papers (2024-10-31T06:55:57Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
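To make the described phenomenon concrete, one can flip a fraction of training labels and check that an overparameterized network still drives training loss to near zero while clean test accuracy stays high. The sketch below is a generic illustration under assumed settings, not the paper's construction.
```python
# Generic label-flipping utility and accuracy check (illustrative settings).
import torch

def flip_labels(y, num_classes, rate=0.2, seed=0):
    """Replace a random fraction `rate` of labels with a different class."""
    g = torch.Generator().manual_seed(seed)
    mask = torch.rand(len(y), generator=g) < rate
    offsets = torch.randint(1, num_classes, (len(y),), generator=g)
    return torch.where(mask, (y + offsets) % num_classes, y)

@torch.no_grad()
def accuracy(model, loader):
    correct = total = 0
    for x, y in loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total
```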
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- The Missing Margin: How Sample Corruption Affects Distance to the Boundary in ANNs [2.65558931169264]
We show that some types of training samples are modelled with consistently small margins while affecting generalization in different ways.
We support our findings with an analysis of fully-connected networks trained on noise-corrupted MNIST data, as well as convolutional networks trained on noise-corrupted CIFAR10 data.
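One simple way to measure the margin referred to here, i.e. a sample's distance to the decision boundary, is to binary-search the smallest step along the local adversarial direction that flips the model's prediction. The sketch below is a generic proxy, not necessarily the measurement used in the paper.
```python
# Generic margin proxy: smallest L2 step along the adversarial direction
# that changes the predicted class (illustrative only).
import torch
import torch.nn.functional as F

def margin_estimate(model, x, steps=20, max_radius=10.0):
    """Upper-bound estimate of one sample's distance to the decision boundary.

    Returns `max_radius` if no label flip is found within that radius.
    """
    x = x.unsqueeze(0).clone().requires_grad_(True)
    pred = model(x).argmax(dim=1)
    grad = torch.autograd.grad(F.cross_entropy(model(x), pred), x)[0]
    direction = grad / (grad.norm() + 1e-12)  # unit adversarial direction

    lo, hi = 0.0, max_radius
    with torch.no_grad():
        for _ in range(steps):
            mid = 0.5 * (lo + hi)
            flipped = (model(x + mid * direction).argmax(dim=1) != pred).item()
            lo, hi = (lo, mid) if flipped else (mid, hi)
    return hi
```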
arXiv Detail & Related papers (2023-02-14T09:25:50Z)
- Meet You Halfway: Explaining Deep Learning Mysteries [0.0]
We introduce a new conceptual framework attached with a formal description that aims to shed light on the network's behavior.
We address the question of why neural networks acquire generalization abilities.
We provide a comprehensive set of experiments that support this new framework, as well as its underlying theory.
arXiv Detail & Related papers (2022-06-09T12:43:10Z)
- Predicting Unreliable Predictions by Shattering a Neural Network [145.3823991041987]
Piecewise linear neural networks can be split into subfunctions.
Subfunctions have their own activation pattern, domain, and empirical error.
Empirical error for the full network can be written as an expectation over subfunctions.
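A hedged formalization of that statement (notation assumed, not taken from the paper): each ReLU activation pattern A defines a subfunction that is affine on its own domain, and the full network's empirical error is a sample-weighted average of the subfunctions' errors.
```latex
% Illustrative decomposition of a piecewise-linear network f.
% Each activation pattern A defines a subfunction f_A, affine on its domain D_A:
f(x) = f_A(x) \quad \text{whenever } x \in D_A .

% The empirical error of the full network then splits over subfunctions:
\widehat{\mathcal{E}}(f)
  = \frac{1}{n} \sum_{i=1}^{n} \ell\big(f(x_i), y_i\big)
  = \sum_{A} \frac{n_A}{n}\, \widehat{\mathcal{E}}_A(f_A),
\qquad
\widehat{\mathcal{E}}_A(f_A) = \frac{1}{n_A} \sum_{x_i \in D_A} \ell\big(f_A(x_i), y_i\big),

% i.e. an expectation over subfunctions, weighted by how many samples n_A
% fall in each subfunction's domain.
```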
arXiv Detail & Related papers (2021-06-15T18:34:41Z)
- How benign is benign overfitting? [96.07549886487526]
We investigate two causes for adversarial vulnerability in deep neural networks: bad data and (poorly) trained models.
Deep neural networks essentially achieve zero training error, even in the presence of label noise.
We identify label noise as one of the causes for adversarial vulnerability.
arXiv Detail & Related papers (2020-07-08T11:07:10Z)
- Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
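A minimal sketch of the failure-based idea as summarized above: one network is trained with a loss that amplifies easy, possibly spurious cues, and its per-sample failures up-weight the loss of the second, debiased network. The specific loss forms and weighting rule below are assumptions for illustration, not a verified reproduction of the paper's method.
```python
# Hedged sketch of one training step for a biased/debiased network pair.
import torch
import torch.nn.functional as F

def generalized_ce(logits, target, q=0.7):
    """Loss that emphasizes samples the model already finds easy."""
    p = F.softmax(logits, dim=1).gather(1, target.unsqueeze(1)).squeeze(1)
    return ((1.0 - p.clamp_min(1e-8) ** q) / q).mean()

def debiasing_step(biased, debiased, opt_b, opt_d, x, y):
    logits_b, logits_d = biased(x), debiased(x)

    # Per-sample difficulty according to each model (no gradients needed).
    with torch.no_grad():
        ce_b = F.cross_entropy(logits_b, y, reduction="none")
        ce_d = F.cross_entropy(logits_d, y, reduction="none")
        # Up-weight samples the biased model fails on relative to the debiased one.
        weight = ce_b / (ce_b + ce_d + 1e-8)

    loss_b = generalized_ce(logits_b, y)  # amplifies the easy-to-learn bias
    loss_d = (weight * F.cross_entropy(logits_d, y, reduction="none")).mean()

    opt_b.zero_grad()
    loss_b.backward()
    opt_b.step()
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()
    return loss_b.item(), loss_d.item()
```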
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
- Embedding Propagation: Smoother Manifold for Few-Shot Classification [131.81692677836202]
We propose to use embedding propagation as an unsupervised non-parametric regularizer for manifold smoothing in few-shot classification.
We empirically show that embedding propagation yields a smoother embedding manifold.
We show that embedding propagation consistently improves the accuracy of the models in multiple semi-supervised learning scenarios by up to 16 percentage points.
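A hedged sketch of what embedding propagation does: feature embeddings are smoothed by propagating them over a similarity graph built from their pairwise distances, in the style of label propagation. The kernel, normalization, and alpha parameter below are assumptions for illustration.
```python
# Hedged sketch of embedding propagation over a similarity graph.
import torch

def propagate_embeddings(z, alpha=0.5, eps=1e-8):
    """z: (n, d) feature embeddings -> (n, d) smoothed embeddings."""
    # Pairwise squared distances and an RBF similarity graph (no self-loops).
    dist2 = torch.cdist(z, z).pow(2)
    adj = torch.exp(-dist2 / (dist2.mean() + eps))
    adj.fill_diagonal_(0)

    # Symmetrically normalized adjacency, as in label propagation.
    deg = adj.sum(dim=1)
    d_inv_sqrt = torch.diag((deg + eps).rsqrt())
    s = d_inv_sqrt @ adj @ d_inv_sqrt

    # Closed-form propagation: (I - alpha * S)^{-1} z.
    n = z.size(0)
    propagator = torch.linalg.inv(torch.eye(n, device=z.device) - alpha * s)
    return propagator @ z

# Example: smooth 32 random 64-dimensional embeddings.
smoothed = propagate_embeddings(torch.randn(32, 64))
```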
arXiv Detail & Related papers (2020-03-09T13:51:09Z)
- Analyzing the Noise Robustness of Deep Neural Networks [43.63911131982369]
Adversarial examples, generated by adding small but intentionally imperceptible perturbations to normal examples, can mislead deep neural networks (DNNs) to make incorrect predictions.
We present a visual analysis method to explain why adversarial examples are misclassified.
arXiv Detail & Related papers (2020-01-26T03:39:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.