Theoretical Understanding of Learning from Adversarial Perturbations
- URL: http://arxiv.org/abs/2402.10470v1
- Date: Fri, 16 Feb 2024 06:22:44 GMT
- Title: Theoretical Understanding of Learning from Adversarial Perturbations
- Authors: Soichiro Kumano, Hiroshi Kera, Toshihiko Yamasaki
- Abstract summary: It is not fully understood why adversarial examples can deceive neural networks and transfer between different networks.
We provide a theoretical framework for understanding learning from perturbations using a one-hidden-layer network.
Our results highlight that various adversarial perturbations, even perturbations of a few pixels, contain sufficient class features for generalization.
- Score: 30.759348459463467
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: It is not fully understood why adversarial examples can deceive neural
networks and transfer between different networks. To elucidate this, several
studies have hypothesized that adversarial perturbations, while appearing as
noises, contain class features. This is supported by empirical evidence showing
that networks trained on mislabeled adversarial examples can still generalize
well to correctly labeled test samples. However, a theoretical understanding of
how perturbations include class features and contribute to generalization is
limited. In this study, we provide a theoretical framework for understanding
learning from perturbations using a one-hidden-layer network trained on
mutually orthogonal samples. Our results highlight that various adversarial
perturbations, even perturbations of a few pixels, contain sufficient class
features for generalization. Moreover, we reveal that the decision boundary
when learning from perturbations matches that from standard samples except for
specific regions under mild conditions. The code is available at
https://github.com/s-kumano/learning-from-adversarial-perturbations.
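The core experiment behind this line of work can be sketched in a few lines of code: craft targeted adversarial examples against a trained source network, label them with the (incorrect) attack target, and use only these mislabeled examples to train a fresh one-hidden-layer network, which is then evaluated on clean, correctly labeled test data. The sketch below is a minimal PyTorch illustration under assumed architecture, attack, and hyperparameters; it is not the authors' exact implementation (see the repository above for that).
```python
# Minimal sketch (assumed settings): build a training set of adversarial
# examples labeled with their wrong target classes, to be used for training
# a fresh one-hidden-layer network.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import TensorDataset

class OneHiddenLayerNet(nn.Module):
    """Illustrative one-hidden-layer ReLU network; the width is an assumption."""
    def __init__(self, dim, width=512, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(dim, width)
        self.fc2 = nn.Linear(width, num_classes)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x.flatten(1))))

def fgsm_targeted(model, x, target, eps=0.1):
    """One-step targeted perturbation, a simple stand-in for stronger attacks."""
    x = x.clone().requires_grad_(True)
    grad = torch.autograd.grad(F.cross_entropy(model(x), target), x)[0]
    # Step against the gradient to push the prediction toward `target`.
    return (x - eps * grad.sign()).clamp(0.0, 1.0).detach()

def make_perturbation_dataset(source_model, loader, num_classes=10):
    """Adversarial examples paired with their (incorrect) attack targets."""
    xs, ys = [], []
    for x, y in loader:
        target = (y + torch.randint(1, num_classes, y.shape)) % num_classes
        xs.append(fgsm_targeted(source_model, x, target))
        ys.append(target)
    return TensorDataset(torch.cat(xs), torch.cat(ys))
```
Training a fresh OneHiddenLayerNet on the resulting dataset and evaluating it on the clean, correctly labeled test split is the phenomenon the abstract refers to: accuracy well above chance even though every training label is wrong.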
Related papers
- Wide Two-Layer Networks can Learn from Adversarial Perturbations [27.368408524000778]
We theoretically explain the counterintuitive success of perturbation learning.
We prove that adversarial perturbations contain sufficient class-specific features for networks to generalize from them.
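For context, a hedged sketch of the standard setup such analyses work with; the notation is illustrative and not necessarily the paper's own: a width-m two-layer ReLU network attacked with a one-step targeted perturbation, whose output is paired with the attack's incorrect label.
```latex
% Illustrative two-layer (one-hidden-layer) setup; notation is assumed.
f(x) = \frac{1}{\sqrt{m}} \sum_{k=1}^{m} a_k \,\sigma\!\big(\langle w_k, x \rangle\big),
\qquad \sigma(u) = \max(u, 0).

% A targeted perturbation toward a class y' \neq y and the mislabeled pair it yields:
\eta = -\epsilon \,\operatorname{sign}\!\big(\nabla_x\, \ell\big(f(x),\, y'\big)\big),
\qquad (x + \eta,\; y').

% The claim being formalized: \eta is correlated with features of class y',
% so training only on such pairs can still yield a generalizing classifier.
```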
arXiv Detail & Related papers (2024-10-31T06:55:57Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
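To make the described phenomenon concrete, one can flip a fraction of training labels and check that an overparameterized network still drives training loss to near zero while clean test accuracy stays high. The sketch below is a generic illustration under assumed settings, not the paper's construction.
```python
# Generic label-flipping utility and accuracy check (illustrative settings).
import torch

def flip_labels(y, num_classes, rate=0.2, seed=0):
    """Replace a random fraction `rate` of labels with a different class."""
    g = torch.Generator().manual_seed(seed)
    mask = torch.rand(len(y), generator=g) < rate
    offsets = torch.randint(1, num_classes, (len(y),), generator=g)
    return torch.where(mask, (y + offsets) % num_classes, y)

@torch.no_grad()
def accuracy(model, loader):
    correct = total = 0
    for x, y in loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total
```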
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- The Missing Margin: How Sample Corruption Affects Distance to the Boundary in ANNs [2.65558931169264]
We show that some types of training samples are modelled with consistently small margins while affecting generalization in different ways.
We support our findings with an analysis of fully-connected networks trained on noise-corrupted MNIST data, as well as convolutional networks trained on noise-corrupted CIFAR10 data.
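One simple way to measure the margin referred to here, i.e. a sample's distance to the decision boundary, is to binary-search the smallest step along the local adversarial direction that flips the model's prediction. The sketch below is a generic proxy, not necessarily the measurement used in the paper.
```python
# Generic margin proxy: smallest L2 step along the adversarial direction
# that changes the predicted class (illustrative only).
import torch
import torch.nn.functional as F

def margin_estimate(model, x, steps=20, max_radius=10.0):
    """Upper-bound estimate of one sample's distance to the decision boundary.

    Returns `max_radius` if no label flip is found within that radius.
    """
    x = x.unsqueeze(0).clone().requires_grad_(True)
    pred = model(x).argmax(dim=1)
    grad = torch.autograd.grad(F.cross_entropy(model(x), pred), x)[0]
    direction = grad / (grad.norm() + 1e-12)  # unit adversarial direction

    lo, hi = 0.0, max_radius
    with torch.no_grad():
        for _ in range(steps):
            mid = 0.5 * (lo + hi)
            flipped = (model(x + mid * direction).argmax(dim=1) != pred).item()
            lo, hi = (lo, mid) if flipped else (mid, hi)
    return hi
```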
arXiv Detail & Related papers (2023-02-14T09:25:50Z)
- Meet You Halfway: Explaining Deep Learning Mysteries [0.0]
We introduce a new conceptual framework attached with a formal description that aims to shed light on the network's behavior.
We address the question of why neural networks acquire generalization abilities.
We provide a comprehensive set of experiments that support this new framework, as well as its underlying theory.
arXiv Detail & Related papers (2022-06-09T12:43:10Z)
- Predicting Unreliable Predictions by Shattering a Neural Network [145.3823991041987]
Piecewise linear neural networks can be split into subfunctions.
Subfunctions have their own activation pattern, domain, and empirical error.
Empirical error for the full network can be written as an expectation over subfunctions.
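A hedged formalization of that statement (notation assumed, not taken from the paper): each ReLU activation pattern A defines a subfunction that is affine on its own domain, and the full network's empirical error is a sample-weighted average of the subfunctions' errors.
```latex
% Illustrative decomposition of a piecewise-linear network f.
% Each activation pattern A defines a subfunction f_A, affine on its domain D_A:
f(x) = f_A(x) \quad \text{whenever } x \in D_A .

% The empirical error of the full network then splits over subfunctions:
\widehat{\mathcal{E}}(f)
  = \frac{1}{n} \sum_{i=1}^{n} \ell\big(f(x_i), y_i\big)
  = \sum_{A} \frac{n_A}{n}\, \widehat{\mathcal{E}}_A(f_A),
\qquad
\widehat{\mathcal{E}}_A(f_A) = \frac{1}{n_A} \sum_{x_i \in D_A} \ell\big(f_A(x_i), y_i\big),

% i.e. an expectation over subfunctions, weighted by how many samples n_A
% fall in each subfunction's domain.
```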
arXiv Detail & Related papers (2021-06-15T18:34:41Z)
- How benign is benign overfitting? [96.07549886487526]
We investigate two causes for adversarial vulnerability in deep neural networks: bad data and (poorly) trained models.
Deep neural networks essentially achieve zero training error, even in the presence of label noise.
We identify label noise as one of the causes for adversarial vulnerability.
arXiv Detail & Related papers (2020-07-08T11:07:10Z)
- Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
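A minimal sketch of the failure-based idea as summarized above: one network is trained with a loss that amplifies easy, possibly spurious cues, and its per-sample failures up-weight the loss of the second, debiased network. The specific loss forms and weighting rule below are assumptions for illustration, not a verified reproduction of the paper's method.
```python
# Hedged sketch of one training step for a biased/debiased network pair.
import torch
import torch.nn.functional as F

def generalized_ce(logits, target, q=0.7):
    """Loss that emphasizes samples the model already finds easy."""
    p = F.softmax(logits, dim=1).gather(1, target.unsqueeze(1)).squeeze(1)
    return ((1.0 - p.clamp_min(1e-8) ** q) / q).mean()

def debiasing_step(biased, debiased, opt_b, opt_d, x, y):
    logits_b, logits_d = biased(x), debiased(x)

    # Per-sample difficulty according to each model (no gradients needed).
    with torch.no_grad():
        ce_b = F.cross_entropy(logits_b, y, reduction="none")
        ce_d = F.cross_entropy(logits_d, y, reduction="none")
        # Up-weight samples the biased model fails on relative to the debiased one.
        weight = ce_b / (ce_b + ce_d + 1e-8)

    loss_b = generalized_ce(logits_b, y)  # amplifies the easy-to-learn bias
    loss_d = (weight * F.cross_entropy(logits_d, y, reduction="none")).mean()

    opt_b.zero_grad()
    loss_b.backward()
    opt_b.step()
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()
    return loss_b.item(), loss_d.item()
```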
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
- Embedding Propagation: Smoother Manifold for Few-Shot Classification [131.81692677836202]
We propose to use embedding propagation as an unsupervised non-parametric regularizer for manifold smoothing in few-shot classification.
We empirically show that embedding propagation yields a smoother embedding manifold.
We show that embedding propagation consistently improves the accuracy of the models in multiple semi-supervised learning scenarios by up to 16 percentage points.
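A hedged sketch of what embedding propagation does: feature embeddings are smoothed by propagating them over a similarity graph built from their pairwise distances, in the style of label propagation. The kernel, normalization, and alpha parameter below are assumptions for illustration.
```python
# Hedged sketch of embedding propagation over a similarity graph.
import torch

def propagate_embeddings(z, alpha=0.5, eps=1e-8):
    """z: (n, d) feature embeddings -> (n, d) smoothed embeddings."""
    # Pairwise squared distances and an RBF similarity graph (no self-loops).
    dist2 = torch.cdist(z, z).pow(2)
    adj = torch.exp(-dist2 / (dist2.mean() + eps))
    adj.fill_diagonal_(0)

    # Symmetrically normalized adjacency, as in label propagation.
    deg = adj.sum(dim=1)
    d_inv_sqrt = torch.diag((deg + eps).rsqrt())
    s = d_inv_sqrt @ adj @ d_inv_sqrt

    # Closed-form propagation: (I - alpha * S)^{-1} z.
    n = z.size(0)
    propagator = torch.linalg.inv(torch.eye(n, device=z.device) - alpha * s)
    return propagator @ z

# Example: smooth 32 random 64-dimensional embeddings.
smoothed = propagate_embeddings(torch.randn(32, 64))
```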
arXiv Detail & Related papers (2020-03-09T13:51:09Z)
- Analyzing the Noise Robustness of Deep Neural Networks [43.63911131982369]
Adversarial examples, generated by adding small but intentionally imperceptible perturbations to normal examples, can mislead deep neural networks (DNNs) to make incorrect predictions.
We present a visual analysis method to explain why adversarial examples are misclassified.
arXiv Detail & Related papers (2020-01-26T03:39:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.