Certified Robust Neural Networks: Generalization and Corruption Resistance
- URL: http://arxiv.org/abs/2303.02251v2
- Date: Thu, 18 May 2023 14:28:52 GMT
- Title: Certified Robust Neural Networks: Generalization and Corruption Resistance
- Authors: Amine Bennouna, Ryan Lucas, Bart Van Parys
- Abstract summary: Adversarial training aims to reduce the problematic susceptibility of modern neural networks to small data perturbations.
Overfitting is a major concern in adversarial training despite being mostly absent in standard training.
We show that our resulting holistic robust (HR) training procedure yields SOTA performance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work has demonstrated that robustness (to "corruption") can be at
odds with generalization. Adversarial training, for instance, aims to reduce
the problematic susceptibility of modern neural networks to small data
perturbations. Surprisingly, overfitting is a major concern in adversarial
training despite being mostly absent in standard training. We provide here
theoretical evidence for this peculiar "robust overfitting" phenomenon.
Subsequently, we advance a novel distributionally robust loss function bridging
robustness and generalization. We demonstrate, both theoretically and
empirically, that the loss enjoys a certified level of robustness against two
common types of corruption (data evasion and poisoning attacks) while ensuring
guaranteed generalization. We show through careful numerical experiments that
our resulting holistic robust (HR) training procedure yields SOTA performance.
Finally, we indicate that HR training can be interpreted as a direct extension
of adversarial training and comes with a negligible additional computational
burden. A ready-to-use Python library implementing our algorithm is available
at https://github.com/RyanLucas3/HR_Neural_Networks.
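Since the abstract describes HR training as a direct extension of adversarial training, a generic projected-gradient-descent (PGD) adversarial training step is sketched below for orientation. This is a minimal illustrative sketch in PyTorch, not the interface of the HR_Neural_Networks library; the model, optimizer, and attack hyperparameters (eps, step, iters) are assumptions.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, step=2 / 255, iters=10):
    """Find a worst-case perturbation of x within an L-infinity ball of radius eps."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)  # random start
    for _ in range(iters):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + step * grad.sign()                      # ascend the loss
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)   # project back into the ball
        x_adv = x_adv.clamp(0.0, 1.0)                           # keep inputs in valid range
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y):
    """One adversarial training step: minimize the loss on worst-case inputs."""
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

According to the abstract, HR training replaces the plain worst-case loss above with a distributionally robust loss that additionally certifies against data poisoning at negligible extra computational cost; see the linked repository for the actual interface.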
Related papers
- Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data [38.44734564565478]
We provide a theoretical understanding of adversarial examples and adversarial training algorithms from the perspective of feature learning theory.
We show that the adversarial training method can provably strengthen the robust feature learning and suppress the non-robust feature learning.
arXiv Detail & Related papers (2024-10-11T03:59:49Z) - Provable Robustness of (Graph) Neural Networks Against Data Poisoning and Backdoor Attacks [50.87615167799367]
We certify Graph Neural Networks (GNNs) against poisoning attacks, including backdoors, targeting the node features of a given graph.
Our framework provides fundamental insights into the role of graph structure and its connectivity on the worst-case behavior of convolution-based and PageRank-based GNNs.
arXiv Detail & Related papers (2024-07-15T16:12:51Z) - Better Representations via Adversarial Training in Pre-Training: A Theoretical Perspective [30.871769067624836]
We show that feature purification plays an important role in connecting the adversarial robustness of the pre-trained model and the downstream tasks.
With purified nodes, it turns out that clean training is enough to achieve adversarial robustness in downstream tasks.
arXiv Detail & Related papers (2024-01-26T23:52:20Z) - Can overfitted deep neural networks in adversarial training generalize? -- An approximation viewpoint [25.32729343174394]
Adversarial training is a widely used method to improve the robustness of deep neural networks (DNNs) against adversarial perturbations.
In this paper, we provide a theoretical understanding of whether overfitted DNNs in adversarial training can generalize from an approximation viewpoint.
arXiv Detail & Related papers (2024-01-24T17:54:55Z) - Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
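The entry above mentions importance weights obtained by optimizing a KL-divergence regularized loss. As a rough illustration (not necessarily the paper's exact objective), KL-regularized reweighting over per-example losses has a softmax closed form; the temperature tau and the up-weighting direction are assumptions here.

```python
import torch

def kl_regularized_weights(per_example_losses, tau=1.0):
    """Maximizer of <w, losses> - tau * KL(w || uniform) over the probability simplex:
    a softmax over the losses, so higher-loss examples receive larger weights.
    Negate the losses to down-weight hard examples instead."""
    return torch.softmax(per_example_losses / tau, dim=0)

# Usage sketch on a batch of (adversarial) per-example losses:
# losses = F.cross_entropy(model(x_adv), y, reduction="none")
# weights = kl_regularized_weights(losses.detach(), tau=0.5)
# weighted_loss = (weights * losses).sum()
```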
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - Evolution of Neural Tangent Kernels under Benign and Adversarial Training [109.07737733329019]
We study the evolution of the empirical Neural Tangent Kernel (NTK) under standard and adversarial training.
We find under adversarial training, the empirical NTK rapidly converges to a different kernel (and feature map) than standard training.
This new kernel provides adversarial robustness, even when non-robust training is performed on top of it.
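For reference, the empirical NTK mentioned above is the Gram matrix of parameter gradients of the network outputs, which can be tracked across training checkpoints. Below is a minimal sketch for a scalar-output model; the toy architecture and inputs in the usage comment are assumptions, and dedicated NTK libraries are preferable at realistic scale.

```python
import torch

def empirical_ntk_entry(model, x1, x2):
    """Empirical NTK value k(x1, x2) = <grad_theta f(x1), grad_theta f(x2)>
    for a scalar-output model f with trainable parameters theta."""
    params = [p for p in model.parameters() if p.requires_grad]
    g1 = torch.autograd.grad(model(x1).sum(), params)
    g2 = torch.autograd.grad(model(x2).sum(), params)
    return sum((a * b).sum() for a, b in zip(g1, g2))

# Usage sketch with a toy scalar-output MLP:
# model = torch.nn.Sequential(torch.nn.Linear(10, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))
# x1, x2 = torch.randn(1, 10), torch.randn(1, 10)
# k12 = empirical_ntk_entry(model, x1, x2)
```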
arXiv Detail & Related papers (2022-10-21T15:21:15Z) - Robustness, Privacy, and Generalization of Adversarial Training [84.38148845727446]
This paper establishes and quantifies the privacy-robustness trade-off and generalization-robustness trade-off in adversarial training.
We show that adversarial training is $(\varepsilon, \delta)$-differentially private, where the magnitude of the differential privacy has a positive correlation with the robustified intensity.
Our generalization bounds do not explicitly rely on the parameter size which would be large in deep learning.
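For context, the $(\varepsilon, \delta)$-differential privacy guarantee referenced above is the textbook definition (not the paper's specific theorem): a randomized training mechanism $M$ is $(\varepsilon, \delta)$-differentially private if, for all datasets $D, D'$ differing in a single example and all measurable output sets $S$,
$$\Pr[M(D) \in S] \le e^{\varepsilon} \, \Pr[M(D') \in S] + \delta.$$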
arXiv Detail & Related papers (2020-12-25T13:35:02Z) - On the Generalization Properties of Adversarial Training [21.79888306754263]
This paper studies the generalization performance of a generic adversarial training algorithm.
A series of numerical studies is conducted to demonstrate how smoothness and L1 penalization help improve the adversarial robustness of models.
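As a generic illustration of the kind of objective such analyses consider (an assumed formulation, not necessarily the paper's exact one), an L1-penalized adversarial training problem can be written as
$$\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n} \max_{\|\delta_i\|_\infty \le \varepsilon} \ell\big(f_\theta(x_i + \delta_i), y_i\big) + \lambda \|\theta\|_1,$$
with $\lambda$ the weight of the L1 penalty on the parameters $\theta$.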
arXiv Detail & Related papers (2020-08-15T02:32:09Z) - Optimizing Information Loss Towards Robust Neural Networks [0.0]
Neural Networks (NNs) are vulnerable to adversarial examples.
We present a new training approach we call entropic retraining.
Based on an information-theoretic-inspired analysis, entropic retraining mimics the effects of adversarial training without the need for the laborious generation of adversarial examples.
arXiv Detail & Related papers (2020-08-07T10:12:31Z) - Feature Purification: How Adversarial Training Performs Robust Deep Learning [66.05472746340142]
We present a principle we call Feature Purification: one cause of the existence of adversarial examples is the accumulation of certain small dense mixtures in the hidden weights during the training process of a neural network.
We present experiments on the CIFAR-10 dataset to illustrate this principle, and a theoretical result proving that, for certain natural classification tasks, training a two-layer neural network with ReLU activation using randomly initialized gradient descent indeed satisfies this principle.
arXiv Detail & Related papers (2020-05-20T16:56:08Z) - Overfitting in adversarially robust deep learning [86.11788847990783]
We show that overfitting to the training set does in fact harm robust performance to a very large degree in adversarially robust training.
We also show that effects such as the double descent curve do still occur in adversarially trained models, yet fail to explain the observed overfitting.
arXiv Detail & Related papers (2020-02-26T15:40:50Z)