Towards Certifying $\ell_\infty$ Robustness using Neural Networks with
$\ell_\infty$-dist Neurons
- URL: http://arxiv.org/abs/2102.05363v1
- Date: Wed, 10 Feb 2021 10:03:58 GMT
- Title: Towards Certifying $\ell_\infty$ Robustness using Neural Networks with
$\ell_\infty$-dist Neurons
- Authors: Bohang Zhang, Tianle Cai, Zhou Lu, Di He, Liwei Wang
- Abstract summary: We develop a principled neural network that inherently resists $\ell_\infty$ perturbations.
We consistently achieve state-of-the-art performance on commonly used datasets.
- Score: 27.815886593870076
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is well-known that standard neural networks, even with a high
classification accuracy, are vulnerable to small $\ell_\infty$-norm bounded
adversarial perturbations. Although many attempts have been made, most previous
works either can only provide empirical verification of the defense to a
particular attack method, or can only develop a certified guarantee of the
model robustness in limited scenarios. In this paper, we seek a new
approach to develop a theoretically principled neural network that inherently
resists $\ell_\infty$ perturbations. In particular, we design a novel neuron
that uses $\ell_\infty$-distance as its basic operation (which we call
$\ell_\infty$-dist neuron), and show that any neural network constructed with
$\ell_\infty$-dist neurons (called $\ell_{\infty}$-dist net) is naturally a
1-Lipschitz function with respect to $\ell_\infty$-norm. This directly provides
a rigorous guarantee of the certified robustness based on the margin of
prediction outputs. We also prove that such networks have enough expressive
power to approximate any 1-Lipschitz function with a robust generalization
guarantee. Our experimental results show that the proposed network is
promising. Using $\ell_{\infty}$-dist nets as the basic building blocks, we
consistently achieve state-of-the-art performance on commonly used datasets:
93.09% certified accuracy on MNIST ($\epsilon=0.3$), 79.23% on Fashion MNIST
($\epsilon=0.1$) and 35.10% on CIFAR-10 ($\epsilon=8/255$).
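As a concrete illustration of the two ingredients described in the abstract (an $\ell_\infty$-dist neuron and margin-based certification for a $1$-Lipschitz network), here is a minimal PyTorch-style sketch. The class names, the random initialization, and the sign convention for turning distances into class scores are assumptions made for illustration; the authors' actual architecture and training procedure are more involved.

```python
import torch
import torch.nn as nn

class LInfDistLayer(nn.Module):
    """Each unit outputs the l_inf distance between the input and its weight vector,
    so every unit (and hence the layer) is 1-Lipschitz w.r.t. the l_inf norm."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))

    def forward(self, x):                                   # x: (batch, in_features)
        diff = x.unsqueeze(1) - self.weight.unsqueeze(0)    # (batch, out, in)
        return diff.abs().amax(dim=-1)                      # (batch, out)

# Composing 1-Lipschitz (l_inf) maps keeps the whole network 1-Lipschitz in l_inf.
net = nn.Sequential(LInfDistLayer(784, 128), LInfDistLayer(128, 10))

def certified_robust(net, x, label, eps):
    """Margin certificate: if the class scores are 1-Lipschitz in l_inf, each score
    moves by at most eps under any perturbation with ||delta||_inf <= eps, so a
    margin larger than 2*eps over the runner-up class means the prediction cannot
    change within the l_inf ball of radius eps."""
    # Negating the final distances so that argmax gives the prediction is an
    # illustrative sign convention, not necessarily the paper's exact output layer.
    scores = -net(x.unsqueeze(0)).squeeze(0)                # (num_classes,)
    top = scores[label]
    runner_up = scores[torch.arange(scores.numel()) != label].max()
    return (top - runner_up).item() > 2 * eps
```

For a flattened MNIST-sized input `x` of shape `(784,)`, `certified_robust(net, x, y, 0.3)` checks the radius quoted in the abstract; the untrained weights above are of course only a placeholder.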
Related papers
- Deep Neural Networks: Multi-Classification and Universal Approximation [0.0]
We demonstrate that a ReLU deep neural network with a width of $2$ and a depth of $2N+4M-1$ layers can achieve finite sample memorization for any dataset comprising $N$ elements.
We also provide depth estimates for approximating $W^{1,p}$ functions and width estimates for approximating $L^p(\Omega;\mathbb{R}^m)$ for $m\geq 1$.
arXiv Detail & Related papers (2024-09-10T14:31:21Z) - Bayesian Inference with Deep Weakly Nonlinear Networks [57.95116787699412]
We show at a physics level of rigor that Bayesian inference with a fully connected neural network is solvable.
We provide techniques to compute the model evidence and posterior to arbitrary order in $1/N$ and at arbitrary temperature.
arXiv Detail & Related papers (2024-05-26T17:08:04Z) - Generalization Ability of Wide Neural Networks on $\mathbb{R}$ [8.508360765158326]
We study the generalization ability of the wide two-layer ReLU neural network on $\mathbb{R}$.
We show that: $i)$ when the width $m\rightarrow\infty$, the neural network kernel (NNK) uniformly converges to the NTK; $ii)$ the minimax rate of regression over the RKHS associated to $K_1$ is $n^{-2/3}$; $iii)$ if one adopts the early stopping strategy in training a wide neural network, the resulting neural network achieves the minimax rate; $iv)$
arXiv Detail & Related papers (2023-02-12T15:07:27Z) - The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich
Regimes [75.59720049837459]
We study the transition from infinite-width behavior to this variance limited regime as a function of sample size $P$ and network width $N$.
We find that finite-size effects can become relevant for very small datasets on the order of $P^* \sim \sqrt{N}$ for regression with ReLU networks.
arXiv Detail & Related papers (2022-12-23T04:48:04Z) - Robust Training and Verification of Implicit Neural Networks: A
Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z) - Coin Flipping Neural Networks [8.009932864430901]
We show that neural networks with access to randomness can outperform deterministic networks by using amplification.
We conjecture that for most classification problems, there is a CFNN which solves them with higher accuracy or fewer neurons than any deterministic network.
arXiv Detail & Related papers (2022-06-18T11:19:44Z) - Scalable Lipschitz Residual Networks with Convex Potential Flows [120.27516256281359]
We show that using convex potentials in a residual network gradient flow provides a built-in $1$-Lipschitz transformation.
A comprehensive set of experiments on CIFAR-10 demonstrates the scalability of our architecture and the benefit of our approach for $\ell_2$ provable defenses (a brief sketch of such a convex-potential residual step appears after this list).
arXiv Detail & Related papers (2021-10-25T07:12:53Z) - Boosting Certified $\ell_\infty$ Robustness with EMA Method and Ensemble
Model [0.0]
We introduce the EMA method to improve the training process of an $\ell_\infty$-norm neural network (a minimal EMA sketch appears after this list).
Considering the randomness of the training algorithm, we propose an ensemble method based on trained base models with the $1$-Lipschitz property.
We give a theoretical analysis of the ensemble method's certified robustness based on the $1$-Lipschitz property, which ensures the effectiveness and stability of the algorithm.
arXiv Detail & Related papers (2021-07-01T06:01:12Z) - Almost Tight L0-norm Certified Robustness of Top-k Predictions against
Adversarial Perturbations [78.23408201652984]
Top-k predictions are used in many real-world applications such as machine learning as a service, recommender systems, and web searches.
Our work is based on randomized smoothing, which builds a provably robust classifier via randomizing an input.
For instance, our method can build a classifier that achieves a certified top-3 accuracy of 69.2% on ImageNet when an attacker can arbitrarily perturb 5 pixels of a testing image.
arXiv Detail & Related papers (2020-11-15T21:34:44Z) - Shuffling Recurrent Neural Networks [97.72614340294547]
We propose a novel recurrent neural network model, where the hidden state $h_t$ is obtained by permuting the vector elements of the previous hidden state $h_{t-1}$.
In our model, the prediction is given by a second learned function, which is applied to the hidden state $s(h_t)$.
arXiv Detail & Related papers (2020-07-14T19:36:10Z)
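For the convex-potential entry above, the following is a minimal sketch of a residual block whose update is a rescaled gradient step on a convex potential $\phi(x)=\sum_i \rho(w_i^\top x+b_i)$ with $\rho'=\mathrm{ReLU}$; rescaling by $2/\|W\|_2^2$ makes the block nonexpansive ($1$-Lipschitz) in $\ell_2$. The exact parameterization and the use of an exact spectral norm (rather than a power-iteration estimate) are illustrative assumptions, not necessarily the paper's precise layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvexPotentialBlock(nn.Module):
    """Residual step x -> x - (2 / ||W||_2^2) * W^T ReLU(W x + b).
    The update is a rescaled gradient of a convex potential whose gradient is
    ||W||_2^2-Lipschitz, so the whole block is 1-Lipschitz in the l_2 norm."""

    def __init__(self, dim):
        super().__init__()
        self.W = nn.Parameter(torch.randn(dim, dim) / dim ** 0.5)
        self.b = nn.Parameter(torch.zeros(dim))

    def forward(self, x):                                   # x: (batch, dim)
        grad = F.relu(x @ self.W.t() + self.b) @ self.W     # gradient of the potential
        lip = torch.linalg.matrix_norm(self.W, ord=2) ** 2  # Lipschitz constant of the gradient
        return x - (2.0 / lip) * grad
```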
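For the EMA entry above, here is a minimal, generic sketch of an exponential moving average of model weights; the decay value and update placement are assumptions rather than the paper's exact recipe. (Averaging the predictions of several $1$-Lipschitz base models, as in the ensemble method, is again $1$-Lipschitz, which is why the margin-based certificate carries over.)

```python
import copy
import torch

def make_ema_model(model):
    """Create a frozen copy of the model whose weights will track an EMA."""
    ema_model = copy.deepcopy(model)
    for p in ema_model.parameters():
        p.requires_grad_(False)
    return ema_model

@torch.no_grad()
def ema_update(ema_model, model, decay=0.999):
    """Call after each optimizer step: ema <- decay * ema + (1 - decay) * current."""
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1.0 - decay)
```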
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.