Improved techniques for deterministic l2 robustness
- URL: http://arxiv.org/abs/2211.08453v1
- Date: Tue, 15 Nov 2022 19:10:12 GMT
- Title: Improved techniques for deterministic l2 robustness
- Authors: Sahil Singla, Soheil Feizi
- Abstract summary: Training convolutional neural networks (CNNs) with a strict 1-Lipschitz constraint under the $l_2$ norm is useful for adversarial robustness, interpretable gradients and stable training.
We introduce a procedure to certify the robustness of 1-Lipschitz CNNs by replacing the last linear layer with a 1-hidden-layer MLP.
We significantly advance the state-of-the-art for standard and provable robust accuracies on CIFAR-10 and CIFAR-100.
- Score: 63.34032156196848
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training convolutional neural networks (CNNs) with a strict 1-Lipschitz
constraint under the $l_{2}$ norm is useful for adversarial robustness,
interpretable gradients and stable training. 1-Lipschitz CNNs are usually
designed by enforcing each layer to have an orthogonal Jacobian matrix (for all
inputs) to prevent the gradients from vanishing during backpropagation.
However, their performance often significantly lags behind that of heuristic
methods to enforce Lipschitz constraints where the resulting CNN is not
\textit{provably} 1-Lipschitz. In this work, we reduce this gap by introducing
(a) a procedure to certify robustness of 1-Lipschitz CNNs by replacing the last
linear layer with a 1-hidden layer MLP that significantly improves their
performance for both standard and provably robust accuracy, (b) a method to
significantly reduce the training time per epoch for Skew Orthogonal
Convolution (SOC) layers (>30\% reduction for deeper networks) and (c) a class
of pooling layers using the mathematical property that the $l_{2}$ distance of
an input to a manifold is 1-Lipschitz. Using these methods, we significantly
advance the state-of-the-art for standard and provable robust accuracies on
CIFAR-10 (gains of +1.79\% and +3.82\%) and similarly on CIFAR-100 (+3.78\% and
+4.75\%) across all networks. Code is available at
\url{https://github.com/singlasahil14/improved_l2_robustness}.
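For context, the sketch below is a minimal, hedged example (the `certified_radius` helper and the random logits standing in for a model output are illustrative, not the paper's code) of the standard $l_{2}$ certification rule that 1-Lipschitz classifiers rely on: the gap between any two logits of a 1-Lipschitz network can change by at most $\sqrt{2}\,\|\delta\|_{2}$ under an input perturbation $\delta$, so the top-class margin divided by $\sqrt{2}$ is a certified radius. Contribution (a) above refines how this certification is carried out via the MLP head; this is only the generic bound it builds on.
```python
# Minimal sketch: certified l2 radius for a 1-Lipschitz classifier.
# If f is 1-Lipschitz under the l2 norm, the gap between the top logit and
# the runner-up changes by at most sqrt(2) * ||delta||_2, so the prediction
# is provably unchanged within radius margin / sqrt(2).
import math
import torch

def certified_radius(logits: torch.Tensor) -> torch.Tensor:
    """Per-example certified l2 radius from the logits of a 1-Lipschitz net."""
    top2 = logits.topk(2, dim=1).values          # (batch, 2): top class and runner-up
    margin = top2[:, 0] - top2[:, 1]             # logit gap
    return margin.clamp(min=0.0) / math.sqrt(2)  # zero radius if the margin is non-positive

# Random logits stand in for model(x) here, purely for illustration.
logits = torch.randn(4, 10)
print(certified_radius(logits))
```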
Related papers
- A Proximal Algorithm for Network Slimming [2.8148957592979427]
Network slimming, a popular channel pruning method for convolutional neural networks (CNNs), relies on subgradient descent to train CNNs.
We develop an alternative algorithm called proximal NS to train CNNs towards sparse, accurate structures.
Our experiments demonstrate that after one round of training, proximal NS yields a CNN with competitive accuracy and compression.
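To make the contrast concrete, here is a hedged, generic sketch (not the paper's exact proximal NS algorithm) of a subgradient step versus a proximal step for an $l_{1}$ penalty on channel scaling factors: the proximal (soft-thresholding) update sets small entries exactly to zero, whereas the subgradient update only shrinks them and can overshoot.
```python
# Generic illustration of subgradient vs. proximal updates for an l1 penalty
# on channel scaling factors `gamma` (as used in network slimming).
import torch

def subgradient_step(gamma, grad, lr, lam):
    # gradient of the data loss plus lam * sign(gamma) as an l1 subgradient
    return gamma - lr * (grad + lam * torch.sign(gamma))

def proximal_step(gamma, grad, lr, lam):
    # gradient step on the data loss, then the prox of lr*lam*||.||_1 (soft-thresholding)
    z = gamma - lr * grad
    return torch.sign(z) * torch.clamp(z.abs() - lr * lam, min=0.0)

gamma = torch.tensor([0.8, 0.03, -0.02, 0.5])
grad = torch.zeros_like(gamma)                          # data-loss gradient omitted for clarity
print(subgradient_step(gamma, grad, lr=0.1, lam=0.5))   # small entries overshoot past zero
print(proximal_step(gamma, grad, lr=0.1, lam=0.5))      # small entries become exactly zero
```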
arXiv Detail & Related papers (2023-07-02T23:34:12Z) - Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks [23.46030810336596]
We propose a new technique for constructing deep networks with a small Lipschitz constant.
It provides formal guarantees on the Lipschitz constant, it is easy to implement and efficient to run, and it can be combined with any training objective and optimization method.
Experiments and ablation studies in the context of image classification with certified robust accuracy confirm that AOL layers achieve results that are on par with most existing methods.
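As a hedged, dense-layer sketch of the AOL rescaling (the paper applies the analogous bound to the Jacobian of a convolution): the columns of a parameter matrix $P$ are rescaled by $d_{j}=(\sum_{i}|P^{\top}P|_{ij})^{-1/2}$, which guarantees that the resulting weight matrix has spectral norm at most 1.
```python
# Dense-layer sketch of the AOL rescaling: P @ diag(d) with
# d_j = (sum_i |P^T P|_{ij})^(-1/2) has spectral norm <= 1.
import torch

def aol_rescale(P: torch.Tensor) -> torch.Tensor:
    M = P.t() @ P                                     # Gram matrix of the columns of P
    d = M.abs().sum(dim=0).clamp(min=1e-12).rsqrt()   # column sums of |P^T P|
    return P * d                                      # rescale column j of P by d_j

P = torch.randn(64, 32)
W = aol_rescale(P)
print(torch.linalg.matrix_norm(W, ord=2))  # largest singular value, provably <= 1
```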
arXiv Detail & Related papers (2022-08-05T13:34:33Z) - Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds [99.23098204458336]
Certified robustness is a desirable property for deep neural networks in safety-critical applications.
We show that our method consistently outperforms state-of-the-art methods on the MNIST and TinyImageNet datasets.
arXiv Detail & Related papers (2021-11-02T06:44:10Z) - Scalable Lipschitz Residual Networks with Convex Potential Flows [120.27516256281359]
We show that using convex potentials in a residual network gradient flow provides a built-in $1$-Lipschitz transformation.
A comprehensive set of experiments on CIFAR-10 demonstrates the scalability of our architecture and the benefit of our approach for $\ell_2$ provable defenses.
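A hedged sketch of such a residual block, assuming the convex-potential form $x-\frac{2}{\|W\|_{2}^{2}}W^{\top}\mathrm{ReLU}(Wx+b)$, which is 1-Lipschitz because it is a gradient step of size $2/\|W\|_{2}^{2}$ on a convex, smooth potential; the spectral norm is computed exactly here for clarity, whereas in practice it is typically estimated with power iteration.
```python
# Sketch of a convex potential residual block: x - (2 / ||W||_2^2) * W^T ReLU(W x + b).
import torch
import torch.nn.functional as F

def convex_potential_layer(x, W, b):
    sn = torch.linalg.matrix_norm(W, ord=2)              # spectral norm ||W||_2
    return x - (2.0 / sn**2) * F.relu(x @ W.t() + b) @ W  # 1-Lipschitz residual update

x = torch.randn(8, 32)
W = torch.randn(32, 32)
b = torch.zeros(32)
print(convex_potential_layer(x, W, b).shape)
```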
arXiv Detail & Related papers (2021-10-25T07:12:53Z) - Householder Activations for Provable Robustness against Adversarial Attacks [37.289891549908596]
Training convolutional neural networks (CNNs) with a strict Lipschitz constraint under the $l_2$ norm is useful for provable adversarial robustness, interpretable gradients and stable training.
We introduce a class of nonlinear GNP activations with learnable Householder transformations called Householder activations.
Our experiments on CIFAR-10 and CIFAR-100 show that our regularized networks with $\mathrm{HH}$ activations lead to significant improvements in both the standard and provable robust accuracy.
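A minimal sketch of a Householder (HH) activation on channel pairs, assuming the form $\sigma_{v}(z)=z$ if $v^{\top}z>0$ and $(I-2vv^{\top})z$ otherwise; with $v=(1,-1)/\sqrt{2}$ it reduces to the MaxMin activation. The channel pairing and shapes below are illustrative, not the paper's implementation.
```python
# Householder activation on channel pairs: keep z when v^T z > 0, otherwise
# reflect it with the Householder matrix I - 2 v v^T (gradient norm preserving).
import torch

def householder_activation(z: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """z: (batch, 2) channel pairs, v: learnable direction of shape (2,)."""
    v = v / v.norm()                               # keep the reflection orthogonal
    proj = z @ v                                   # v^T z for each row
    reflected = z - 2.0 * proj.unsqueeze(1) * v    # (I - 2 v v^T) z
    return torch.where(proj.unsqueeze(1) > 0, z, reflected)

z = torch.randn(5, 2)
v = torch.tensor([1.0, -1.0])                      # this choice recovers MaxMin
print(torch.allclose(householder_activation(z, v),
                     torch.stack([z.max(dim=1).values, z.min(dim=1).values], dim=1)))
```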
arXiv Detail & Related papers (2021-08-05T12:02:16Z) - Skew Orthogonal Convolutions [44.053067014796596]
Training convolutional neural networks with a Lipschitz constraint under the $l_2$ norm is useful for provable adversarial robustness, interpretable gradients, stable training, etc.
SOC allows us to train large, provably Lipschitz convolutional neural networks significantly faster than prior works.
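A hedged, dense-matrix sketch of the idea behind SOC: if $A$ is skew-symmetric ($A^{\top}=-A$), then $\exp(A)$ is orthogonal, and $\exp(A)x$ can be approximated by a truncated Taylor series. SOC applies this with $A$ being a skew-symmetric convolution operator; the dense parameterization and the number of series terms below are illustrative only.
```python
# Dense sketch of the exponential of a skew-symmetric matrix applied to a batch:
# y = (I + A + A^2/2! + ... ) x, with A = P - P^T, so the map is (nearly) orthogonal.
import torch

def skew_orthogonal_apply(x, P, terms=6):
    A = P - P.t()                  # skew-symmetric parameterization
    out, term = x.clone(), x.clone()
    for k in range(1, terms + 1):
        term = term @ A.t() / k    # accumulates A^k x / k! (row-vector convention)
        out = out + term
    return out

x = torch.randn(4, 16)
P = torch.randn(16, 16) * 0.1
y = skew_orthogonal_apply(x, P)
print(y.norm(dim=1) / x.norm(dim=1))   # close to 1: exp(A) preserves the l2 norm
```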
arXiv Detail & Related papers (2021-05-24T17:11:44Z) - Beyond Lazy Training for Over-parameterized Tensor Decomposition [69.4699995828506]
We show that gradient descent on an over-parametrized objective can go beyond the lazy training regime and utilize certain low-rank structure in the data.
arXiv Detail & Related papers (2020-10-22T00:32:12Z) - Large Norms of CNN Layers Do Not Hurt Adversarial Robustness [11.930096161524407]
Lipschitz properties of convolutional neural networks (CNNs) are widely considered to be related to adversarial robustness.
We propose a novel regularization method termed norm decay, which can effectively reduce the norms of convolutional layers and fully-connected layers.
Experiments show that norm-regularization methods, including norm decay, weight decay, and singular value clipping, can improve generalization of CNNs.
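As a generic illustration of penalizing layer norms (not necessarily the exact norm decay update from the paper), one can estimate a weight matrix's largest singular value by power iteration and add it to the training loss, so the optimizer shrinks the layer's Lipschitz constant.
```python
# Generic spectral-norm penalty: estimate sigma_max(W) by power iteration and
# add it (scaled) to the loss so training reduces the layer's l2 operator norm.
import torch
import torch.nn.functional as F

def spectral_norm_estimate(W: torch.Tensor, iters: int = 20) -> torch.Tensor:
    u = torch.randn(W.shape[0], device=W.device)
    for _ in range(iters):
        v = F.normalize(W.t() @ u, dim=0)
        u = F.normalize(W @ v, dim=0)
    return u @ W @ v                                # approximate largest singular value

W = torch.randn(128, 64, requires_grad=True)
task_loss = torch.tensor(0.0)                       # placeholder for the actual task loss
reg = 1e-3 * spectral_norm_estimate(W)              # penalty weight is illustrative
(task_loss + reg).backward()
```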
arXiv Detail & Related papers (2020-09-17T17:33:50Z) - On Lipschitz Regularization of Convolutional Layers using Toeplitz Matrix Theory [77.18089185140767]
Lipschitz regularity is established as a key property of modern deep learning.
However, computing the exact value of the Lipschitz constant of a neural network is known to be NP-hard.
We introduce a new upper bound for convolutional layers that is both tight and easy to compute.
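A hedged illustration of the classical circulant/Toeplitz fact that such bounds build on (the paper's bound additionally handles multiple channels and zero padding): for a single-channel convolution with circular padding, the layer's singular values are the magnitudes of the 2-D DFT of the zero-padded kernel, so its exact $l_{2}$ operator norm is the maximum of that spectrum.
```python
# Exact l2 operator norm of a single-channel circular convolution via the 2-D FFT
# of the kernel zero-padded to the input size.
import torch

def circular_conv_lipschitz(kernel: torch.Tensor, h: int, w: int) -> torch.Tensor:
    """kernel: (kh, kw) single-channel filter; h, w: input spatial size."""
    padded = torch.zeros(h, w, dtype=kernel.dtype)
    padded[: kernel.shape[0], : kernel.shape[1]] = kernel
    return torch.fft.fft2(padded).abs().max()       # max magnitude of the spectrum

kernel = torch.randn(3, 3)
print(circular_conv_lipschitz(kernel, 32, 32))
```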
arXiv Detail & Related papers (2020-06-15T13:23:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.