LOT: Layer-wise Orthogonal Training on Improving $\ell_2$ Certified
Robustness
- URL: http://arxiv.org/abs/2210.11620v2
- Date: Mon, 27 Mar 2023 01:19:33 GMT
- Title: LOT: Layer-wise Orthogonal Training on Improving $\ell_2$ Certified
Robustness
- Authors: Xiaojun Xu, Linyi Li, Bo Li
- Abstract summary: Recent studies show that training deep neural networks (DNNs) with Lipschitz constraints can enhance adversarial robustness and other model properties such as stability.
We propose a layer-wise orthogonal training method (LOT) to effectively train 1-Lipschitz convolution layers.
We show that LOT significantly outperforms baselines on deterministic $\ell_2$ certified robustness and scales to deeper neural networks.
- Score: 14.206377940235091
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies show that training deep neural networks (DNNs) with
Lipschitz constraints can enhance adversarial robustness and other model
properties such as stability. In this paper, we propose a layer-wise orthogonal
training method (LOT) to effectively train 1-Lipschitz convolution layers by
parametrizing an orthogonal matrix with an unconstrained matrix. We then
efficiently compute the inverse square root of a convolution kernel by
transforming the input domain to the Fourier frequency domain. In addition,
since existing works show that semi-supervised training helps improve empirical
robustness, we aim to bridge the gap and prove that semi-supervised learning
also improves the certified robustness of Lipschitz-bounded models. We conduct
comprehensive evaluations of LOT under different settings. We show that LOT
significantly outperforms baselines on deterministic $\ell_2$ certified
robustness and scales to deeper neural networks. Under the supervised scenario,
we improve the state-of-the-art certified robustness for all architectures
(e.g., from 59.04% to 63.50% on CIFAR-10 and from 32.57% to 34.59% on CIFAR-100
at radius $\rho = 36/255$ for 40-layer networks). With semi-supervised learning
over unlabelled data, we improve the state-of-the-art certified robustness on
CIFAR-10 at $\rho = 108/255$ from 36.04% to 42.39%. In addition, LOT
consistently outperforms baselines across model architectures with only 1/3 of
the evaluation time.
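The per-frequency orthogonalization described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are mine, and the eigendecomposition-based inverse square root stands in for whatever iteration the paper actually uses. The idea is that a circular convolution is block-diagonalized by the 2-D Fourier transform, so mapping each frequency's channel matrix $V$ to $V(V^HV)^{-1/2}$ makes the whole layer orthogonal, hence 1-Lipschitz.

```python
import numpy as np

def orthogonalize_conv(kernel, n):
    """Map an unconstrained conv kernel to an orthogonal (1-Lipschitz)
    circular convolution by orthogonalizing every Fourier frequency.

    kernel: real array of shape (c, c, k, k); n: spatial size of the input.
    Returns the frequency-domain kernel of shape (n, n, c, c).
    """
    # 2-D FFT of the kernel, zero-padded to the input size
    K = np.fft.fft2(kernel, s=(n, n)).transpose(2, 3, 0, 1)  # (n, n, c, c)
    out = np.empty_like(K)
    for i in range(n):
        for j in range(n):
            V = K[i, j]
            # V -> V (V^H V)^{-1/2} makes V unitary (assumes V has full rank)
            w, U = np.linalg.eigh(V.conj().T @ V)
            out[i, j] = V @ (U @ np.diag(1.0 / np.sqrt(w)) @ U.conj().T)
    return out

def circular_conv(x, K_freq):
    """Apply the frequency-domain kernel to an input x of shape (c, n, n)."""
    Y = np.einsum('ijab,bij->aij', K_freq, np.fft.fft2(x))
    return np.real(np.fft.ifft2(Y))
```

Because every frequency's channel matrix is unitary, the resulting convolution preserves the $\ell_2$ norm of its input, which is exactly the 1-Lipschitz property the method enforces by construction.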
Related papers
- A Coefficient Makes SVRG Effective [55.104068027239656]
Stochastic Variance Reduced Gradient (SVRG) is a theoretically compelling optimization method.
In this work, we demonstrate the potential of SVRG in optimizing real-world neural networks.
Our analysis finds that, for deeper networks, the strength of the variance reduction term in SVRG should be smaller and decrease as training progresses.
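The finding above amounts to scaling SVRG's variance reduction term by a coefficient. A minimal sketch of such a coefficient-scaled gradient estimate (variable names are mine; alpha = 1 recovers standard SVRG):

```python
import numpy as np

def svrg_gradient(grad_i_w, grad_i_snap, full_grad_snap, alpha=1.0):
    """Variance-reduced gradient estimate for one sample i.

    grad_i_w:       gradient of sample i at the current weights
    grad_i_snap:    gradient of sample i at the snapshot weights
    full_grad_snap: full-batch gradient at the snapshot weights
    alpha:          strength of the variance reduction term; the cited work
                    argues for alpha < 1, decayed over training, for deeper
                    networks.
    """
    return grad_i_w - alpha * (grad_i_snap - full_grad_snap)
```

With alpha = 0 the estimate falls back to plain SGD; the estimate stays unbiased for any fixed alpha because the correction term has zero mean over samples.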
arXiv Detail & Related papers (2023-11-09T18:47:44Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Improved techniques for deterministic l2 robustness [63.34032156196848]
Training convolutional neural networks (CNNs) with a strict 1-Lipschitz constraint under the $l_2$ norm is useful for adversarial robustness, interpretable gradients and stable training.
We introduce a procedure to certify robustness of 1-Lipschitz CNNs by replacing the last linear layer with a 1-hidden-layer MLP.
We significantly advance the state-of-the-art for standard and provable robust accuracies on CIFAR-10 and CIFAR-100.
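For context, certification in this line of work typically reduces to a logit-margin computation: a classifier that is 1-Lipschitz in $\ell_2$ provably keeps its prediction within radius $(\text{top} - \text{runner-up})/\sqrt{2}$. A minimal sketch of that certificate (not the paper's exact last-layer construction):

```python
import numpy as np

def l2_certified_radius(logits):
    """Margin-based certificate for a 1-Lipschitz classifier: no l2
    perturbation smaller than (top - runner_up) / sqrt(2) can flip
    the predicted class."""
    s = np.sort(logits)
    return (s[-1] - s[-2]) / np.sqrt(2.0)
```

Certified accuracy at radius $\rho$ is then just the fraction of correctly classified test points whose certified radius exceeds $\rho$.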
arXiv Detail & Related papers (2022-11-15T19:10:12Z)
- Two Heads are Better than One: Robust Learning Meets Multi-branch Models [14.72099568017039]
We propose Branch Orthogonality adveRsarial Training (BORT) to obtain state-of-the-art performance with solely the original dataset for adversarial training.
We evaluate our approach on CIFAR-10, CIFAR-100, and SVHN against $\ell_\infty$ norm-bounded perturbations of size $\epsilon = 8/255$.
arXiv Detail & Related papers (2022-08-17T05:42:59Z)
- Can pruning improve certified robustness of neural networks? [106.03070538582222]
We show that neural network pruning can improve the certified robustness of deep neural networks (NNs).
Our experiments show that by appropriately pruning an NN, its certified accuracy can be boosted up to 8.2% under standard training.
We additionally observe the existence of certified lottery tickets that can match both standard and certified robust accuracies of the original dense models.
arXiv Detail & Related papers (2022-06-15T05:48:51Z)
- Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds [99.23098204458336]
Certified robustness is a desirable property for deep neural networks in safety-critical applications.
We show that our method consistently outperforms state-of-the-art methods on the MNIST and TinyImageNet datasets.
arXiv Detail & Related papers (2021-11-02T06:44:10Z)
- Robust Learning via Persistency of Excitation [4.674053902991301]
We show that network training using gradient descent is equivalent to a dynamical system parameter estimation problem.
We provide an efficient technique for estimating the corresponding Lipschitz constant using extreme value theory.
Our approach also universally increases the adversarial accuracy by 0.1 to 0.3 percentage points in various state-of-the-art adversarially trained models.
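As a rough illustration of sampling-based Lipschitz estimation: the cited work fits an extreme-value distribution to sampled slopes, whereas the sketch below simply takes the maximum sampled slope as a crude lower bound. All names and parameters here are illustrative, not the paper's API.

```python
import numpy as np

def lipschitz_lower_bound(f, dim, n_pairs=1000, seed=0):
    """Crude sampling-based lower bound on a scalar function's Lipschitz
    constant: the maximum finite-difference slope over random nearby pairs.
    (Extreme value theory, as in the cited work, would instead fit a
    reverse Weibull to these maxima.)"""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_pairs):
        x1 = rng.standard_normal(dim)
        x2 = x1 + 1e-3 * rng.standard_normal(dim)  # small random perturbation
        slope = abs(f(x1) - f(x2)) / np.linalg.norm(x1 - x2)
        best = max(best, slope)
    return best
```

For a function with known Lipschitz constant L, the estimate approaches L from below as more pairs are sampled.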
arXiv Detail & Related papers (2021-06-03T18:49:05Z)
- Second-Order Provable Defenses against Adversarial Attacks [63.34032156196848]
We show that if the eigenvalues of the network's Hessian are bounded, we can compute a robustness certificate in the $l_2$ norm efficiently using convex optimization.
We achieve certified accuracies of 5.78%, 44.96%, and 43.19%.
arXiv Detail & Related papers (2020-06-01T05:55:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.