Skew Orthogonal Convolutions
- URL: http://arxiv.org/abs/2105.11417v1
- Date: Mon, 24 May 2021 17:11:44 GMT
- Title: Skew Orthogonal Convolutions
- Authors: Sahil Singla and Soheil Feizi
- Abstract summary: Training convolutional neural networks with a Lipschitz constraint under the $l_2$ norm is useful for provable adversarial robustness, interpretable gradients, stable training, etc.
The proposed method, SOC, allows us to train provably Lipschitz, large convolutional neural networks significantly faster than prior works.
- Score: 44.053067014796596
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training convolutional neural networks with a Lipschitz constraint under the
$l_{2}$ norm is useful for provable adversarial robustness, interpretable
gradients, stable training, etc. While 1-Lipschitz networks can be designed by
imposing a 1-Lipschitz constraint on each layer, training such networks
requires each layer to be gradient norm preserving (GNP) to prevent gradients
from vanishing. However, existing GNP convolutions suffer from slow training,
lead to significant reduction in accuracy and provide no guarantees on their
approximations. In this work, we propose a GNP convolution layer called
Skew Orthogonal Convolution (SOC) that uses the following mathematical property:
when a matrix is Skew-Symmetric, its exponential function is an orthogonal
matrix. To use this property, we first construct a convolution
filter whose Jacobian is Skew-Symmetric. Then, we use the Taylor series
expansion of the Jacobian exponential to construct the SOC layer that is
orthogonal. To efficiently implement SOC, we keep a finite number of
terms from the Taylor series and provide a provable guarantee on the
approximation error. Our experiments on CIFAR-10 and CIFAR-100 show that
SOC allows us to train provably Lipschitz, large convolutional neural
networks significantly faster than prior works while achieving significant
improvements for both standard and certified robust accuracies.
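As an illustration of the property the paper builds on (a minimal numpy sketch, not the authors' implementation): the exponential of a skew-symmetric matrix is orthogonal, and it can be applied through a truncated Taylor series without ever forming the exponential, which is the analogue of what the SOC layer does with convolutions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

M = rng.standard_normal((n, n))
A = M - M.T                          # skew-symmetric: A.T == -A
A = A / np.linalg.norm(A, 2)         # rescale so the truncated series converges quickly

def apply_exp(A, x, terms=12):
    # Compute exp(A) @ x with the truncated Taylor series
    # x + A x + A^2 x / 2! + ..., never forming exp(A) explicitly.
    out = x.copy()
    term = x.copy()
    for k in range(1, terms):
        term = A @ term / k          # A^k x / k!
        out = out + term
    return out

x = rng.standard_normal(n)
y = apply_exp(A, x)
# exp(A) is orthogonal, so the l2 norm is preserved (gradient norm preserving):
print(np.linalg.norm(x), np.linalg.norm(y))
```

The two printed norms agree up to the truncation error of the series, which is the quantity the paper's approximation guarantee controls.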
Related papers
- LipKernel: Lipschitz-Bounded Convolutional Neural Networks via Dissipative Layers [0.0468732641979009]
We propose a layer-wise parameterization for convolutional neural networks (CNNs) that includes built-in robustness guarantees.
Our method, LipKernel, directly parameterizes dissipative convolution kernels using a 2-D Roesser-type state space model.
We show that the run-time using our method is orders of magnitude faster than state-of-the-art Lipschitz-bounded networks.
arXiv Detail & Related papers (2024-10-29T17:20:14Z)
- Spectral Norm of Convolutional Layers with Circular and Zero Paddings [55.233197272316275]
We generalize the use of the Gram iteration to zero-padded convolutional layers and prove its quadratic convergence.
We also provide theorems bridging the gap between the spectral norms of circularly padded and zero-padded convolutions.
arXiv Detail & Related papers (2024-01-31T23:48:48Z)
- Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration [122.51142131506639]
We introduce a precise, fast, and differentiable upper bound for the spectral norm of convolutional layers using circulant matrix theory.
We show through a comprehensive set of experiments that our approach outperforms other state-of-the-art methods in terms of precision, computational cost, and scalability.
It proves highly effective for the Lipschitz regularization of convolutional neural networks, with competitive results against concurrent approaches.
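As a rough illustration of the Gram iteration idea referenced in the two entries above (a hedged numpy sketch for a dense matrix; the papers apply it to convolutional layers): squaring the Gram matrix raises the singular values to the power 2^k, so a rescaled Frobenius norm yields an upper bound that converges quadratically to the spectral norm.

```python
import numpy as np

def gram_spectral_norm_bound(W, n_iters=6):
    # Gram iteration: each step squares the singular values, so a rescaled
    # Frobenius norm converges quadratically to the spectral norm from above.
    G = np.asarray(W, dtype=np.float64)
    log_scale = 0.0
    for _ in range(n_iters):
        r = np.linalg.norm(G)                  # Frobenius norm, used to avoid overflow
        G = (G / r).T @ (G / r)                # Gram step on the rescaled matrix
        log_scale = 2.0 * (log_scale + np.log(r))
    # ||W||_2 <= ||G_k||_F ** (1 / 2**k); undo the rescaling in log space
    return np.exp((np.log(np.linalg.norm(G)) + log_scale) / 2 ** n_iters)

W = np.random.default_rng(1).standard_normal((64, 32))
print(gram_spectral_norm_bound(W), np.linalg.norm(W, 2))   # upper bound vs. exact value
```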
arXiv Detail & Related papers (2023-05-25T15:32:21Z)
- Improved techniques for deterministic l2 robustness [63.34032156196848]
Training convolutional neural networks (CNNs) with a strict 1-Lipschitz constraint under the $l_2$ norm is useful for adversarial robustness, interpretable gradients and stable training.
We introduce a procedure to certify the robustness of 1-Lipschitz CNNs by replacing the last linear layer with a 1-hidden-layer MLP.
We significantly advance the state-of-the-art for standard and provable robust accuracies on CIFAR-10 and CIFAR-100.
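For context (my addition, not the paper's specific last-layer construction): certification for 1-Lipschitz networks typically relies on the standard margin-based certificate, since an l2 perturbation of norm epsilon can change any logit difference by at most sqrt(2) * epsilon.

```python
import numpy as np

def certified_radius(logits):
    # Margin certificate for a 1-Lipschitz (l2) classifier: a logit difference
    # changes by at most sqrt(2) * ||delta||_2, so the prediction cannot flip
    # for perturbations with norm below margin / sqrt(2).
    top2 = np.sort(logits)[-2:]
    return (top2[1] - top2[0]) / np.sqrt(2)

print(certified_radius(np.array([3.1, 0.4, 1.8, -0.2])))   # (3.1 - 1.8) / sqrt(2) ~ 0.92
```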
arXiv Detail & Related papers (2022-11-15T19:10:12Z)
- A Stochastic Proximal Method for Nonsmooth Regularized Finite Sum Optimization [7.014966911550542]
We consider the problem of training a deep neural network with nonsmooth regularization to retrieve a sparse sub-structure.
We derive a new solver, called SR2, whose convergence and worst-case complexity are established without knowledge or approximation of the gradient's Lipschitz constant.
Experiments on network instances trained on CIFAR-10 and CIFAR-100 show that SR2 consistently achieves higher sparsity and accuracy than related methods such as ProxGEN and ProxSGD.
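A hedged, generic sketch of the kind of update such solvers build on (not the SR2 algorithm itself): a stochastic proximal-gradient step with an l1 regularizer, where sparsity comes from the soft-thresholding prox.

```python
import numpy as np

def prox_l1(w, t):
    # Proximal operator of t * ||w||_1 (soft-thresholding).
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def proximal_sgd_step(w, grad, lr, lam):
    # One generic stochastic proximal-gradient step on loss(w) + lam * ||w||_1:
    # a gradient step on the smooth loss, then the prox of the nonsmooth term.
    return prox_l1(w - lr * grad, lr * lam)
```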
arXiv Detail & Related papers (2022-06-14T00:28:44Z)
- Householder Activations for Provable Robustness against Adversarial Attacks [37.289891549908596]
Training convolutional neural networks (CNNs) with a strict Lipschitz constraint under the l_2 norm is useful for provable adversarial robustness, interpretable gradients and stable training.
We introduce a class of nonlinear GNP activations with learnable Householder transformations called Householder activations.
Our experiments on CIFAR-10 and CIFAR-100 show that our regularized networks with $\mathrm{HH}$ activations lead to significant improvements in both the standard and provable robust accuracy.
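A minimal sketch of a Householder activation on a pair of channels, as I read the summary above (the learnable part is the unit vector v; this is not the authors' code):

```python
import numpy as np

def householder_activation(z, v):
    # Keep z when it lies on the positive side of the hyperplane defined by the
    # unit vector v; otherwise reflect it with the Householder map I - 2 v v^T.
    # Both branches are orthogonal, so the activation is gradient norm preserving.
    v = v / np.linalg.norm(v)
    return z if float(v @ z) > 0 else z - 2.0 * float(v @ z) * v

z = np.array([0.3, 1.2])
v = np.array([1.0, -1.0]) / np.sqrt(2)
print(householder_activation(z, v))   # [1.2, 0.3]; with this v it reduces to MaxMin
```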
arXiv Detail & Related papers (2021-08-05T12:02:16Z)
- Orthogonalizing Convolutional Layers with the Cayley Transform [83.73855414030646]
We propose and evaluate an alternative approach to parameterize convolutional layers that are constrained to be orthogonal.
We show that our method indeed preserves orthogonality to a high degree even for large convolutions.
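The matrix fact behind that parameterization, shown on a small dense example (the paper applies the transform to full convolutional layers, which is not reproduced here): the Cayley transform of a skew-symmetric matrix is orthogonal.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6))
A = M - M.T                               # skew-symmetric
I = np.eye(6)
Q = (I - A) @ np.linalg.inv(I + A)        # Cayley transform of A
print(np.allclose(Q @ Q.T, I))            # True: Q is orthogonal
```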
arXiv Detail & Related papers (2021-04-14T23:54:55Z)
- On Lipschitz Regularization of Convolutional Layers using Toeplitz Matrix Theory [77.18089185140767]
Lipschitz regularity is established as a key property of modern deep learning.
However, computing the exact value of the Lipschitz constant of a neural network is known to be NP-hard.
We introduce a new upper bound for convolutional layers that is both tight and easy to compute.
arXiv Detail & Related papers (2020-06-15T13:23:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.