Skew Orthogonal Convolutions
- URL: http://arxiv.org/abs/2105.11417v1
- Date: Mon, 24 May 2021 17:11:44 GMT
- Title: Skew Orthogonal Convolutions
- Authors: Sahil Singla and Soheil Feizi
- Abstract summary: Training convolutional neural networks with a Lipschitz constraint under the $l_2$ norm is useful for provable adversarial robustness, interpretable gradients, stable training, etc.
The proposed method, SOC, allows us to train provably Lipschitz, large convolutional neural networks significantly faster than prior works.
- Score: 44.053067014796596
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training convolutional neural networks with a Lipschitz constraint under the
$l_{2}$ norm is useful for provable adversarial robustness, interpretable
gradients, stable training, etc. While 1-Lipschitz networks can be designed by
imposing a 1-Lipschitz constraint on each layer, training such networks
requires each layer to be gradient norm preserving (GNP) to prevent gradients
from vanishing. However, existing GNP convolutions suffer from slow training,
lead to significant reduction in accuracy and provide no guarantees on their
approximations. In this work, we propose a GNP convolution layer called
Skew Orthogonal Convolution (SOC) that uses the following mathematical property:
when a matrix is Skew-Symmetric, its exponential function is an orthogonal
matrix. To use this property, we first construct a convolution
filter whose Jacobian is Skew-Symmetric. Then, we use the Taylor series
expansion of the Jacobian exponential to construct the SOC layer that is
orthogonal. To efficiently implement SOC, we keep a finite number of
terms from the Taylor series and provide a provable guarantee on the
approximation error. Our experiments on CIFAR-10 and CIFAR-100 show that
SOC allows us to train provably Lipschitz, large convolutional neural
networks significantly faster than prior works while achieving significant
improvements for both standard and certified robust accuracies.
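As an illustration of the property the paper builds on (a minimal numpy sketch, not the authors' implementation): the exponential of a skew-symmetric matrix is orthogonal, and it can be applied through a truncated Taylor series without ever forming the exponential, which is the analogue of what the SOC layer does with convolutions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

M = rng.standard_normal((n, n))
A = M - M.T                          # skew-symmetric: A.T == -A
A = A / np.linalg.norm(A, 2)         # rescale so the truncated series converges quickly

def apply_exp(A, x, terms=12):
    # Compute exp(A) @ x with the truncated Taylor series
    # x + A x + A^2 x / 2! + ..., never forming exp(A) explicitly.
    out = x.copy()
    term = x.copy()
    for k in range(1, terms):
        term = A @ term / k          # A^k x / k!
        out = out + term
    return out

x = rng.standard_normal(n)
y = apply_exp(A, x)
# exp(A) is orthogonal, so the l2 norm is preserved (gradient norm preserving):
print(np.linalg.norm(x), np.linalg.norm(y))
```

The two printed norms agree up to the truncation error of the series, which is the quantity the paper's approximation guarantee controls.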
Related papers
- LipKernel: Lipschitz-Bounded Convolutional Neural Networks via Dissipative Layers [0.0468732641979009]
We propose a layer-wise parameterization for convolutional neural networks (CNNs) that includes built-in robustness guarantees.
Our method, LipKernel, directly parameterizes dissipative convolution kernels using a 2-D Roesser-type state space model.
We show that the run-time using our method is orders of magnitude faster than state-of-the-art Lipschitz-bounded networks.
arXiv Detail & Related papers (2024-10-29T17:20:14Z)
- Spectral Norm of Convolutional Layers with Circular and Zero Paddings [55.233197272316275]
We generalize the use of the Gram iteration to zero-padded convolutional layers and prove its quadratic convergence.
We also provide theorems bridging the gap between the spectral norms of circularly padded and zero-padded convolutions.
arXiv Detail & Related papers (2024-01-31T23:48:48Z)
- Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration [122.51142131506639]
We introduce a precise, fast, and differentiable upper bound for the spectral norm of convolutional layers using circulant matrix theory.
We show through a comprehensive set of experiments that our approach outperforms other state-of-the-art methods in terms of precision, computational cost, and scalability.
It proves highly effective for the Lipschitz regularization of convolutional neural networks, with competitive results against concurrent approaches.
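As a rough illustration of the Gram iteration idea referenced in the two entries above (a hedged numpy sketch for a dense matrix; the papers apply it to convolutional layers): squaring the Gram matrix raises the singular values to the power 2^k, so a rescaled Frobenius norm yields an upper bound that converges quadratically to the spectral norm.

```python
import numpy as np

def gram_spectral_norm_bound(W, n_iters=6):
    # Gram iteration: each step squares the singular values, so a rescaled
    # Frobenius norm converges quadratically to the spectral norm from above.
    G = np.asarray(W, dtype=np.float64)
    log_scale = 0.0
    for _ in range(n_iters):
        r = np.linalg.norm(G)                  # Frobenius norm, used to avoid overflow
        G = (G / r).T @ (G / r)                # Gram step on the rescaled matrix
        log_scale = 2.0 * (log_scale + np.log(r))
    # ||W||_2 <= ||G_k||_F ** (1 / 2**k); undo the rescaling in log space
    return np.exp((np.log(np.linalg.norm(G)) + log_scale) / 2 ** n_iters)

W = np.random.default_rng(1).standard_normal((64, 32))
print(gram_spectral_norm_bound(W), np.linalg.norm(W, 2))   # upper bound vs. exact value
```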
arXiv Detail & Related papers (2023-05-25T15:32:21Z)
- Improved techniques for deterministic l2 robustness [63.34032156196848]
Training convolutional neural networks (CNNs) with a strict 1-Lipschitz constraint under the $l_2$ norm is useful for adversarial robustness, interpretable gradients and stable training.
We introduce a procedure to certify the robustness of 1-Lipschitz CNNs by replacing the last linear layer with a 1-hidden-layer MLP.
We significantly advance the state-of-the-art for standard and provable robust accuracies on CIFAR-10 and CIFAR-100.
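For context (my addition, not the paper's specific last-layer construction): certification for 1-Lipschitz networks typically relies on the standard margin-based certificate, since an l2 perturbation of norm epsilon can change any logit difference by at most sqrt(2) * epsilon.

```python
import numpy as np

def certified_radius(logits):
    # Margin certificate for a 1-Lipschitz (l2) classifier: a logit difference
    # changes by at most sqrt(2) * ||delta||_2, so the prediction cannot flip
    # for perturbations with norm below margin / sqrt(2).
    top2 = np.sort(logits)[-2:]
    return (top2[1] - top2[0]) / np.sqrt(2)

print(certified_radius(np.array([3.1, 0.4, 1.8, -0.2])))   # (3.1 - 1.8) / sqrt(2) ~ 0.92
```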
arXiv Detail & Related papers (2022-11-15T19:10:12Z)
- A Stochastic Proximal Method for Nonsmooth Regularized Finite Sum Optimization [7.014966911550542]
We consider the problem of training a deep neural network with nonsmooth regularization to retrieve a sparse sub-structure.
We derive a new solver, called SR2, whose convergence and worst-case complexity are established without knowledge or approximation of the gradient's Lipschitz constant.
Experiments on network instances trained on CIFAR-10 and CIFAR-100 show that SR2 consistently achieves higher sparsity and accuracy than related methods such as ProxGEN and ProxSGD.
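A hedged, generic sketch of the kind of update such solvers build on (not the SR2 algorithm itself): a stochastic proximal-gradient step with an l1 regularizer, where sparsity comes from the soft-thresholding prox.

```python
import numpy as np

def prox_l1(w, t):
    # Proximal operator of t * ||w||_1 (soft-thresholding).
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def proximal_sgd_step(w, grad, lr, lam):
    # One generic stochastic proximal-gradient step on loss(w) + lam * ||w||_1:
    # a gradient step on the smooth loss, then the prox of the nonsmooth term.
    return prox_l1(w - lr * grad, lr * lam)
```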
arXiv Detail & Related papers (2022-06-14T00:28:44Z)
- Householder Activations for Provable Robustness against Adversarial Attacks [37.289891549908596]
Training convolutional neural networks (CNNs) with a strict Lipschitz constraint under the l_2 norm is useful for provable adversarial robustness, interpretable gradients and stable training.
We introduce a class of nonlinear GNP activations with learnable Householder transformations called Householder activations.
Our experiments on CIFAR-10 and CIFAR-100 show that our regularized networks with $\mathrm{HH}$ activations lead to significant improvements in both the standard and provable robust accuracy.
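A minimal sketch of a Householder activation on a pair of channels, as I read the summary above (the learnable part is the unit vector v; this is not the authors' code):

```python
import numpy as np

def householder_activation(z, v):
    # Keep z when it lies on the positive side of the hyperplane defined by the
    # unit vector v; otherwise reflect it with the Householder map I - 2 v v^T.
    # Both branches are orthogonal, so the activation is gradient norm preserving.
    v = v / np.linalg.norm(v)
    return z if float(v @ z) > 0 else z - 2.0 * float(v @ z) * v

z = np.array([0.3, 1.2])
v = np.array([1.0, -1.0]) / np.sqrt(2)
print(householder_activation(z, v))   # [1.2, 0.3]; with this v it reduces to MaxMin
```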
arXiv Detail & Related papers (2021-08-05T12:02:16Z)
- Orthogonalizing Convolutional Layers with the Cayley Transform [83.73855414030646]
We propose and evaluate an alternative approach to parameterize convolutional layers that are constrained to be orthogonal.
We show that our method indeed preserves orthogonality to a high degree even for large convolutions.
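The matrix fact behind that parameterization, shown on a small dense example (the paper applies the transform to full convolutional layers, which is not reproduced here): the Cayley transform of a skew-symmetric matrix is orthogonal.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6))
A = M - M.T                               # skew-symmetric
I = np.eye(6)
Q = (I - A) @ np.linalg.inv(I + A)        # Cayley transform of A
print(np.allclose(Q @ Q.T, I))            # True: Q is orthogonal
```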
arXiv Detail & Related papers (2021-04-14T23:54:55Z)
- On Lipschitz Regularization of Convolutional Layers using Toeplitz Matrix Theory [77.18089185140767]
Lipschitz regularity is established as a key property of modern deep learning.
However, computing the exact value of the Lipschitz constant of a neural network is known to be NP-hard.
We introduce a new upper bound for convolutional layers that is both tight and easy to compute.
arXiv Detail & Related papers (2020-06-15T13:23:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.