Spectral Norm of Convolutional Layers with Circular and Zero Paddings
- URL: http://arxiv.org/abs/2402.00240v1
- Date: Wed, 31 Jan 2024 23:48:48 GMT
- Title: Spectral Norm of Convolutional Layers with Circular and Zero Paddings
- Authors: Blaise Delattre and Quentin Barthélemy and Alexandre Allauzen
- Abstract summary: We generalize the use of the Gram iteration to zero-padding convolutional layers and prove its quadratic convergence.
We also provide theorems bridging the gap between the spectral norms of circular and zero-padding convolutions.
- Score: 55.233197272316275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper leverages the \emph{Gram iteration}, an efficient,
deterministic, and differentiable method for computing the spectral norm with an
upper-bound guarantee. While the Gram iteration was designed for circular
convolutional layers, we generalize it to zero-padding convolutional layers and
prove its quadratic convergence. We also provide theorems bridging the gap
between the spectral norms of circular and zero-padding convolutions. We design
a \emph{spectral rescaling} that can be used as a competitive $1$-Lipschitz
layer that enhances network robustness. As demonstrated through experiments, our method
outperforms state-of-the-art techniques in precision, computational cost, and
scalability. The code of experiments is available at
https://github.com/blaisedelattre/lip4conv.
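As a rough illustration of the Gram iteration idea, here is a minimal PyTorch sketch for a plain real-valued matrix rather than a convolutional layer; the function name, iteration count, and log-scale bookkeeping are illustrative choices, not the implementation from lip4conv.

```python
import torch

def gram_spectral_norm_bound(W: torch.Tensor, n_iter: int = 6) -> torch.Tensor:
    """Differentiable upper bound on the spectral norm of a real matrix W.

    After k Gram steps the iterate has singular values sigma_i ** (2 ** k), so
    ||G||_F ** (1 / 2 ** k) >= sigma_max(W), and the gap closes quadratically.
    Rescaling by the Frobenius norm at each step avoids overflow; the
    accumulated log-scale is folded back into the final bound.
    """
    G = W
    log_scale = torch.zeros((), dtype=W.dtype, device=W.device)
    for _ in range(n_iter):
        fro = torch.linalg.norm(G)            # Frobenius norm of the iterate
        log_scale = 2 * log_scale + 2 * torch.log(fro)
        G = G / fro                           # rescale to avoid overflow
        G = G.T @ G                           # Gram step: singular values get squared
    # Undo the accumulated rescaling and take the 2^k-th root of the Frobenius norm.
    return torch.exp((torch.log(torch.linalg.norm(G)) + log_scale) / 2 ** n_iter)
```

For small matrices the bound can be checked against torch.linalg.matrix_norm(W, ord=2); already after a handful of iterations the two values agree to many digits.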
Related papers
- Single Point-Based Distributed Zeroth-Order Optimization with a Non-Convex Stochastic Objective Function [14.986031916712108]
We introduce a zeroth-order distributed optimization method based on a one-point gradient estimate within the gradient tracking technique.
We prove that this new technique converges for a non-convex stochastic objective function in a noisy setting.
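For intuition, a single-point zeroth-order gradient estimator looks roughly like the sketch below (plain PyTorch, generic form; the cited paper's distributed gradient-tracking scheme and step sizes are not reproduced here).

```python
import torch

def one_point_gradient_estimate(f, x: torch.Tensor, delta: float = 1e-2) -> torch.Tensor:
    """Zeroth-order gradient estimate of f at x from a single function value.

    g = (d / delta) * f(x + delta * u) * u with u uniform on the unit sphere;
    in expectation this approximates the gradient of a smoothed version of f,
    so it can drive descent even when f is only a noisy black box.
    """
    d = x.numel()
    u = torch.randn_like(x)
    u = u / torch.linalg.norm(u)       # random direction on the unit sphere
    return (d / delta) * f(x + delta * u) * u
```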
arXiv Detail & Related papers (2024-10-08T11:45:45Z) - Spectrum Extraction and Clipping for Implicitly Linear Layers [20.277446818410997]
We show the effectiveness of automatic differentiation in efficiently and correctly computing and controlling the spectrum of implicitly linear operators.
We provide the first clipping method which is correct for general convolution layers.
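One generic way to realise this with automatic differentiation is power iteration on the implicitly linear operator, using a vector-Jacobian product for the adjoint; the sketch below (PyTorch torch.func, names mine) illustrates that principle and is not the cited paper's exact extraction or clipping routine.

```python
import torch
from torch.func import vjp

def spectral_norm_power_iteration(layer, x_shape, n_iter: int = 100) -> torch.Tensor:
    """Largest singular value of the linear part of an (implicitly) affine layer.

    The operator A is applied as A v = layer(v) - layer(0); its adjoint A^T is
    obtained from automatic differentiation via a vector-Jacobian product.
    Power iteration on A^T A then converges to sigma_max(A).
    """
    zero = torch.zeros(x_shape)
    bias = layer(zero)                           # constant part of the affine map
    apply_A = lambda v: layer(v) - bias          # linear part: A v
    _, vjp_fn = vjp(apply_A, zero)               # adjoint A^T (A is linear, so any point works)
    v = torch.randn(x_shape)
    v = v / torch.linalg.norm(v)
    for _ in range(n_iter):
        u = apply_A(v)                           # u = A v
        (v,) = vjp_fn(u)                         # v = A^T u
        v = v / torch.linalg.norm(v)
    return torch.linalg.norm(apply_A(v))         # sigma_max estimate
```

For example, layer = torch.nn.Conv2d(3, 16, 3, padding=1) with x_shape = (1, 3, 32, 32) estimates the spectral norm of that zero-padded convolution on 32x32 inputs without ever materialising the operator as a matrix.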
arXiv Detail & Related papers (2024-02-25T07:28:28Z) - Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram
Iteration [122.51142131506639]
We introduce a precise, fast, and differentiable upper bound for the spectral norm of convolutional layers using circulant matrix theory.
We show through a comprehensive set of experiments that our approach outperforms other state-of-the-art methods in terms of precision, computational cost, and scalability.
It proves highly effective for the Lipschitz regularization of convolutional neural networks, with competitive results against concurrent approaches.
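For circular padding specifically, circulant structure makes the spectral norm computable exactly with an FFT; the sketch below (PyTorch, my naming) illustrates that standard construction rather than the Gram-iteration bound of the cited paper.

```python
import torch

def circular_conv_spectral_norm(kernel: torch.Tensor, n: int) -> torch.Tensor:
    """Exact spectral norm of a stride-1, circularly padded 2D convolution.

    kernel has shape (c_out, c_in, k, k) and acts on n x n feature maps (k <= n).
    The 2D DFT block-diagonalises the doubly circulant operator: each spatial
    frequency yields a c_out x c_in transfer matrix, and the spectral norm is
    the maximum of their largest singular values. Where the kernel is placed in
    the n x n array only changes per-frequency phases, not singular values.
    """
    c_out, c_in, k, _ = kernel.shape
    padded = torch.zeros(c_out, c_in, n, n, dtype=kernel.dtype)
    padded[:, :, :k, :k] = kernel
    transfer = torch.fft.fft2(padded)             # (c_out, c_in, n, n), complex
    transfer = transfer.permute(2, 3, 0, 1)       # (n, n, c_out, c_in)
    sing_vals = torch.linalg.svdvals(transfer)    # per-frequency singular values
    return sing_vals.max()
```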
arXiv Detail & Related papers (2023-05-25T15:32:21Z) - Improved techniques for deterministic l2 robustness [63.34032156196848]
Training convolutional neural networks (CNNs) with a strict 1-Lipschitz constraint under the $l_2$ norm is useful for adversarial robustness, interpretable gradients and stable training.
We introduce a procedure to certify the robustness of 1-Lipschitz CNNs by replacing the last linear layer with a 1-hidden-layer MLP.
We significantly advance the state-of-the-art for standard and provable robust accuracies on CIFAR-10 and CIFAR-100.
arXiv Detail & Related papers (2022-11-15T19:10:12Z) - Lassoed Tree Boosting [53.56229983630983]
We prove that a gradient boosted tree algorithm with early stopping achieves faster than $n^{-1/4}$ $L^2$ convergence in the large nonparametric space of càdlàg functions of bounded sectional variation.
Our convergence proofs are based on a novel, general theorem on early stopping with empirical loss minimizers of nested Donsker classes.
arXiv Detail & Related papers (2022-05-22T00:34:41Z) - Learning Sparse Graph with Minimax Concave Penalty under Gaussian Markov
Random Fields [51.07460861448716]
This paper presents a convex-analytic framework to learn sparse graphs from data.
We show that a triangular convexity decomposition is guaranteed by a transform of the corresponding matrix restricted to its upper part.
arXiv Detail & Related papers (2021-09-17T17:46:12Z) - Skew Orthogonal Convolutions [44.053067014796596]
Training convolutional neural networks with a Lipschitz constraint under the $l_2$ norm is useful for provable adversarial robustness, interpretable gradients, stable training, etc.
The proposed skew orthogonal convolution allows us to train provably Lipschitz, large convolutional neural networks significantly faster than prior works.
arXiv Detail & Related papers (2021-05-24T17:11:44Z) - Orthogonalizing Convolutional Layers with the Cayley Transform [83.73855414030646]
We propose and evaluate an alternative approach to parameterize convolutional layers that are constrained to be orthogonal.
We show that our method indeed preserves orthogonality to a high degree even for large convolutions.
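The core of the Cayley parameterization can be sketched on a dense square matrix as below (PyTorch; the cited paper applies an analogous construction to convolutions in the Fourier domain, which is not reproduced here).

```python
import torch

def cayley_orthogonal(M: torch.Tensor) -> torch.Tensor:
    """Orthogonal matrix from an unconstrained square parameter M.

    A = M - M^T is skew-symmetric, and the Cayley transform
    W = (I + A)^{-1} (I - A) is orthogonal; I + A is always invertible because
    a skew-symmetric A has purely imaginary eigenvalues. Gradients flow through
    the solve, so W can serve as a weight parameterization during training.
    """
    n = M.shape[0]
    I = torch.eye(n, dtype=M.dtype, device=M.device)
    A = M - M.T                                   # skew-symmetric part
    return torch.linalg.solve(I + A, I - A)       # (I + A)^{-1} (I - A)
```

Any unconstrained parameter M then yields an exactly orthogonal W, so ordinary gradient updates on M never leave the orthogonal constraint set.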
arXiv Detail & Related papers (2021-04-14T23:54:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.