Upper and lower bounds for the Lipschitz constant of random neural
networks
- URL: http://arxiv.org/abs/2311.01356v3
- Date: Thu, 18 Jan 2024 14:39:26 GMT
- Title: Upper and lower bounds for the Lipschitz constant of random neural
networks
- Authors: Paul Geuchen, Thomas Heindl, Dominik Stöger, Felix Voigtlaender
- Abstract summary: We study upper and lower bounds for the Lipschitz constant of random ReLU neural networks.
For shallow neural networks, we characterize the Lipschitz constant up to an absolute numerical constant.
For deep networks with fixed depth and sufficiently large width, our established upper bound is larger than the lower bound by a factor that is logarithmic in the width.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Empirical studies have widely demonstrated that neural networks are highly
sensitive to small, adversarial perturbations of the input. The worst-case
robustness against these so-called adversarial examples can be quantified by
the Lipschitz constant of the neural network. In this paper, we study upper and
lower bounds for the Lipschitz constant of random ReLU neural networks.
Specifically, we assume that the weights and biases follow a generalization of
the He initialization, where general symmetric distributions for the biases are
permitted. For shallow neural networks, we characterize the Lipschitz constant
up to an absolute numerical constant. For deep networks with fixed depth and
sufficiently large width, our established upper bound is larger than the lower
bound by a factor that is logarithmic in the width.
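For illustration, here is a minimal numerical sketch (not taken from the paper): it draws a shallow ReLU network with He-type Gaussian weights and a symmetric bias distribution, and lower-bounds its Lipschitz constant by maximizing the spectral norm of the Jacobian over random inputs. The variable names and the sampling heuristic are our own; the paper's bounds are derived analytically, not estimated this way.

```python
import numpy as np

# Minimal sketch (not from the paper): empirically lower-bound the Lipschitz
# constant of a random shallow ReLU network x -> W2 @ relu(W1 @ x + b1).
# Weights follow a He-type initialization, N(0, 2 / fan_in); the biases are
# drawn from a symmetric distribution (here Gaussian), as the paper permits.

rng = np.random.default_rng(0)
d, n = 20, 500                     # input dimension, hidden width

W1 = rng.normal(0.0, np.sqrt(2.0 / d), size=(n, d))
b1 = rng.normal(0.0, 1.0, size=n)  # any symmetric bias distribution is allowed
W2 = rng.normal(0.0, np.sqrt(2.0 / n), size=(1, n))

def jacobian_norm(x):
    """Spectral norm of the Jacobian at x (the network is piecewise linear)."""
    active = (W1 @ x + b1 > 0).astype(float)   # ReLU activation pattern at x
    J = W2 @ (active[:, None] * W1)            # W2 diag(active) W1
    return np.linalg.norm(J, 2)

# The Lipschitz constant is the supremum of the Jacobian norm over all inputs;
# maximizing over finitely many random samples therefore only gives a lower bound.
samples = rng.normal(size=(1000, d))
print("empirical lower bound:", max(jacobian_norm(x) for x in samples))
```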
Related papers
- Norm-based Generalization Bounds for Compositionally Sparse Neural
Networks [11.987589603961622]
We prove generalization bounds for multilayered sparse ReLU neural networks, including convolutional neural networks.
Taken together, these results suggest that compositional sparsity of the underlying target function is critical to the success of deep neural networks.
arXiv Detail & Related papers (2023-01-28T00:06:22Z) - Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z) - On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural tangent kernel (NTK).
In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Redundant representations help generalization in wide neural networks [71.38860635025907]
We study the last hidden layer representations of various state-of-the-art convolutional neural networks.
We find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information, and differ from each other only by statistically independent noise.
arXiv Detail & Related papers (2021-06-07T10:18:54Z) - The Many Faces of 1-Lipschitz Neural Networks [1.911678487931003]
We show that 1-Lipschitz neural networks can fit arbitrarily difficult decision frontiers, making them as expressive as classical ones.
We also study the link between classification with 1-Lipschitz networks and optimal transport, using regularized versions of the Kantorovich-Rubinstein duality.
arXiv Detail & Related papers (2021-04-11T20:31:32Z) - CLIP: Cheap Lipschitz Training of Neural Networks [0.0]
We investigate a variational regularization method named CLIP for controlling the Lipschitz constant of a neural network.
We mathematically analyze the proposed model, in particular discussing the impact of the chosen regularization parameter on the output of the network.
arXiv Detail & Related papers (2021-03-23T13:29:24Z) - On Lipschitz Regularization of Convolutional Layers using Toeplitz
Matrix Theory [77.18089185140767]
Lipschitz regularity is established as a key property of modern deep learning, but computing the exact value of the Lipschitz constant of a neural network is known to be NP-hard.
We introduce a new upper bound for convolutional layers that is both tight and easy to compute.
arXiv Detail & Related papers (2020-06-15T13:23:34Z) - Lipschitz constant estimation of Neural Networks via sparse polynomial
optimization [47.596834444042685]
LiPopt is a framework for computing increasingly tighter upper bounds on the Lipschitz constant of neural networks.
We show how to use the sparse connectivity of a network to significantly reduce the complexity.
We conduct experiments on networks with random weights as well as networks trained on MNIST.
arXiv Detail & Related papers (2020-04-18T18:55:02Z)
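The last two entries concern upper bounds on the Lipschitz constant; the classical baseline they tighten is the layerwise product of spectral norms. Below is a minimal sketch of that baseline for a random shallow network (the shapes and names are illustrative and not taken from any of the listed papers).

```python
import numpy as np

# Quick illustration (not the LiPopt or Toeplitz method itself): the classical
# baseline upper bound on the Lipschitz constant of a ReLU network is the
# product of the layers' spectral norms, since ReLU itself is 1-Lipschitz.
# The frameworks above aim to compute tighter bounds than this product.

rng = np.random.default_rng(0)
d, n = 20, 500
W1 = rng.normal(0.0, np.sqrt(2.0 / d), size=(n, d))
W2 = rng.normal(0.0, np.sqrt(2.0 / n), size=(1, n))

naive_upper_bound = np.linalg.norm(W1, 2) * np.linalg.norm(W2, 2)
print("product-of-norms upper bound:", naive_upper_bound)
```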
This list is automatically generated from the titles and abstracts of the papers in this site.