A provable control of sensitivity of neural networks through a direct parameterization of the overall bi-Lipschitzness
- URL: http://arxiv.org/abs/2404.09821v1
- Date: Mon, 15 Apr 2024 14:21:01 GMT
- Title: A provable control of sensitivity of neural networks through a direct parameterization of the overall bi-Lipschitzness
- Authors: Yuri Kinoshita, Taro Toyoizumi,
- Abstract summary: We propose a novel framework for bi-Lipschitzness based on convex neural networks and the Legendre-Fenchel duality.
Our framework can achieve such a clear and tight control based on convex neural networks and the Legendre-Fenchel duality.
- Score: 2.3020018305241337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While neural networks can enjoy an outstanding flexibility and exhibit unprecedented performance, the mechanism behind their behavior is still not well-understood. To tackle this fundamental challenge, researchers have tried to restrict and manipulate some of their properties in order to gain new insights and better control on them. Especially, throughout the past few years, the concept of \emph{bi-Lipschitzness} has been proved as a beneficial inductive bias in many areas. However, due to its complexity, the design and control of bi-Lipschitz architectures are falling behind, and a model that is precisely designed for bi-Lipschitzness realizing a direct and simple control of the constants along with solid theoretical analysis is lacking. In this work, we investigate and propose a novel framework for bi-Lipschitzness that can achieve such a clear and tight control based on convex neural networks and the Legendre-Fenchel duality. Its desirable properties are illustrated with concrete experiments. We also apply this framework to uncertainty estimation and monotone problem settings to illustrate its broad range of applications.
Related papers
- A Lipschitz spaces view of infinitely wide shallow neural networks [3.0017241250121387]
We revisit the mean field parametrization of shallow neural networks, using signed measures on parameter spaces and duality pairings.
We prove a compactness result in strong Kantorovich-Rubinstein norm, and in the absence of which we show several examples demonstrating undesirable behavior.
arXiv Detail & Related papers (2024-10-18T16:41:37Z) - A General Framework of the Consistency for Large Neural Networks [0.0]
We propose a general regularization framework to study the Mean Integrated Squared Error (MISE) of neural networks.
Based on our frameworks, we find the MISE curve has two possible shapes, namely the shape of double descents and monotone decreasing.
arXiv Detail & Related papers (2024-09-21T12:25:44Z) - The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning [71.14237199051276]
We consider classical distribution-agnostic framework and algorithms minimising empirical risks.
We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks is extremely challenging.
arXiv Detail & Related papers (2023-09-13T16:33:27Z) - A Unified Algebraic Perspective on Lipschitz Neural Networks [88.14073994459586]
This paper introduces a novel perspective unifying various types of 1-Lipschitz neural networks.
We show that many existing techniques can be derived and generalized via finding analytical solutions of a common semidefinite programming (SDP) condition.
Our approach, called SDP-based Lipschitz Layers (SLL), allows us to design non-trivial yet efficient generalization of convex potential layers.
arXiv Detail & Related papers (2023-03-06T14:31:09Z) - Some Fundamental Aspects about Lipschitz Continuity of Neural Networks [6.576051895863941]
Lipschitz continuity is a crucial functional property of any predictive model.
We examine and characterise the Lipschitz behaviour of Neural Networks.
We show a remarkable fidelity of the lower Lipschitz bound, identify a striking Double Descent trend in both upper and lower bounds to the Lipschitz and explain the intriguing effects of label noise on function smoothness and generalisation.
arXiv Detail & Related papers (2023-02-21T18:59:40Z) - On the Lipschitz Constant of Deep Networks and Double Descent [5.381801249240512]
Existing bounds on the generalization error of deep networks assume some form of smooth or bounded dependence on the input variable.
We present an extensive experimental study of the empirical Lipschitz constant of deep networks undergoing double descent.
arXiv Detail & Related papers (2023-01-28T23:22:49Z) - Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model.
A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations.
We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
arXiv Detail & Related papers (2022-06-08T04:09:13Z) - The Sample Complexity of One-Hidden-Layer Neural Networks [57.6421258363243]
We study a class of scalar-valued one-hidden-layer networks, and inputs bounded in Euclidean norm.
We prove that controlling the spectral norm of the hidden layer weight matrix is insufficient to get uniform convergence guarantees.
We analyze two important settings where a mere spectral norm control turns out to be sufficient.
arXiv Detail & Related papers (2022-02-13T07:12:02Z) - Scalable Lipschitz Residual Networks with Convex Potential Flows [120.27516256281359]
We show that using convex potentials in a residual network gradient flow provides a built-in $1$-Lipschitz transformation.
A comprehensive set of experiments on CIFAR-10 demonstrates the scalability of our architecture and the benefit of our approach for $ell$ provable defenses.
arXiv Detail & Related papers (2021-10-25T07:12:53Z) - Generalization of Neural Combinatorial Solvers Through the Lens of
Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z) - Lipschitz Bounded Equilibrium Networks [3.2872586139884623]
This paper introduces new parameterizations of equilibrium neural networks, i.e. networks defined by implicit equations.
The new parameterization admits a Lipschitz bound during training via unconstrained optimization.
In image classification experiments we show that the Lipschitz bounds are very accurate and improve robustness to adversarial attacks.
arXiv Detail & Related papers (2020-10-05T01:00:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.