Improving Lipschitz-Constrained Neural Networks by Learning Activation
Functions
- URL: http://arxiv.org/abs/2210.16222v2
- Date: Tue, 19 Dec 2023 17:19:49 GMT
- Title: Improving Lipschitz-Constrained Neural Networks by Learning Activation
Functions
- Authors: Stanislas Ducotterd, Alexis Goujon, Pakshal Bohra, Dimitris Perdios,
Sebastian Neumayer, Michael Unser
- Abstract summary: Lipschitz-constrained neural networks have several advantages over unconstrained ones and can be applied to a variety of problems.
Neural networks with learnable 1-Lipschitz linear splines are known to be more expressive, and we show that they correspond to global optima of a constrained functional optimization problem.
Our numerical experiments show that our trained networks compare favorably with existing 1-Lipschitz neural architectures.
- Score: 14.378778606939665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lipschitz-constrained neural networks have several advantages over
unconstrained ones and can be applied to a variety of problems, making them a
topic of attention in the deep learning community. Unfortunately, it has been
shown both theoretically and empirically that they perform poorly when equipped
with ReLU activation functions. By contrast, neural networks with learnable
1-Lipschitz linear splines are known to be more expressive. In this paper, we
show that such networks correspond to global optima of a constrained functional
optimization problem that consists of the training of a neural network composed
of 1-Lipschitz linear layers and 1-Lipschitz freeform activation functions with
second-order total-variation regularization. Further, we propose an efficient
method to train these neural networks. Our numerical experiments show that our
trained networks compare favorably with existing 1-Lipschitz neural
architectures.
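A minimal sketch of the ingredients named in the abstract, assuming a PyTorch implementation: spectral-normalized linear layers for the 1-Lipschitz weight constraint, a learnable linear-spline activation kept 1-Lipschitz here by simply clamping its slopes (a simplification of the constrained optimization described in the paper), and a second-order total-variation penalty on the spline. The layer widths, knot grid, and penalty weight are illustrative assumptions, not the authors' settings.

```python
# Sketch only: slope clamping stands in for the paper's constrained optimization.
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import spectral_norm


class LinearSpline(nn.Module):
    """Pointwise linear-spline activation with learnable values on a uniform knot grid."""

    def __init__(self, num_knots: int = 21, grid: float = 2.0):
        super().__init__()
        self.grid = grid
        self.step = 2 * grid / (num_knots - 1)
        self.register_buffer("knots", torch.linspace(-grid, grid, num_knots))
        self.values = nn.Parameter(self.knots.clone())  # identity initialization

    def slopes(self) -> torch.Tensor:
        return (self.values[1:] - self.values[:-1]) / self.step

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.slopes().clamp(-1.0, 1.0)  # keep the activation 1-Lipschitz
        x = x.clamp(-self.grid, self.grid)  # constant extension outside the grid
        idx = ((x + self.grid) / self.step).floor().long().clamp(0, len(self.knots) - 2)
        # Knot values consistent with the clamped slopes.
        v = torch.cat([self.values[:1], self.values[0] + torch.cumsum(s * self.step, 0)])
        return v[idx] + s[idx] * (x - self.knots[idx])

    def tv2(self) -> torch.Tensor:
        # Second-order total variation of a linear spline = sum of |slope jumps|.
        s = self.slopes()
        return (s[1:] - s[:-1]).abs().sum()


def lipschitz_mlp(widths=(2, 64, 64, 1)) -> nn.Sequential:
    layers = []
    for d_in, d_out in zip(widths[:-1], widths[1:]):
        # Spectral normalization keeps each linear layer (approximately) 1-Lipschitz.
        layers += [spectral_norm(nn.Linear(d_in, d_out)), LinearSpline()]
    return nn.Sequential(*layers[:-1])  # no activation after the last layer


model = lipschitz_mlp()
x, y = torch.randn(32, 2), torch.randn(32, 1)
data_loss = nn.functional.mse_loss(model(x), y)
tv_penalty = sum(m.tv2() for m in model if isinstance(m, LinearSpline))
(data_loss + 1e-4 * tv_penalty).backward()
```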
Related papers
- LinSATNet: The Positive Linear Satisfiability Neural Networks [116.65291739666303]
This paper studies how to introduce positive linear satisfiability constraints into neural networks.
We propose the first differentiable satisfiability layer based on an extension of the classic Sinkhorn algorithm for jointly encoding multiple sets of marginal distributions.
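For context, a hedged sketch of the classic Sinkhorn iteration that LinSATNet builds on (not the paper's extension to multiple jointly encoded sets of marginals); the function name, iteration count, and tolerance are illustrative assumptions.

```python
# Classic Sinkhorn normalization: alternately rescale rows and columns of a
# positive matrix so its marginals approach the prescribed row/column sums.
# Every step is differentiable, which is what makes it usable inside a layer.
import torch


def sinkhorn(scores: torch.Tensor, row_sums: torch.Tensor, col_sums: torch.Tensor,
             n_iters: int = 50, eps: float = 1e-8) -> torch.Tensor:
    p = scores.exp()  # strictly positive matrix
    for _ in range(n_iters):
        p = p * (row_sums / (p.sum(dim=1) + eps)).unsqueeze(1)  # match row marginals
        p = p * (col_sums / (p.sum(dim=0) + eps)).unsqueeze(0)  # match column marginals
    return p


# Example: push a 3x3 score matrix towards a doubly-stochastic matrix.
out = sinkhorn(torch.randn(3, 3), torch.ones(3), torch.ones(3))
```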
arXiv Detail & Related papers (2024-07-18T22:05:21Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- A Unified Algebraic Perspective on Lipschitz Neural Networks [88.14073994459586]
This paper introduces a novel perspective unifying various types of 1-Lipschitz neural networks.
We show that many existing techniques can be derived and generalized via finding analytical solutions of a common semidefinite programming (SDP) condition.
Our approach, called SDP-based Lipschitz Layers (SLL), allows us to design non-trivial yet efficient generalizations of convex potential layers.
arXiv Detail & Related papers (2023-03-06T14:31:09Z)
- Exploring the Approximation Capabilities of Multiplicative Neural Networks for Smooth Functions [9.936974568429173]
We consider two classes of target functions: generalized bandlimited functions and Sobolev-type balls.
Our results demonstrate that multiplicative neural networks can approximate these functions with significantly fewer layers and neurons.
These findings suggest that multiplicative gates can outperform standard feed-forward layers and have potential for improving neural network design.
arXiv Detail & Related papers (2023-01-11T17:57:33Z)
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z)
- Consistency of Neural Networks with Regularization [0.0]
This paper proposes a general framework for neural networks with regularization and proves its consistency.
Two types of activation functions are considered: the hyperbolic tangent (Tanh) and the rectified linear unit (ReLU).
arXiv Detail & Related papers (2022-06-22T23:33:39Z)
- Approximation of Lipschitz Functions using Deep Spline Neural Networks [21.13606355641886]
We propose to use learnable spline activation functions with at least 3 linear regions instead of ReLU networks.
We prove that this choice is optimal among all component-wise $1$-Lipschitz activation functions.
This choice is at least as expressive as the recently introduced non-component-wise GroupSort activation function for spectral-norm-constrained weights.
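For reference, a minimal sketch of the GroupSort activation mentioned above; the group size of 2 (also known as MaxMin) is an illustrative choice.

```python
# GroupSort: split the feature vector into groups and sort each group.
# Sorting only permutes entries within a group, so the map is 1-Lipschitz.
import torch


def group_sort(x: torch.Tensor, group_size: int = 2) -> torch.Tensor:
    n = x.shape[-1]
    assert n % group_size == 0, "feature dimension must be divisible by group_size"
    grouped = x.view(*x.shape[:-1], n // group_size, group_size)
    return grouped.sort(dim=-1, descending=True).values.reshape(x.shape)


# Example with group size 2 (MaxMin): each adjacent pair is sorted in descending order.
print(group_sort(torch.tensor([[3.0, -1.0, 0.5, 2.0]])))  # tensor([[3.0, -1.0, 2.0, 0.5]])
```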
arXiv Detail & Related papers (2022-04-13T08:07:28Z)
- Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds [99.23098204458336]
Certified robustness is a desirable property for deep neural networks in safety-critical applications.
We show that our method consistently outperforms state-of-the-art methods on the MNIST and TinyImageNet datasets.
arXiv Detail & Related papers (2021-11-02T06:44:10Z)
- The Many Faces of 1-Lipschitz Neural Networks [1.911678487931003]
We show that 1-Lipschitz neural networks can fit arbitrarily difficult frontiers, making them as expressive as classical ones.
We also study the link between classification with 1-Lipschitz networks and optimal transport through regularized versions of the Kantorovich-Rubinstein duality.
arXiv Detail & Related papers (2021-04-11T20:31:32Z)
- On Lipschitz Regularization of Convolutional Layers using Toeplitz Matrix Theory [77.18089185140767]
Lipschitz regularity is established as a key property of modern deep learning, yet computing the exact value of the Lipschitz constant of a neural network is known to be NP-hard.
We introduce a new upper bound for convolutional layers that is both tight and easy to compute.
arXiv Detail & Related papers (2020-06-15T13:23:34Z)
- A Deep Conditioning Treatment of Neural Networks [37.192369308257504]
We show that depth improves trainability of neural networks by improving the conditioning of certain kernel matrices of the input data.
We provide versions of the result that hold for training just the top layer of the neural network, as well as for training all layers via the neural tangent kernel.
arXiv Detail & Related papers (2020-02-04T20:21:36Z)