Deep Neural Networks with Trainable Activations and Controlled Lipschitz
Constant
- URL: http://arxiv.org/abs/2001.06263v2
- Date: Fri, 7 Aug 2020 13:27:44 GMT
- Title: Deep Neural Networks with Trainable Activations and Controlled Lipschitz
Constant
- Authors: Shayan Aziznejad, Harshit Gupta, Joaquim Campos, Michael Unser
- Abstract summary: We introduce a variational framework to learn the activation functions of deep neural networks.
Our aim is to increase the capacity of the network while controlling an upper-bound of the Lipschitz constant.
We numerically compare our scheme with the standard ReLU network and its variants, PReLU and LeakyReLU.
- Score: 26.22495169129119
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a variational framework to learn the activation functions of
deep neural networks. Our aim is to increase the capacity of the network while
controlling an upper-bound of the actual Lipschitz constant of the input-output
relation. To that end, we first establish a global bound for the Lipschitz
constant of neural networks. Based on the obtained bound, we then formulate a
variational problem for learning activation functions. Our variational problem
is infinite-dimensional and is not computationally tractable. However, we prove
that there always exists a solution that has continuous and piecewise-linear
(linear-spline) activations. This reduces the original problem to a
finite-dimensional minimization where an l1 penalty on the parameters of the
activations favors the learning of sparse nonlinearities. We numerically
compare our scheme with the standard ReLU network and its variants, PReLU and
LeakyReLU, and we empirically demonstrate the practical aspects of our
framework.
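
As a concrete illustration of the scheme described above, the following PyTorch sketch (an illustrative toy under stated assumptions, not the authors' released code) implements a trainable linear-spline activation: a continuous piecewise-linear function written as a base linear term plus a sum of shifted ReLUs on a fixed knot grid, with an l1 penalty on the ReLU coefficients that promotes sparse nonlinearities. The uniform knot grid, the class and method names, and the way the penalty is weighted are all assumptions made here; the activation's slope is bounded by |a| + sum_k |c_k|, so together with the standard composition bound Lip(f) <= prod_l ||W_l|| * prod_l Lip(sigma_l), penalizing this slope budget is one way to keep an upper bound on the network's Lipschitz constant under control.

import torch
import torch.nn as nn

class LinearSplineActivation(nn.Module):
    # Continuous piecewise-linear (linear-spline) activation on a fixed
    # uniform knot grid: x -> a*x + b + sum_k c_k * relu(x - t_k).
    # The knots t_k are fixed; the coefficients (a, b, c_k) are learned.
    def __init__(self, num_knots: int = 21, knot_range: float = 3.0):
        super().__init__()
        self.register_buffer("knots",
                             torch.linspace(-knot_range, knot_range, num_knots))
        self.coeffs = nn.Parameter(torch.zeros(num_knots))  # c_k, zero init = linear start
        self.slope = nn.Parameter(torch.ones(1))            # a, identity at init
        self.bias = nn.Parameter(torch.zeros(1))            # b

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ramps = torch.relu(x.unsqueeze(-1) - self.knots)    # (..., num_knots)
        return self.slope * x + self.bias + ramps @ self.coeffs

    def l1_penalty(self) -> torch.Tensor:
        # Sparsity-promoting regularizer on the spline coefficients.
        return self.coeffs.abs().sum()

    def lipschitz_bound(self) -> torch.Tensor:
        # Upper bound on the Lipschitz constant of this activation alone.
        return self.slope.abs() + self.coeffs.abs().sum()

# Usage sketch: penalize the spline coefficients alongside the task loss.
net = nn.Sequential(nn.Linear(2, 16), LinearSplineActivation(), nn.Linear(16, 1))
x, y = torch.randn(32, 2), torch.randn(32, 1)
lam = 1e-3  # assumed regularization weight
penalty = sum(m.l1_penalty() for m in net.modules()
              if isinstance(m, LinearSplineActivation))
loss = nn.functional.mse_loss(net(x), y) + lam * penalty
loss.backward()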
Related papers
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that these neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z) - Improving Lipschitz-Constrained Neural Networks by Learning Activation
Functions [14.378778606939665]
Lipschitz-constrained neural networks have several advantages over unconstrained ones and can be applied to a variety of problems.
Neural networks with learnable 1-Lipschitz linear-spline activations are known to be more expressive.
Our numerical experiments show that our trained networks compare favorably with existing 1-Lipschitz neural architectures.
arXiv Detail & Related papers (2022-10-28T15:56:55Z) - Sparsest Univariate Learning Models Under Lipschitz Constraint [31.28451181040038]
We propose continuous-domain formulations for one-dimensional regression problems.
We control the Lipschitz constant explicitly using a user-defined upper-bound.
We show that both problems admit global minimizers that are continuous and piecewise-linear.
arXiv Detail & Related papers (2021-12-27T07:03:43Z) - Training Certifiably Robust Neural Networks with Efficient Local
Lipschitz Bounds [99.23098204458336]
Certified robustness is a desirable property for deep neural networks in safety-critical applications.
We show that our method consistently outperforms state-of-the-art methods on the MNIST and TinyImageNet datasets.
arXiv Detail & Related papers (2021-11-02T06:44:10Z) - Lipschitz Bounded Equilibrium Networks [3.2872586139884623]
This paper introduces new parameterizations of equilibrium neural networks, i.e. networks defined by implicit equations.
The new parameterization admits a Lipschitz bound during training via unconstrained optimization.
In image classification experiments we show that the Lipschitz bounds are very accurate and improve robustness to adversarial attacks.
arXiv Detail & Related papers (2020-10-05T01:00:40Z) - On Sparsity in Overparametrised Shallow ReLU Networks [42.33056643582297]
We study the ability of different regularisation strategies to capture solutions requiring only a finite amount of neurons, even in the infinitely wide regime.
We establish that both schemes are minimised by functions having only a finite number of neurons, irrespective of the amount of overparametrisation.
arXiv Detail & Related papers (2020-06-18T01:35:26Z) - Measuring Model Complexity of Neural Networks with Curve Activation
Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that L1 and L2 regularization suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z) - On Lipschitz Regularization of Convolutional Layers using Toeplitz
Matrix Theory [77.18089185140767]
Lipschitz regularity has been established as a key property of modern deep learning.
However, computing the exact value of the Lipschitz constant of a neural network is known to be NP-hard.
We introduce a new upper bound for convolutional layers that is both tight and easy to compute.
arXiv Detail & Related papers (2020-06-15T13:23:34Z)
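
For contrast with the Toeplitz-based bound in the last item, here is a generic sketch of a simpler way to bound the l2 Lipschitz constant of a single convolutional layer: power iteration on the conv/conv-transpose pair, as used in spectral normalization. This is not the cited paper's method and is shown only to make the notion of a per-layer Lipschitz bound concrete; the function name, the fixed input shape, stride 1, and zero padding are assumptions made for this sketch.

import torch
import torch.nn.functional as F

def conv_spectral_norm(weight: torch.Tensor, input_shape, n_iter: int = 50,
                       padding: int = 1) -> torch.Tensor:
    # Power iteration for the operator norm of x -> conv2d(x, weight),
    # i.e. the layer's Lipschitz constant w.r.t. the l2 norm.
    # weight: (out_ch, in_ch, kH, kW); input_shape: (in_ch, H, W).
    # Assumes stride 1 and the given zero padding; the estimate approaches
    # the true spectral norm from below as n_iter grows.
    x = torch.randn(1, *input_shape)
    for _ in range(n_iter):
        y = F.conv2d(x, weight, padding=padding)            # A x
        x = F.conv_transpose2d(y, weight, padding=padding)  # A^T A x
        x = x / (x.norm() + 1e-12)                          # renormalize iterate
    return F.conv2d(x, weight, padding=padding).norm()

# Usage sketch
w = torch.randn(8, 3, 3, 3) * 0.1
print(float(conv_spectral_norm(w, (3, 32, 32))))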