1-Lipschitz Neural Networks are more expressive with N-Activations
- URL: http://arxiv.org/abs/2311.06103v2
- Date: Mon, 3 Jun 2024 15:20:13 GMT
- Title: 1-Lipschitz Neural Networks are more expressive with N-Activations
- Authors: Bernd Prach, Christoph H. Lampert,
- Abstract summary: Small changes to a system's inputs should not result in large changes to its outputs.
We show that commonly used activation functions, such as MaxMin, unnecessarily restrict the class of representable functions.
We introduce the new N-activation function that is provably more expressive than currently popular activation functions.
- Score: 19.858602457988194
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A crucial property for achieving secure, trustworthy and interpretable deep learning systems is their robustness: small changes to a system's inputs should not result in large changes to its outputs. Mathematically, this means one strives for networks with a small Lipschitz constant. Several recent works have focused on how to construct such Lipschitz networks, typically by imposing constraints on the weight matrices. In this work, we study an orthogonal aspect, namely the role of the activation function. We show that commonly used activation functions, such as MaxMin, as well as all piece-wise linear ones with two segments unnecessarily restrict the class of representable functions, even in the simplest one-dimensional setting. We furthermore introduce the new N-activation function that is provably more expressive than currently popular activation functions. We provide code at https://github.com/berndprach/NActivation.
Related papers
- Novel Quadratic Constraints for Extending LipSDP beyond Slope-Restricted
Activations [52.031701581294804]
Lipschitz bounds for neural networks can be computed with upper time preservation guarantees.
Our paper bridges the gap and extends Lipschitz beyond slope-restricted activation functions.
Our proposed analysis is general and provides a unified approach for estimating $ell$ and $ell_infty$ Lipschitz bounds.
arXiv Detail & Related papers (2024-01-25T09:23:31Z) - STL: A Signed and Truncated Logarithm Activation Function for Neural
Networks [5.9622541907827875]
Activation functions play an essential role in neural networks.
We present a novel signed and truncated logarithm function as activation function.
The suggested activation function can be applied in a large range of neural networks.
arXiv Detail & Related papers (2023-07-31T03:41:14Z) - Regularization of polynomial networks for image recognition [78.4786845859205]
Polynomial Networks (PNs) have emerged as an alternative method with a promising performance and improved interpretability.
We introduce a class of PNs, which are able to reach the performance of ResNet across a range of six benchmarks.
arXiv Detail & Related papers (2023-03-24T10:05:22Z) - A Unified Algebraic Perspective on Lipschitz Neural Networks [88.14073994459586]
This paper introduces a novel perspective unifying various types of 1-Lipschitz neural networks.
We show that many existing techniques can be derived and generalized via finding analytical solutions of a common semidefinite programming (SDP) condition.
Our approach, called SDP-based Lipschitz Layers (SLL), allows us to design non-trivial yet efficient generalization of convex potential layers.
arXiv Detail & Related papers (2023-03-06T14:31:09Z) - Data-aware customization of activation functions reduces neural network
error [0.35172332086962865]
We show that data-aware customization of activation functions can result in striking reductions in neural network error.
A simple substitution with the seagull'' activation function in an already-refined neural network can lead to an order-of-magnitude reduction in error.
arXiv Detail & Related papers (2023-01-16T23:38:37Z) - Constrained Monotonic Neural Networks [0.685316573653194]
Wider adoption of neural networks in many critical domains such as finance and healthcare is being hindered by the need to explain their predictions.
Monotonicity constraint is one of the most requested properties in real-world scenarios.
We show it can approximate any continuous monotone function on a compact subset of $mathbbRn$.
arXiv Detail & Related papers (2022-05-24T04:26:10Z) - Approximation of Lipschitz Functions using Deep Spline Neural Networks [21.13606355641886]
We propose to use learnable spline activation functions with at least 3 linear regions instead of ReLU networks.
We prove that this choice is optimal among all component-wise $1$-Lipschitz activation functions.
This choice is at least as expressive as the recently introduced non component-wise Groupsort activation function for spectral-norm-constrained weights.
arXiv Detail & Related papers (2022-04-13T08:07:28Z) - Training Certifiably Robust Neural Networks with Efficient Local
Lipschitz Bounds [99.23098204458336]
Certified robustness is a desirable property for deep neural networks in safety-critical applications.
We show that our method consistently outperforms state-of-the-art methods on MNIST and TinyNet datasets.
arXiv Detail & Related papers (2021-11-02T06:44:10Z) - Deep Polynomial Neural Networks [77.70761658507507]
$Pi$Nets are a new class of function approximators based on expansions.
$Pi$Nets produce state-the-art results in three challenging tasks, i.e. image generation, face verification and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z) - On Lipschitz Regularization of Convolutional Layers using Toeplitz
Matrix Theory [77.18089185140767]
Lipschitz regularity is established as a key property of modern deep learning.
computing the exact value of the Lipschitz constant of a neural network is known to be NP-hard.
We introduce a new upper bound for convolutional layers that is both tight and easy to compute.
arXiv Detail & Related papers (2020-06-15T13:23:34Z) - Deep Neural Networks with Trainable Activations and Controlled Lipschitz
Constant [26.22495169129119]
We introduce a variational framework to learn the activation functions of deep neural networks.
Our aim is to increase the capacity of the network while controlling an upper-bound of the Lipschitz constant.
We numerically compare our scheme with standard ReLU network and its variations, PReLU and LeakyReLU.
arXiv Detail & Related papers (2020-01-17T12:32:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.