Complexity of Deciding Injectivity and Surjectivity of ReLU Neural Networks
- URL: http://arxiv.org/abs/2405.19805v1
- Date: Thu, 30 May 2024 08:14:34 GMT
- Title: Complexity of Deciding Injectivity and Surjectivity of ReLU Neural Networks
- Authors: Vincent Froese, Moritz Grillo, Martin Skutella,
- Abstract summary: We prove coNP-completeness of deciding injectivity of a ReLU layer.
We also characterize surjectivity for two-layer ReLU networks with one-dimensional output.
- Score: 5.482420806459269
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks with ReLU activation play a key role in modern machine learning. In view of safety-critical applications, the verification of trained networks is of great importance and necessitates a thorough understanding of essential properties of the function computed by a ReLU network, including characteristics like injectivity and surjectivity. Recently, Puthawala et al. [JMLR 2022] came up with a characterization for injectivity of a ReLU layer, which implies an exponential time algorithm. However, the exact computational complexity of deciding injectivity remained open. We answer this question by proving coNP-completeness of deciding injectivity of a ReLU layer. On the positive side, as our main result, we present a parameterized algorithm which yields fixed-parameter tractability of the problem with respect to the input dimension. In addition, we also characterize surjectivity for two-layer ReLU networks with one-dimensional output. Remarkably, the decision problem turns out to be the complement of a basic network verification task. We prove NP-hardness for surjectivity, implying a stronger hardness result than previously known for the network verification problem. Finally, we reveal interesting connections to computational convexity by formulating the surjectivity problem as a zonotope containment problem
Related papers
- Physics-Informed Neural Networks: Minimizing Residual Loss with Wide Networks and Effective Activations [5.731640425517324]
We show that under certain conditions, the residual loss of PINNs can be globally minimized by a wide neural network.
An activation function with well-behaved high-order derivatives plays a crucial role in minimizing the residual loss.
The established theory paves the way for designing and choosing effective activation functions for PINNs.
arXiv Detail & Related papers (2024-05-02T19:08:59Z) - Robustness Verifcation in Neural Networks [0.0]
We investigate formal verification problems for Neural Network computations.
One question is whether there do exist valid inputs such that the network computes a valid output.
We show that the problems are conquerable in a semi-linear setting.
arXiv Detail & Related papers (2024-03-20T09:34:38Z) - Fixing the NTK: From Neural Network Linearizations to Exact Convex
Programs [63.768739279562105]
We show that for a particular choice of mask weights that do not depend on the learning targets, this kernel is equivalent to the NTK of the gated ReLU network on the training data.
A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set.
arXiv Detail & Related papers (2023-09-26T17:42:52Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - On the ISS Property of the Gradient Flow for Single Hidden-Layer Neural
Networks with Linear Activations [0.0]
We investigate the effects of overfitting on the robustness of gradient-descent training when subject to uncertainty on the gradient estimation.
We show that the general overparametrized formulation introduces a set of spurious equilibria which lay outside the set where the loss function is minimized.
arXiv Detail & Related papers (2023-05-17T02:26:34Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
Presence of sufficient number of OR-like neurons in a network can lead to classification brittleness and increased vulnerability to adversarial attacks.
We define AND-like neurons and propose measures to increase their proportion in the network.
Experimental results on the MNIST dataset suggest that our approach holds promise as a direction for further exploration.
arXiv Detail & Related papers (2021-02-15T08:19:05Z) - Global Optimization of Objective Functions Represented by ReLU Networks [77.55969359556032]
Neural networks can learn complex, non- adversarial functions, and it is challenging to guarantee their correct behavior in safety-critical contexts.
Many approaches exist to find failures in networks (e.g., adversarial examples), but these cannot guarantee the absence of failures.
We propose an approach that integrates the optimization process into the verification procedure, achieving better performance than the naive approach.
arXiv Detail & Related papers (2020-10-07T08:19:48Z) - Measuring Model Complexity of Neural Networks with Curve Activation
Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L1$ and $L2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z) - Globally Injective ReLU Networks [20.106755410331576]
Injectivity plays an important role in generative models where it enables inference.
We establish sharp characterizations of injectivity of fully-connected and convolutional ReLU layers and networks.
arXiv Detail & Related papers (2020-06-15T15:12:12Z) - Deep Neural Networks with Trainable Activations and Controlled Lipschitz
Constant [26.22495169129119]
We introduce a variational framework to learn the activation functions of deep neural networks.
Our aim is to increase the capacity of the network while controlling an upper-bound of the Lipschitz constant.
We numerically compare our scheme with standard ReLU network and its variations, PReLU and LeakyReLU.
arXiv Detail & Related papers (2020-01-17T12:32:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.