Expressivity of ReLU-Networks under Convex Relaxations
- URL: http://arxiv.org/abs/2311.04015v1
- Date: Tue, 7 Nov 2023 14:14:15 GMT
- Title: Expressivity of ReLU-Networks under Convex Relaxations
- Authors: Maximilian Baader, Mark Niklas Müller, Yuhao Mao, Martin Vechev
- Abstract summary: We conduct the first in-depth study on the expressive power of ReLU networks across all commonly used convex relaxations.
We show that: (i) more advanced relaxations allow a larger class of univariate functions to be expressed as precisely analyzable ReLU networks, (ii) more precise relaxations can allow exponentially larger solution spaces of ReLU networks encoding the same functions, and (iii) even using the most precise single-neuron relaxations, it is impossible to construct precisely analyzable ReLU networks that express multivariate, convex, monotone CPWL functions.
- Score: 7.043624904936254
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convex relaxations are a key component of training and certifying provably
safe neural networks. However, despite substantial progress, a wide and poorly
understood accuracy gap to standard networks remains, raising the question of
whether this is due to fundamental limitations of convex relaxations. Initial
work investigating this question focused on the simple and widely used IBP
relaxation. It revealed that some univariate, convex, continuous piecewise
linear (CPWL) functions cannot be encoded by any ReLU network such that its
IBP-analysis is precise. To explore whether this limitation is shared by more
advanced convex relaxations, we conduct the first in-depth study on the
expressive power of ReLU networks across all commonly used convex relaxations.
We show that: (i) more advanced relaxations allow a larger class of univariate
functions to be expressed as precisely analyzable ReLU networks, (ii) more
precise relaxations can allow exponentially larger solution spaces of ReLU
networks encoding the same functions, and (iii) even using the most precise
single-neuron relaxations, it is impossible to construct precisely analyzable
ReLU networks that express multivariate, convex, monotone CPWL functions.
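For context, IBP (interval bound propagation) analyzes a network by pushing an axis-aligned input box through every layer. The minimal numpy sketch below, with illustrative weights not taken from the paper, encodes the univariate convex CPWL function f(x) = |x| as ReLU(x) + ReLU(-x) and checks whether IBP recovers its exact range on [-1, 1]:

```python
import numpy as np

# Sketch of interval bound propagation (IBP); weights are illustrative.

def ibp_affine(l, u, W, b):
    # Tightest box image of [l, u] under x -> W x + b.
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    return W_pos @ l + W_neg @ u + b, W_pos @ u + W_neg @ l + b

def ibp_relu(l, u):
    # ReLU is monotone, so it maps boxes to boxes exactly.
    return np.maximum(l, 0), np.maximum(u, 0)

# Encode f(x) = |x| as ReLU(x) + ReLU(-x).
W1, b1 = np.array([[1.0], [-1.0]]), np.zeros(2)
W2, b2 = np.array([[1.0, 1.0]]), np.zeros(1)

l, u = np.array([-1.0]), np.array([1.0])    # input interval [-1, 1]
l, u = ibp_relu(*ibp_affine(l, u, W1, b1))  # hidden-layer bounds: [0, 1] each
l, u = ibp_affine(l, u, W2, b2)             # output bounds

print(l, u)  # IBP yields [0, 2]; the exact range of |x| on [-1, 1] is [0, 1]
```

IBP returns [0, 2] rather than the exact [0, 1] because the box abstraction cannot express that the two hidden neurons are never simultaneously active; this encoding of |x| is therefore not IBP-precise, which is exactly the kind of limitation the paper studies under more precise relaxations.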
Related papers
- Multi-Neuron Unleashes Expressivity of ReLU Networks Under Convex Relaxation [2.775510076780097]
We show that (layer-wise) multi-neuron relaxation provides complete certification for general ReLU networks.
We show that the expressivity of ReLU networks is no longer limited under multi-neuron relaxation.
arXiv Detail & Related papers (2024-10-09T12:14:24Z)
- Equidistribution-based training of Free Knot Splines and ReLU Neural Networks [0.0]
We consider the problem of one-dimensional function approximation using shallow neural networks (NN) with a rectified linear unit (ReLU) activation function.
We show that their ill-conditioning degrades rapidly as the width of the network increases.
We leverage the theory of optimal piecewise linear interpolants to improve the training procedure for a ReLU NN.
arXiv Detail & Related papers (2024-07-02T10:51:36Z)
- Fixing the NTK: From Neural Network Linearizations to Exact Convex Programs [63.768739279562105]
We show that for a particular choice of mask weights that do not depend on the learning targets, this kernel is equivalent to the NTK of the gated ReLU network on the training data.
A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set.
arXiv Detail & Related papers (2023-09-26T17:42:52Z)
- The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks [53.95175206863992]
We study the type of solutions to which gradient descent converges when used to train a single hidden-layer multivariate ReLU network with the quadratic loss.
We prove that although shallow ReLU networks are universal approximators, stable shallow networks are not.
arXiv Detail & Related papers (2023-06-30T09:17:39Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Learning an Invertible Output Mapping Can Mitigate Simplicity Bias in Neural Networks [66.76034024335833]
We find that diverse/complex features are indeed learned by the backbone, and that their brittleness is due to the linear classification head relying primarily on the simplest features.
We propose Feature Reconstruction Regularizer (FRR) to ensure that the learned features can be reconstructed back from the logits.
We demonstrate up to 15% gains in OOD accuracy on the recently introduced semi-synthetic datasets with extreme distribution shifts.
arXiv Detail & Related papers (2022-10-04T04:01:15Z)
- DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting [70.62923754433461]
Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem.
We propose a novel method that can directly solve a convex relaxation of the problem to high accuracy, by splitting it into smaller subproblems that often have analytical solutions.
arXiv Detail & Related papers (2021-06-16T20:43:49Z)
- The Convex Relaxation Barrier, Revisited: Tightened Single-Neuron Relaxations for Neural Network Verification [11.10637926254491]
We improve the effectiveness of propagation- and linear-optimization-based neural network verification algorithms with a new tightened convex relaxation for ReLU neurons.
We design two polynomial-time algorithms for neural network verification: a linear-programming-based algorithm that leverages the full power of our relaxation, and a fast propagation algorithm that generalizes existing approaches (the baseline single-neuron relaxation that such methods tighten is sketched after this list).
arXiv Detail & Related papers (2020-06-24T22:16:58Z)
- PEREGRiNN: Penalized-Relaxation Greedy Neural Network Verifier [1.1011268090482575]
We introduce a new approach to formally verify the most commonly considered safety specifications for ReLU NNs.
We use a convex solver not only as a linear feasibility checker, but also as a means of penalizing the amount of relaxation allowed in solutions.
arXiv Detail & Related papers (2020-06-18T21:33:07Z)
- On Sparsity in Overparametrised Shallow ReLU Networks [42.33056643582297]
We study the ability of different regularisation strategies to capture solutions requiring only a finite number of neurons, even in the infinitely wide regime.
We establish that both schemes are minimised by functions having only a finite number of neurons, irrespective of the amount of overparametrisation.
arXiv Detail & Related papers (2020-06-18T01:35:26Z)
- Improving the Tightness of Convex Relaxation Bounds for Training Certifiably Robust Classifiers [72.56180590447835]
Convex relaxations are effective for training and certifying neural networks against norm-bounded adversarial attacks, but they leave a large gap between certifiable and empirical robustness.
We propose two regularization-based techniques for training neural networks to higher certified accuracy than non-regularized baselines.
arXiv Detail & Related papers (2020-02-22T20:19:53Z)
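As background for the single-neuron relaxations discussed above (e.g., in The Convex Relaxation Barrier, Revisited), the following sketch shows the standard "triangle" relaxation of a single ReLU neuron: the convex hull of y = ReLU(x) over pre-activation bounds l < 0 < u, given by y >= 0, y >= x, and y <= u(x - l)/(u - l). The tightened relaxations above improve on this baseline; the values here are illustrative:

```python
import numpy as np

# Standard single-neuron "triangle" relaxation of y = ReLU(x) on [l, u]
# with l < 0 < u: lower bounds y >= 0 and y >= x, upper bound the chord
# from (l, 0) to (u, u). This is the baseline, not a tightened variant.

def triangle_upper(x, l, u):
    return u * (x - l) / (u - l)

l, u = -1.0, 2.0
for x in np.linspace(l, u, 5):
    gap = triangle_upper(x, l, u) - max(x, 0.0)
    print(f"x={x:+.2f}  ReLU(x)={max(x, 0.0):.2f}  "
          f"upper={triangle_upper(x, l, u):.2f}  gap={gap:.2f}")
```

The gap between the chord and ReLU itself is the price verifiers pay for relaxing the neuron; tightening it is how the methods above narrow the gap between certifiable and empirical robustness.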
This list is automatically generated from the titles and abstracts of the papers on this site.