Neural Network Verification as Piecewise Linear Optimization:
Formulations for the Composition of Staircase Functions
- URL: http://arxiv.org/abs/2211.14706v1
- Date: Sun, 27 Nov 2022 03:25:48 GMT
- Title: Neural Network Verification as Piecewise Linear Optimization:
Formulations for the Composition of Staircase Functions
- Authors: Tu Anh-Nguyen and Joey Huchette
- Abstract summary: We present a technique for neural network verification using mixed-integer programming (MIP) formulations.
We derive a strong formulation for each neuron in a network using piecewise linear activation functions.
We also derive a separation procedure that runs in super-linear time in the input dimension.
- Score: 2.088583843514496
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a technique for neural network verification using mixed-integer
programming (MIP) formulations. We derive a \emph{strong formulation} for each
neuron in a network using piecewise linear activation functions. Additionally,
as in general, these formulations may require an exponential number of
inequalities, we also derive a separation procedure that runs in super-linear
time in the input dimension. We first introduce and develop our technique on
the class of \emph{staircase} functions, which generalizes the ReLU, binarized,
and quantized activation functions. We then use results for staircase
activation functions to obtain a separation method for general piecewise linear
activation functions. Empirically, using our strong formulation and separation
technique, we can reduce the computational time in exact verification settings
based on MIP and improve the false negative rate for inexact verifiers relying
on the relaxation of the MIP formulation.
Related papers
- Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Unification of popular artificial neural network activation functions [0.0]
We present a unified representation of the most popular neural network activation functions.
Adopting Mittag-Leffler functions of fractional calculus, we propose a flexible and compact functional form.
arXiv Detail & Related papers (2023-02-21T21:20:59Z) - Lifted Bregman Training of Neural Networks [28.03724379169264]
We introduce a novel mathematical formulation for the training of feed-forward neural networks with (potentially non-smooth) proximal maps as activation functions.
This formulation is based on Bregman and a key advantage is that its partial derivatives with respect to the network's parameters do not require the computation of derivatives of the network's activation functions.
We present several numerical results that demonstrate that these training approaches can be equally well or even better suited for the training of neural network-based classifiers and (denoising) autoencoders with sparse coding.
arXiv Detail & Related papers (2022-08-18T11:12:52Z) - Neural networks with trainable matrix activation functions [7.999703756441757]
This work develops a systematic approach to constructing matrix-valued activation functions.
The proposed activation functions depend on parameters that are trained along with the weights and bias vectors.
arXiv Detail & Related papers (2021-09-21T04:11:26Z) - Otimizacao de pesos e funcoes de ativacao de redes neurais aplicadas na
previsao de series temporais [0.0]
We propose the use of a family of free parameter asymmetric activation functions for neural networks.
We show that this family of defined activation functions satisfies the requirements of the universal approximation theorem.
A methodology for the global optimization of this family of activation functions with free parameter and the weights of the connections between the processing units of the neural network is used.
arXiv Detail & Related papers (2021-07-29T23:32:15Z) - Estimating Multiplicative Relations in Neural Networks [0.0]
We will use properties of logarithmic functions to propose a pair of activation functions which can translate products into linear expression and learn using backpropagation.
We will try to generalize this approach for some complex arithmetic functions and test the accuracy on a disjoint distribution with the training set.
arXiv Detail & Related papers (2020-10-28T14:28:24Z) - Activation Relaxation: A Local Dynamical Approximation to
Backpropagation in the Brain [62.997667081978825]
Activation Relaxation (AR) is motivated by constructing the backpropagation gradient as the equilibrium point of a dynamical system.
Our algorithm converges rapidly and robustly to the correct backpropagation gradients, requires only a single type of computational unit, and can operate on arbitrary computation graphs.
arXiv Detail & Related papers (2020-09-11T11:56:34Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized Structural equation models (SEMs)
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using a gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Measuring Model Complexity of Neural Networks with Curve Activation
Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L1$ and $L2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z) - SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for
Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.