Vector-output ReLU Neural Network Problems are Copositive Programs:
Convex Analysis of Two Layer Networks and Polynomial-time Algorithms
- URL: http://arxiv.org/abs/2012.13329v1
- Date: Thu, 24 Dec 2020 17:03:30 GMT
- Title: Vector-output ReLU Neural Network Problems are Copositive Programs:
Convex Analysis of Two Layer Networks and Polynomial-time Algorithms
- Authors: Arda Sahiner, Tolga Ergen, John Pauly and Mert Pilanci
- Abstract summary: We describe the convex semi-infinite dual of the two-layer vector-output ReLU neural network training problem.
We provide a copositive relaxation which is guaranteed to be exact for certain classes of problems.
- Score: 29.975118690758126
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We describe the convex semi-infinite dual of the two-layer vector-output ReLU
neural network training problem. This semi-infinite dual admits a finite
dimensional representation, but its support is over a convex set which is
difficult to characterize. In particular, we demonstrate that the non-convex
neural network training problem is equivalent to a finite-dimensional convex
copositive program. Our work is the first to identify this strong connection
between the global optima of neural networks and those of copositive programs.
We thus demonstrate how neural networks implicitly attempt to solve copositive
programs via semi-nonnegative matrix factorization, and draw key insights from
this formulation. We describe the first algorithms for provably finding the
global minimum of the vector output neural network training problem, which are
polynomial in the number of samples for a fixed data rank, yet exponential in
the dimension. However, in the case of convolutional architectures, the
computational complexity is exponential in only the filter size and polynomial
in all other parameters. We describe the circumstances in which we can find the
global optimum of this neural network training problem exactly with
soft-thresholded SVD, and provide a copositive relaxation which is guaranteed
to be exact for certain classes of problems, and which corresponds with the
solution of Stochastic Gradient Descent in practice.
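The abstract notes that, in certain circumstances, the global optimum can be found exactly with a soft-thresholded SVD. As a minimal sketch only, the core operation is singular-value soft-thresholding, i.e. the proximal operator of the nuclear norm; the particular matrix to threshold (formed here, as an assumption for illustration, from data X and targets Y), the threshold beta, and the conditions under which this recovers the network's global optimum are those characterized in the paper.

```python
import numpy as np

def svd_soft_threshold(A, beta):
    # Proximal operator of the nuclear norm: shrink every singular value of A
    # by beta and reconstruct; singular values below beta are zeroed out.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - beta, 0.0)) @ Vt

# Hypothetical usage: X is an n x d data matrix, Y an n x c target matrix, and
# beta a weight-decay coefficient; the exact matrix to threshold and the data
# conditions for exactness are specified in the paper, not assumed here.
X = np.random.randn(100, 10)
Y = np.random.randn(100, 3)
Z = svd_soft_threshold(X.T @ Y, beta=0.1)
```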
Related papers
- Convex Formulations for Training Two-Layer ReLU Neural Networks [21.88871868680998]
Solving non-convex, NP-hard optimization problems is crucial for training machine learning models.
We introduce a semidefinite relaxation that can be solved for finite-width two-layer neural networks.
arXiv Detail & Related papers (2024-10-29T17:53:15Z) - LinSATNet: The Positive Linear Satisfiability Neural Networks [116.65291739666303]
This paper studies how to introduce positive linear satisfiability constraints to neural networks.
We propose the first differentiable satisfiability layer based on an extension of the classic Sinkhorn algorithm for jointly encoding multiple sets of marginal distributions (a minimal sketch of standard Sinkhorn normalization follows this list).
arXiv Detail & Related papers (2024-07-18T22:05:21Z) - Path Regularization: A Convexity and Sparsity Inducing Regularization
for Parallel ReLU Networks [75.33431791218302]
We study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape.
We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases.
arXiv Detail & Related papers (2021-10-18T18:00:36Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Achieving Small Test Error in Mildly Overparameterized Neural Networks [30.664282759625948]
We give an algorithm which finds one of these small-test-error points in polynomial time.
In addition, we prove that for a fully connected neural net, with an additional assumption on the data distribution, there is a polynomial-time algorithm.
arXiv Detail & Related papers (2021-04-24T06:47:20Z) - Neural Spectrahedra and Semidefinite Lifts: Global Convex Optimization
of Polynomial Activation Neural Networks in Fully Polynomial-Time [31.94590517036704]
We develop exact convex optimization formulations for two-layer neural networks with second-degree polynomial activations.
We show that the semidefinite lift is exact, and therefore global optimization is polynomial in the input dimension and sample size for all input data.
The proposed approach is significantly faster and obtains better test accuracy compared to the standard backpropagation procedure.
arXiv Detail & Related papers (2021-01-07T08:43:01Z) - Neural network approaches to point lattice decoding [6.025026882312586]
A Voronoi-reduced basis is introduced to restrict the space of solutions to a binary set.
We count the number of affine pieces in the CPWL decoding function to characterize the complexity of the decoding problem.
arXiv Detail & Related papers (2020-12-13T10:53:34Z) - Multipole Graph Neural Operator for Parametric Partial Differential
Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data in the desired structure.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z) - The Hidden Convex Optimization Landscape of Two-Layer ReLU Neural
Networks: an Exact Characterization of the Optimal Solutions [51.60996023961886]
We prove that finding all globally optimal two-layer ReLU neural networks can be performed by solving a convex optimization program with cone constraints.
Our analysis is novel, characterizes all optimal solutions, and does not leverage duality-based analysis which was recently used to lift neural network training into convex spaces.
arXiv Detail & Related papers (2020-06-10T15:38:30Z) - Convex Geometry and Duality of Over-parameterized Neural Networks [70.15611146583068]
We develop a convex analytic approach to analyze finite width two-layer ReLU networks.
We show that an optimal solution to the regularized training problem can be characterized as extreme points of a convex set.
In higher dimensions, we show that the training problem can be cast as a finite dimensional convex problem with infinitely many constraints.
arXiv Detail & Related papers (2020-02-25T23:05:33Z)
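The LinSATNet entry above builds its differentiable satisfiability layer on an extension of the classic Sinkhorn algorithm. As background only (this is standard Sinkhorn normalization, not LinSATNet's multi-marginal extension), a minimal sketch assuming a positive score matrix and target row/column marginals with equal total mass:

```python
import numpy as np

def sinkhorn_normalize(M, row_marginals, col_marginals, n_iters=50, eps=1e-9):
    # Classic Sinkhorn iterations: alternately rescale rows and columns of a
    # positive matrix so its row/column sums converge to the given marginals
    # (which must share the same total mass).
    P = np.array(M, dtype=float)
    for _ in range(n_iters):
        P *= (row_marginals / (P.sum(axis=1) + eps))[:, None]
        P *= (col_marginals / (P.sum(axis=0) + eps))[None, :]
    return P

# Example: project a positive 4x4 score matrix onto the doubly stochastic set.
scores = np.exp(np.random.randn(4, 4))
P = sinkhorn_normalize(scores, np.ones(4), np.ones(4))
```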
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.