Exploring the loss landscape of regularized neural networks via convex duality
- URL: http://arxiv.org/abs/2411.07729v1
- Date: Tue, 12 Nov 2024 11:41:38 GMT
- Title: Exploring the loss landscape of regularized neural networks via convex duality
- Authors: Sungyoon Kim, Aaron Mishkin, Mert Pilanci
- Abstract summary: We discuss several aspects of the loss landscape of regularized neural networks.
We first characterize the solution set of the convex problem using its dual and further characterize all stationary points.
We show that the solution set characterization and connectivity results can be extended to different architectures.
- Abstract: We discuss several aspects of the loss landscape of regularized neural networks: the structure of stationary points, the connectivity of optimal solutions, the existence of paths with nonincreasing loss to arbitrary global optima, and the nonuniqueness of optimal solutions, by casting the problem as an equivalent convex problem and considering its dual. Starting from two-layer neural networks with scalar output, we first characterize the solution set of the convex problem using its dual and further characterize all stationary points. With this characterization, we show that the topology of the global optima goes through a phase transition as the width of the network changes, and construct counterexamples where the problem may have a continuum of optimal solutions. Finally, we show that the solution-set characterization and connectivity results extend to different architectures, including two-layer vector-valued neural networks and parallel three-layer neural networks.
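For context, here is a minimal sketch (our notation, assuming the paper follows the standard two-layer convex reformulation of Pilanci and Ergen) of the convex problem and dual that the abstract refers to, for scalar-output ReLU networks with weight decay lambda on data X with labels y:

```latex
% Nonconvex two-layer ReLU training with weight decay:
\[
  \min_{\{u_j,\alpha_j\}}\;
  \tfrac12\Big\|\textstyle\sum_{j=1}^m (X u_j)_+\,\alpha_j - y\Big\|_2^2
  + \tfrac{\lambda}{2}\sum_{j=1}^m\big(\|u_j\|_2^2 + \alpha_j^2\big).
\]
% With D_i = diag(1[X g_i >= 0]) ranging over the finitely many ReLU
% activation patterns of X, this is equivalent (for sufficiently large
% width m) to a convex group-lasso program with cone constraints:
\[
  \min_{\{v_i,w_i\}}\;
  \tfrac12\Big\|\textstyle\sum_i D_i X (v_i - w_i) - y\Big\|_2^2
  + \lambda\sum_i\big(\|v_i\|_2 + \|w_i\|_2\big)
  \;\;\text{s.t.}\;\; (2D_i - I)X v_i \ge 0,\; (2D_i - I)X w_i \ge 0.
\]
% For squared loss, the dual used to characterize the solution set
% takes the form
\[
  \max_{z}\; \tfrac12\|y\|_2^2 - \tfrac12\|y - z\|_2^2
  \;\;\text{s.t.}\;\; \big|z^\top (X u)_+\big| \le \lambda
  \;\;\forall\, \|u\|_2 \le 1.
\]
```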
Related papers
- On permutation symmetries in Bayesian neural network posteriors: a
variational perspective [8.310462710943971]
We show that there is essentially no loss barrier between the local solutions of gradient descent.
This raises questions for approximate inference in Bayesian neural networks.
We propose a matching algorithm to search for linearly connected solutions; a generic weight-matching sketch appears after this list.
arXiv Detail & Related papers (2023-10-16T08:26:50Z) - Optimal Sets and Solution Paths of ReLU Networks [56.40911684005949]
We develop an analytical framework to characterize the set of optimal ReLU networks.
We establish conditions for the solution path of ReLU networks to be continuous, and develop sensitivity results for minimal ReLU networks.
arXiv Detail & Related papers (2023-05-31T18:48:16Z) - Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via stochastic gradient descent (SGD).
We show that SGD is biased towards a simple solution.
We also provide empirical evidence that knots may occur at locations distinct from the data points.
arXiv Detail & Related papers (2021-11-03T15:14:20Z) - Path Regularization: A Convexity and Sparsity Inducing Regularization
for Parallel ReLU Networks [75.33431791218302]
We study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape.
We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases.
arXiv Detail & Related papers (2021-10-18T18:00:36Z) - Vector-output ReLU Neural Network Problems are Copositive Programs:
Convex Analysis of Two Layer Networks and Polynomial-time Algorithms [29.975118690758126]
We describe the convex semi-infinite dual of the two-layer vector-output ReLU neural network training problem.
We provide a solution which is guaranteed to be exact for certain classes of problems.
arXiv Detail & Related papers (2020-12-24T17:03:30Z) - The Hidden Convex Optimization Landscape of Two-Layer ReLU Neural
Networks: an Exact Characterization of the Optimal Solutions [51.60996023961886]
We prove that finding all globally optimal two-layer ReLU neural networks can be performed by solving a convex optimization program with cone constraints.
Our analysis is novel, characterizes all optimal solutions, and does not leverage duality-based analysis which was recently used to lift neural network training into convex spaces.
arXiv Detail & Related papers (2020-06-10T15:38:30Z) - PFNN: A Penalty-Free Neural Network Method for Solving a Class of
Second-Order Boundary-Value Problems on Complex Geometries [4.620110353542715]
We present PFNN, a penalty-free neural network method, to solve a class of second-order boundary-value problems.
PFNN is superior to several existing approaches in terms of accuracy, flexibility and robustness.
arXiv Detail & Related papers (2020-04-14T13:36:14Z) - Convex Geometry and Duality of Over-parameterized Neural Networks [70.15611146583068]
We develop a convex analytic approach to analyze finite width two-layer ReLU networks.
We show that optimal solutions of the regularized training problem can be characterized as the extreme points of a convex set.
In higher dimensions, we show that the training problem can be cast as a finite dimensional convex problem with infinitely many constraints.
arXiv Detail & Related papers (2020-02-25T23:05:33Z)
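To make the matching idea from the permutation-symmetries paper above concrete, here is a minimal re-basin-style sketch under our own assumptions (hypothetical function name; the paper's actual algorithm is variational and more involved): it permutes the hidden units of one two-layer ReLU network to align with another's by solving a linear assignment problem.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_hidden_units(W1_a, w2_a, W1_b, w2_b):
    """Permute network B's hidden units to align with network A's.

    W1_*: (m, d) first-layer weights; w2_*: (m,) second-layer weights.
    Two-layer ReLU nets f(x) = sum_j w2[j] * relu(W1[j] @ x) are invariant
    under permutations of the hidden units, so we pick the permutation of
    B's units maximizing weight similarity to A's via linear assignment.
    """
    # Describe each hidden unit by its incoming and outgoing weights.
    feats_a = np.hstack([W1_a, w2_a[:, None]])  # (m, d + 1)
    feats_b = np.hstack([W1_b, w2_b[:, None]])  # (m, d + 1)
    # Negate similarities: linear_sum_assignment minimizes total cost.
    cost = -feats_a @ feats_b.T
    _, perm = linear_sum_assignment(cost)
    return W1_b[perm], w2_b[perm]
```

With the units aligned, evaluating the training loss along (1 - t) * theta_a + t * theta_b for t in [0, 1] tests whether a barrier remains between the two solutions.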
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.