The star-shaped space of solutions of the spherical negative perceptron
- URL: http://arxiv.org/abs/2305.10623v2
- Date: Tue, 5 Sep 2023 16:34:10 GMT
- Title: The star-shaped space of solutions of the spherical negative perceptron
- Authors: Brandon Livio Annesi, Clarissa Lauditi, Carlo Lucibello, Enrico M.
Malatesta, Gabriele Perugini, Fabrizio Pittorino and Luca Saglietti
- Abstract summary: We show that low-energy configurations are often found in complex connected structures.
We identify a subset of atypical high-margin solutions that are geodesically connected with most other solutions.
- Score: 4.511197686627054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Empirical studies on the landscape of neural networks have shown that
low-energy configurations are often found in complex connected structures,
where zero-energy paths between pairs of distant solutions can be constructed.
Here we consider the spherical negative perceptron, a prototypical non-convex
neural network model framed as a continuous constraint satisfaction problem. We
introduce a general analytical method for computing energy barriers in the
simplex with vertex configurations sampled from the equilibrium. We find that
in the over-parameterized regime the solution manifold displays simple
connectivity properties. There exists a large geodesically convex component
that is attractive for a wide range of optimization dynamics. Inside this
region we identify a subset of atypical high-margin solutions that are
geodesically connected with most other solutions, giving rise to a star-shaped
geometry. We analytically characterize the organization of the connected space
of solutions and show numerical evidence of a transition, at larger constraint
densities, where the aforementioned simple geodesic connectivity breaks down.
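Below is a minimal numerical sketch, not the paper's analytical replica-based method, of the kind of geodesic-connectivity check the abstract describes: it finds two independent solutions of a spherical negative perceptron with a toy projected-gradient solver and then evaluates the energy (number of violated constraints) along the great-circle path between them. All concrete choices (N, alpha = P/N, kappa, the hinge-loss solver, numpy) are illustrative assumptions rather than values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (hypothetical) problem sizes: N weights, P = alpha * N random patterns,
# and a negative margin kappa < 0, as in the negative perceptron setting.
N, alpha, kappa = 200, 1.5, -0.5
P = int(alpha * N)
X = rng.standard_normal((P, N))  # random input patterns xi_mu

def margins(w):
    # Per-pattern stabilities w . xi_mu / sqrt(N) for a weight vector on the sphere |w|^2 = N.
    return X @ w / np.sqrt(N)

def energy(w):
    # Energy = number of violated constraints (stability below kappa).
    return int(np.sum(margins(w) < kappa))

def find_solution(steps=2000, lr=0.05):
    # Toy solver: projected gradient descent on a hinge loss, re-projecting onto the sphere.
    w = rng.standard_normal(N)
    w *= np.sqrt(N) / np.linalg.norm(w)
    for _ in range(steps):
        violated = margins(w) < kappa
        if not violated.any():
            break
        # Gradient of sum_mu max(0, kappa - w.xi_mu / sqrt(N)) restricted to violated patterns.
        grad = -X[violated].sum(axis=0) / np.sqrt(N)
        w -= lr * grad
        w *= np.sqrt(N) / np.linalg.norm(w)
    return w

def slerp(w1, w2, t):
    # Great-circle (geodesic) interpolation between two points on the sphere of radius sqrt(N).
    u, v = w1 / np.linalg.norm(w1), w2 / np.linalg.norm(w2)
    theta = np.arccos(np.clip(u @ v, -1.0, 1.0))
    if np.isclose(np.sin(theta), 0.0):
        return w1.copy()
    w = (np.sin((1.0 - t) * theta) * u + np.sin(t * theta) * v) / np.sin(theta)
    return w * np.sqrt(N)

w_a, w_b = find_solution(), find_solution()
path_energies = [energy(slerp(w_a, w_b, t)) for t in np.linspace(0.0, 1.0, 21)]
print("endpoint energies:", energy(w_a), energy(w_b))
print("max energy along the geodesic:", max(path_energies))
```

If the printed maximum along the path is zero, the two sampled solutions are joined by a zero-energy geodesic, the behavior the abstract attributes to the over-parameterized regime; at larger constraint densities one would instead expect barriers to appear along such paths.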
Related papers
- High-dimensional manifold of solutions in neural networks: insights from
statistical physics [0.0]
I review the statistical mechanics approach to neural networks, focusing on the paradigmatic example of the perceptron architecture with binary and continuous weights.
I discuss some recent works that unveiled how the zero training error configurations are geometrically arranged.
arXiv Detail & Related papers (2023-09-17T11:10:25Z)
- Sample Complexity for Quadratic Bandits: Hessian Dependent Bounds and
Optimal Algorithms [64.10576998630981]
We show the first tight characterization of the optimal Hessian-dependent sample complexity.
A Hessian-independent algorithm universally achieves the optimal sample complexities for all Hessian instances.
The optimal sample complexities achieved by our algorithm remain valid for heavy-tailed noise distributions.
arXiv Detail & Related papers (2023-06-21T17:03:22Z)
- Typical and atypical solutions in non-convex neural networks with
discrete and continuous weights [2.7127628066830414]
We study the binary and continuous negative-margin perceptrons as simple non-convex neural network models learning random rules and associations.
Both models exhibit subdominant minimizers which are extremely flat and wide.
For both models, the generalization performance as a learning device is shown to be greatly improved by the existence of wide flat minimizers.
arXiv Detail & Related papers (2023-04-26T23:34:40Z)
- Improved Training of Physics-Informed Neural Networks with Model
Ensembles [81.38804205212425]
We propose to expand the solution interval gradually to make the PINN converge to the correct solution.
All ensemble members converge to the same solution in the vicinity of observed data.
We show experimentally that the proposed method can improve the accuracy of the found solution.
arXiv Detail & Related papers (2022-04-11T14:05:34Z)
- Message Passing Neural PDE Solvers [60.77761603258397]
We build a neural message passing solver, replacing all heuristically designed components in the graph with backprop-optimized neural function approximators.
We show that neural message passing solvers representationally contain some classical methods, such as finite differences, finite volumes, and WENO schemes.
We validate our method on various fluid-like flow problems, demonstrating fast, stable, and accurate performance across different domain topologies, equation parameters, discretizations, etc., in 1D and 2D.
arXiv Detail & Related papers (2022-02-07T17:47:46Z)
- Deep Networks on Toroids: Removing Symmetries Reveals the Structure of
Flat Regions in the Landscape Geometry [3.712728573432119]
We develop a standardized parameterization in which all symmetries are removed, resulting in a toroidal topology.
We derive a meaningful notion of the flatness of minimizers and of the geodesic paths connecting them.
We also find that minimizers found by variants of gradient descent can be connected by zero-error paths with a single bend.
arXiv Detail & Related papers (2022-02-07T09:57:54Z)
- Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations.
We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
arXiv Detail & Related papers (2021-10-28T14:59:41Z)
- Deep Magnification-Flexible Upsampling over 3D Point Clouds [103.09504572409449]
We propose a novel end-to-end learning-based framework to generate dense point clouds.
We first formulate the problem explicitly, which boils down to determining the weights and high-order approximation errors.
Then, we design a lightweight neural network to adaptively learn unified and sorted weights as well as the high-order refinements.
arXiv Detail & Related papers (2020-11-25T14:00:18Z)
- Convex Geometry and Duality of Over-parameterized Neural Networks [70.15611146583068]
We develop a convex analytic approach to analyze finite width two-layer ReLU networks.
We show that an optimal solution to the regularized training problem can be characterized as extreme points of a convex set.
In higher dimensions, we show that the training problem can be cast as a finite dimensional convex problem with infinitely many constraints.
arXiv Detail & Related papers (2020-02-25T23:05:33Z)
- Properties of the geometry of solutions and capacity of multi-layer neural networks with Rectified Linear Units activations [2.3018169548556977]
We study the effects of Rectified Linear Units on the capacity and on the geometrical landscape of the solution space in two-layer neural networks.
We find that, quite unexpectedly, the capacity of the network remains finite as the number of neurons in the hidden layer increases.
Possibly more important, a large deviation approach allows us to find that the geometrical landscape of the solution space has a peculiar structure.
arXiv Detail & Related papers (2019-07-17T15:23:17Z)