High-dimensional manifold of solutions in neural networks: insights from
statistical physics
- URL: http://arxiv.org/abs/2309.09240v1
- Date: Sun, 17 Sep 2023 11:10:25 GMT
- Title: High-dimensional manifold of solutions in neural networks: insights from
statistical physics
- Authors: Enrico M. Malatesta
- Abstract summary: I review the statistical mechanics approach to neural networks, focusing on the paradigmatic example of the perceptron architecture with binary and continuous weights.
I discuss some recent works that unveiled how the zero training error configurations are geometrically arranged.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In these pedagogic notes I review the statistical mechanics approach to
neural networks, focusing on the paradigmatic example of the perceptron
architecture with binary and continuous weights, in the classification setting.
I will review Gardner's approach based on the replica method and the derivation
of the SAT/UNSAT transition in the storage setting. Then, I discuss some recent
works that unveiled how the zero training error configurations are
geometrically arranged, and how this arrangement changes as the size of the
training set increases. I also illustrate how different regions of solution
space can be explored analytically and how the landscape in the vicinity of a
solution can be characterized. I give evidence of how, in binary weight models,
algorithmic hardness is a consequence of the disappearance of a clustered
region of solutions that extends to very large distances. Finally, I
demonstrate how the study of linear mode connectivity between solutions can
give insights into the average shape of the solution manifold.
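As a concrete illustration of the storage setting and of the linear mode connectivity probe discussed above, the following minimal sketch (a toy example, not taken from the notes; the spherical perceptron, the classic perceptron update rule and the sizes N and P are illustrative assumptions) trains two independent zero-error solutions on the same random patterns and measures the training error along the straight path between them.

import numpy as np

rng = np.random.default_rng(0)
N, P = 200, 100                        # N weights, P = alpha*N random patterns (alpha = 0.5)
X = rng.standard_normal((P, N))        # random input patterns
y = rng.choice([-1.0, 1.0], size=P)    # random binary labels (storage setting)

def train_error(w):
    return np.mean(y * (X @ w) <= 0)

def train_perceptron(seed, max_epochs=10000):
    # Classic perceptron rule constrained to the sphere; returns a zero-error solution if found.
    r = np.random.default_rng(seed)
    w = r.standard_normal(N)
    w /= np.linalg.norm(w)
    for _ in range(max_epochs):
        wrong = np.where(y * (X @ w) <= 0)[0]
        if wrong.size == 0:            # zero training error reached
            return w
        i = r.choice(wrong)            # update on one misclassified pattern
        w += y[i] * X[i] / N
        w /= np.linalg.norm(w)         # project back onto the sphere
    return w

w1, w2 = train_perceptron(1), train_perceptron(2)
print("endpoint errors:", train_error(w1), train_error(w2))

# Linear mode connectivity: training error along the normalized interpolation path.
for gamma in np.linspace(0.0, 1.0, 11):
    w = (1 - gamma) * w1 + gamma * w2
    w /= np.linalg.norm(w)
    print(f"gamma={gamma:.1f}  train error={train_error(w):.3f}")

For the zero-margin spherical perceptron the zero-error set is the intersection of a convex cone with the sphere, so the error stays at zero along the whole path; richer behavior (barriers, or flatness only within a cluster) appears for negative margins and binary weights, the regimes discussed in the notes.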
Related papers
- Space-Variant Total Variation boosted by learning techniques in few-view tomographic imaging [0.0]
This paper focuses on the development of a space-variant regularization model for solving an under-determined linear inverse problem.
The primary objective of the proposed model is to achieve a good balance between denoising and the preservation of fine details and edges.
A convolutional neural network is designed to approximate both the ground truth image and its gradient, using an elastic loss function in its training.
arXiv Detail & Related papers (2024-04-25T08:58:41Z) - Approximation Theory, Computing, and Deep Learning on the Wasserstein Space [0.5735035463793009]
We address the challenge of approximating functions in infinite-dimensional spaces from finite samples.
Our focus is on the Wasserstein distance function, which serves as a relevant example.
We adopt three machine learning-based approaches to define functional approximants.
arXiv Detail & Related papers (2023-10-30T13:59:47Z) - Unsupervised Learning of Invariance Transformations [105.54048699217668]
We develop an algorithmic framework for finding approximate graph automorphisms.
We discuss how this framework can be used to find approximate automorphisms in weighted graphs in general.
arXiv Detail & Related papers (2023-07-24T17:03:28Z) - The star-shaped space of solutions of the spherical negative perceptron [4.511197686627054]
We show that low-energy configurations are often found in complex connected structures.
We identify a subset of atypical high-margin solutions that are connected with most other solutions.
arXiv Detail & Related papers (2023-05-18T00:21:04Z) - Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via gradient descent.
We show that SGD is biased towards a simple solution.
We also provide empirical evidence that knots at locations distinct from the data points might occur (a toy illustration of this appears after this list of related papers).
arXiv Detail & Related papers (2021-11-03T15:14:20Z) - Physics informed neural networks for continuum micromechanics [68.8204255655161]
Recently, physics informed neural networks have successfully been applied to a broad variety of problems in applied mathematics and engineering.
Due to the global approximation, physics informed neural networks have difficulties in displaying localized effects and strong non-linear solutions by optimization.
It is shown that the domain decomposition approach is able to accurately resolve nonlinear stress, displacement and energy fields in heterogeneous microstructures obtained from real-world $\mu$CT scans.
arXiv Detail & Related papers (2021-10-14T14:05:19Z) - Learning through atypical "phase transitions" in overparameterized
neural networks [0.43496401697112685]
Current deep neural networks are highly overparameterized (up to billions of connection weights) and nonlinear.
Yet they can fit data almost perfectly through gradient descent algorithms and achieve unexpectedly accurate predictions.
These are formidable results that pose conceptual challenges for classical theories of generalization.
arXiv Detail & Related papers (2021-10-01T23:28:07Z) - GELATO: Geometrically Enriched Latent Model for Offline Reinforcement
Learning [54.291331971813364]
Offline reinforcement learning approaches can be divided into proximal and uncertainty-aware methods.
In this work, we demonstrate the benefit of combining the two in a latent variational model.
Our proposed metrics measure both the quality of out-of-distribution samples and the discrepancy of examples in the data.
arXiv Detail & Related papers (2021-02-22T19:42:40Z) - Deep Magnification-Flexible Upsampling over 3D Point Clouds [103.09504572409449]
We propose a novel end-to-end learning-based framework to generate dense point clouds.
We first formulate the problem explicitly, which boils down to determining the weights and high-order approximation errors.
Then, we design a lightweight neural network to adaptively learn unified and sorted weights as well as the high-order refinements.
arXiv Detail & Related papers (2020-11-25T14:00:18Z) - MetaSDF: Meta-learning Signed Distance Functions [85.81290552559817]
Generalizing across shapes with neural implicit representations amounts to learning priors over the respective function space.
We formalize learning of a shape space as a meta-learning problem and leverage gradient-based meta-learning algorithms to solve this task.
arXiv Detail & Related papers (2020-06-17T05:14:53Z) - Properties of the geometry of solutions and capacity of multi-layer neural networks with Rectified Linear Units activations [2.3018169548556977]
We study the effects of Rectified Linear Units on the capacity and on the geometrical landscape of the solution space in two-layer neural networks.
We find that, quite unexpectedly, the capacity of the network remains finite as the number of neurons in the hidden layer increases.
Possibly more important, a large deviation approach allows us to find that the geometrical landscape of the solution space has a peculiar structure.
arXiv Detail & Related papers (2019-07-17T15:23:17Z)
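Toy illustration referenced in the entry above on piecewise linear solutions of wide ReLU networks: a minimal sketch (my own assumptions, plain full-batch gradient descent rather than the paper's mean-field analysis) that fits a wide univariate two-layer ReLU network to a few points and lists the knots x = -b_i / w_i of the learned piecewise linear function, so that they can be compared with the data points.

import numpy as np

rng = np.random.default_rng(0)
x = np.array([-1.0, -0.3, 0.2, 0.9])       # a few 1D training inputs
t = np.array([ 0.5, -0.2, 0.1, 0.8])       # target values
m = 200                                    # hidden width

w = rng.standard_normal(m) / np.sqrt(m)    # input weights
b = np.zeros(m)                            # biases (all knots start at x = 0)
a = rng.standard_normal(m) / np.sqrt(m)    # output weights
lr = 0.05

def forward(xs):
    pre = np.outer(xs, w) + b              # (n, m) pre-activations
    return np.maximum(pre, 0.0) @ a, pre

for _ in range(20000):                     # full-batch gradient descent on the squared loss
    out, pre = forward(x)
    err = out - t
    act = (pre > 0).astype(float)          # ReLU derivative
    h = np.maximum(pre, 0.0)
    grad_a = h.T @ err / len(x)
    grad_w = ((err[:, None] * act * a) * x[:, None]).sum(0) / len(x)
    grad_b = (err[:, None] * act * a).sum(0) / len(x)
    a -= lr * grad_a; w -= lr * grad_w; b -= lr * grad_b

mask = np.abs(w) > 1e-8
knots = -b[mask] / w[mask]                 # points where a hidden unit switches on/off
inside = knots[(knots > x.min()) & (knots < x.max())]
print("data points:          ", x)
print("distinct knots (2 dp):", np.unique(np.round(inside, 2)))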