Equivalence between algorithmic instability and transition to replica
symmetry breaking in perceptron learning systems
- URL: http://arxiv.org/abs/2111.13302v1
- Date: Fri, 26 Nov 2021 03:23:18 GMT
- Title: Equivalence between algorithmic instability and transition to replica
symmetry breaking in perceptron learning systems
- Authors: Yang Zhao, Junbin Qiu, Mingshan Xie, Haiping Huang
- Abstract summary: Binary perceptron is a model of supervised learning for non-convex optimization.
We show that the instability condition around the algorithmic fixed point is identical to the instability for breaking the replica-symmetric saddle point of the free energy.
- Score: 16.065867388984078
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The binary perceptron is a fundamental model of supervised learning for
non-convex optimization, which lies at the root of popular deep learning. The binary
perceptron can classify random high-dimensional data by computing the marginal
probabilities of its binary synapses. The relationship
between the algorithmic instability and the equilibrium analysis of the model
remains elusive. Here, we establish the relationship by showing that the
instability condition around the algorithmic fixed point is identical to the
instability for breaking the replica symmetric saddle point solution of the
free energy function. Therefore, our analysis provides insights towards
bridging the gap between non-convex learning dynamics and statistical mechanics
properties of more complex neural networks.
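The abstract's notion of "computing the marginal probabilities of binary synapses" can be made concrete with a toy example. The paper itself does this with a message-passing algorithm at large system size; the sketch below is only a brute-force illustration on a small teacher-student instance, and the sizes, the uniform posterior over zero-error weight vectors, and all variable names are illustrative assumptions, not the paper's setup.

```python
# Brute-force marginals of binary synapses for a tiny binary perceptron
# (illustration only; NOT the paper's message-passing algorithm).
import itertools
import numpy as np

rng = np.random.default_rng(0)
N, P = 11, 15                            # synapses (odd so fields never vanish), patterns
X = rng.choice([-1, 1], size=(P, N))     # random high-dimensional +/-1 inputs
w_teacher = rng.choice([-1, 1], size=N)
y = np.sign(X @ w_teacher)               # labels produced by a teacher perceptron

Z = 0                                    # number of weight vectors with zero training error
marginal_plus = np.zeros(N)              # accumulates P(w_i = +1 | data)
for w in itertools.product([-1, 1], repeat=N):
    w = np.array(w)
    if np.all(np.sign(X @ w) == y):      # indicator: w classifies every pattern correctly
        Z += 1
        marginal_plus += (w == 1)

marginal_plus /= Z
print("compatible weight vectors:", Z)
print("marginals P(w_i = +1):", np.round(marginal_plus, 3))
```

The message-passing and replica machinery studied in the paper replaces this exponential enumeration when the dimension is large; the stability of its fixed point is the quantity the paper relates to replica symmetry breaking.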
Related papers
- Two Tales of Single-Phase Contrastive Hebbian Learning [9.84489449520821]
We show that it is possible for a fully local learning algorithm named "dual propagation" to bridge the performance gap to backpropagation.
The algorithm has the drawback that its numerical stability relies on symmetric nudging, which may be restrictive in biological and analog implementations.
arXiv Detail & Related papers (2024-02-13T16:21:18Z)
- Non-Parametric Learning of Stochastic Differential Equations with Non-asymptotic Fast Rates of Convergence [65.63201894457404]
We propose a novel non-parametric learning paradigm for the identification of drift and diffusion coefficients of non-linear stochastic differential equations.
The key idea essentially consists of fitting an RKHS-based approximation of the corresponding Fokker-Planck equation to observations of the process.
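The key idea above is an RKHS fit of the Fokker-Planck equation; as a much simpler stand-in, the sketch below performs kernel ridge regression of the drift directly from path increments of a one-dimensional SDE. This is not the paper's estimator, and the dynamics, kernel bandwidth h, and regularization lam are illustrative assumptions.

```python
# Kernel ridge regression of a 1-d drift from Euler-Maruyama increments
# (simplified stand-in; NOT the Fokker-Planck-based estimator of the paper).
import numpy as np

rng = np.random.default_rng(1)
dt, T, sigma = 1e-2, 2000, 0.7
b_true = lambda x: x - x**3              # double-well drift of dX = b(X) dt + sigma dW

x = np.empty(T)
x[0] = 0.0
for t in range(T - 1):                   # simulate one path
    x[t + 1] = x[t] + b_true(x[t]) * dt + sigma * np.sqrt(dt) * rng.standard_normal()

# Increments divided by dt are noisy evaluations of the drift at x[t].
X_train, y_train = x[:-1], (x[1:] - x[:-1]) / dt

def gauss_kernel(a, b, h=0.3):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * h ** 2))

lam = 5.0                                # ridge regularization
K = gauss_kernel(X_train, X_train)
coef = np.linalg.solve(K + lam * np.eye(len(X_train)), y_train)

grid = np.linspace(-1.5, 1.5, 7)
print(np.round(np.c_[grid, gauss_kernel(grid, X_train) @ coef, b_true(grid)], 2))
```

The estimate is rough because the increment targets have variance of order sigma**2 / dt; it is meant only to make the non-parametric drift-learning setting concrete.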
arXiv Detail & Related papers (2023-05-24T20:43:47Z)
- Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
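The "build and train a differentiable model only once, then invert it" idea can be sketched in a few lines. The example below is a toy stand-in, not the paper's framework: the simulator, the polynomial surrogate (standing in for a neural network with automatic differentiation), and the parameter theta are all hypothetical.

```python
# Fit a differentiable surrogate of a toy "scattering" simulator once, then recover an
# unknown parameter by gradient descent through the surrogate (illustration only).
import numpy as np

rng = np.random.default_rng(2)

def simulator(theta, q):
    """Toy scattering curve: a Lorentzian whose width is the unknown parameter."""
    return 1.0 / (1.0 + (q / theta) ** 2)

q = np.linspace(0.1, 5.0, 40)                             # measurement grid
thetas = np.linspace(0.5, 3.0, 60)                        # parameters used for simulation
Y = np.stack([simulator(t, q) for t in thetas])           # simulated training data, (60, 40)
coeffs = np.polynomial.polynomial.polyfit(thetas, Y, 3)   # cubic-in-theta surrogate, (4, 40)

def surrogate(theta):
    return np.array([1.0, theta, theta**2, theta**3]) @ coeffs

def surrogate_grad(theta):                                # analytic derivative w.r.t. theta
    return np.array([0.0, 1.0, 2 * theta, 3 * theta**2]) @ coeffs

theta_star = 1.7                                          # "experimental" ground truth
data = simulator(theta_star, q) + 0.01 * rng.standard_normal(q.size)

theta, lr = 1.0, 0.5
for _ in range(200):                                      # gradient descent on squared error
    r = surrogate(theta) - data
    theta -= lr * 2 * r @ surrogate_grad(theta) / q.size
print("recovered theta:", round(theta, 3), " true theta:", theta_star)
```

Because the surrogate is fitted once, the inversion loop costs only surrogate evaluations, which is what makes real-time parameter recovery feasible.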
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
- A Causality-Based Learning Approach for Discovering the Underlying
Dynamics of Complex Systems from Partial Observations with Stochastic
Parameterization [1.2882319878552302]
This paper develops a new iterative learning algorithm for complex turbulent systems with partial observations.
It alternates between identifying model structures, recovering unobserved variables, and estimating parameters.
Numerical experiments show that the new algorithm succeeds in identifying the model structure and providing suitable parameterizations for many complex nonlinear systems.
arXiv Detail & Related papers (2022-08-19T00:35:03Z)
- On Convergence of Training Loss Without Reaching Stationary Points [62.41370821014218]
We show that neural network weight variables do not converge to stationary points where the gradient of the loss function vanishes.
We propose a new perspective based on the ergodic theory of dynamical systems.
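A minimal numerical illustration of this behavior, under assumptions far simpler than the paper's (constant-step-size SGD on a noisy least-squares problem, not a neural network): the training loss settles to a steady value while the full-batch gradient evaluated at the iterates never vanishes, because the iterates keep wandering inside a stationary distribution, which is the picture the ergodic-theory perspective formalizes.

```python
# Constant-step SGD: steady training loss, but the iterates never reach a stationary point.
import numpy as np

rng = np.random.default_rng(3)
n, d = 200, 5
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.5 * rng.standard_normal(n)   # noisy labels

w, lr = np.zeros(d), 0.05
for t in range(20000):
    i = rng.integers(n)                       # single-sample stochastic gradient
    w -= lr * (A[i] @ w - b[i]) * A[i]
    if (t + 1) % 5000 == 0:
        loss = 0.5 * np.mean((A @ w - b) ** 2)
        grad_norm = np.linalg.norm(A.T @ (A @ w - b) / n)
        print(f"step {t+1:6d}  loss {loss:.3f}  full-gradient norm {grad_norm:.3f}")
```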
arXiv Detail & Related papers (2021-10-12T18:12:23Z)
- A Theoretical Overview of Neural Contraction Metrics for Learning-based
Control with Guaranteed Stability [7.963506386866862]
This paper presents a neural network model of an optimal contraction metric and corresponding differential Lyapunov function.
Its innovation lies in providing formal robustness guarantees for learning-based control frameworks.
arXiv Detail & Related papers (2021-10-02T00:28:49Z)
- Contraction Theory for Nonlinear Stability Analysis and Learning-based Control: A Tutorial Overview [17.05002635077646]
Contraction theory is an analytical tool to study differential dynamics of a non-autonomous (i.e., time-varying) nonlinear system.
Its nonlinear stability analysis boils down to finding a suitable contraction metric that satisfies a stability condition expressed as a linear matrix inequality.
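For readers unfamiliar with the condition being referred to, the standard statement (quoted from general contraction theory, not from this specific paper) is: the system \dot{x} = f(x, t) is contracting at rate \alpha > 0 if there exists a uniformly positive definite metric M(x, t) \succ 0 such that

\dot{M} + M \frac{\partial f}{\partial x} + \left( \frac{\partial f}{\partial x} \right)^{\top} M \preceq -2 \alpha M,

in which case any two trajectories converge toward each other exponentially with rate \alpha. The left-hand side is affine in M, which is what makes the linear-matrix-inequality formulation mentioned above possible.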
arXiv Detail & Related papers (2021-10-01T23:03:21Z) - Exact solutions of interacting dissipative systems via weak symmetries [77.34726150561087]
We analytically diagonalize the Liouvillian of a class of Markovian dissipative systems with arbitrarily strong interactions or nonlinearity.
This enables an exact description of the full dynamics and dissipative spectrum.
Our method is applicable to a variety of other systems, and could provide a powerful new tool for the study of complex driven-dissipative quantum systems.
arXiv Detail & Related papers (2021-09-27T17:45:42Z) - Statistical optimality and stability of tangent transform algorithms in
logit models [6.9827388859232045]
We provide conditions on the data-generating process to derive non-asymptotic upper bounds on the risk incurred by the logistic optima.
In particular, we establish local variation of the algorithm without any assumptions on the data-generating process.
We explore a special case involving a semi-orthogonal design under which a global convergence is obtained.
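The "tangent transform" in this setting ordinarily refers to the Jaakkola-Jordan quadratic lower bound on the logistic likelihood; for reference (quoted from the standard literature, not from this paper), with \sigma(x) = 1 / (1 + e^{-x}) and a variational parameter \xi,

\sigma(x) \ge \sigma(\xi) \exp\!\left( \frac{x - \xi}{2} - \lambda(\xi)\,(x^{2} - \xi^{2}) \right), \qquad \lambda(\xi) = \frac{1}{4\xi} \tanh\!\left( \frac{\xi}{2} \right),

with equality at x = \pm\xi. The bound is quadratic in x, so it turns the logit likelihood into a Gaussian-like surrogate that tangent transform algorithms iterate on.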
arXiv Detail & Related papers (2020-10-25T05:15:13Z) - Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of stochasticity in its success is still unclear.
We show that multiplicative noise commonly arises in the parameter updates due to the variance of the stochastic gradients, and that it leads to heavy-tailed behavior.
A detailed analysis is conducted of the key factors, including step size and data, and similar behavior is observed on state-of-the-art neural network models.
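The mechanism in the title, multiplicative noise producing heavy tails, can be reproduced with a generic Kesten-type recursion; the sketch below illustrates that mechanism only, not the paper's SGD model, and all constants are assumptions.

```python
# A contractive-on-average recursion with multiplicative noise develops power-law tails.
import numpy as np

rng = np.random.default_rng(4)
n_chains, n_steps = 20000, 1000
x = np.zeros(n_chains)
for _ in range(n_steps):
    a = np.exp(0.25 * rng.standard_normal(n_chains) - 0.1)   # E[log a] = -0.1 < 0
    b = rng.standard_normal(n_chains)
    x = a * x + b                                             # multiplicative plus additive noise

# Kesten-type result: the stationary tail is a power law with index alpha solving
# E[a**alpha] = 1; here alpha = 2 * 0.1 / 0.25**2 = 3.2, far heavier than Gaussian.
q90, q999 = np.quantile(np.abs(x), [0.90, 0.999])
print("quantile ratio q99.9 / q90:", round(q999 / q90, 2), "(about 2.0 for a Gaussian)")
```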
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
- Semiparametric Nonlinear Bipartite Graph Representation Learning with
Provable Guarantees [106.91654068632882]
We consider bipartite graphs and formalize their representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution.
We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves a linear convergence rate.
Our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.
arXiv Detail & Related papers (2020-03-02T16:40:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.