Complexity from Adaptive-Symmetries Breaking: Global Minima in the
Statistical Mechanics of Deep Neural Networks
- URL: http://arxiv.org/abs/2201.07934v1
- Date: Mon, 3 Jan 2022 09:06:44 GMT
- Title: Complexity from Adaptive-Symmetries Breaking: Global Minima in the
Statistical Mechanics of Deep Neural Networks
- Authors: Shawn W. M. Li
- Abstract summary: Adaptive symmetry, a concept antithetical to conservative symmetry in physics, is proposed to understand deep neural networks (DNNs).
We characterize the optimization process of a DNN system as an extended adaptive-symmetry-breaking process.
More specifically, this process is characterized by a statistical-mechanical model that can be viewed as a generalization of statistical physics.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adaptive symmetry, a concept antithetical to conservative symmetry in
physics, is proposed to understand deep neural networks (DNNs). It
characterizes the invariance of variance, whereby a biotic system explores
different pathways of evolution with equal probability in the absence of feedback
signals, and complex functional structure emerges from the quantitative
accumulation of adaptive-symmetry breakings in response to feedback signals.
Theoretically and experimentally, we characterize the optimization process of a
DNN system as an extended adaptive-symmetry-breaking process. One particular
finding is that a hierarchically large DNN would have a large reservoir of
adaptive symmetries, and when the information capacity of the reservoir exceeds
the complexity of the dataset, the system could absorb all perturbations of the
examples and self-organize into a functional structure of zero training errors
measured by a certain surrogate risk. More specifically, this process is
characterized by a statistical-mechanical model that can be viewed as a
generalization of statistical physics to the DNN-organized complex system, and
that characterizes regularities in higher dimensionality. The model consists of
three constituents that can be viewed as the counterparts of the Boltzmann
distribution, the Ising model, and conservative symmetry, respectively: (1) a
stochastic definition/interpretation of DNNs as a multilayer probabilistic
graphical model; (2) a formalism of circuits that perform biological
computation; (3) a circuit symmetry from which self-similarity between the
microscopic and the macroscopic adaptability manifests. The model is analyzed
with a method, referred to as the statistical assembly method, that analyzes the
coarse-grained behaviors (over a symmetry group) of the heterogeneous
hierarchical many-body interactions in DNNs.
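The abstract's first constituent, a stochastic interpretation of DNNs as a multilayer probabilistic graphical model, can be illustrated with a toy sketch. This is only one plausible reading (a layered network of stochastic binary units, as in a sigmoid belief network); the paper's actual formalism may differ, and all names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_layer(h_prev, W, b):
    """Sample binary units of a layer given the layer below.

    Each unit fires with probability sigmoid(h_prev @ W + b), so the
    stack of layers forms a directed multilayer graphical model.
    """
    p = sigmoid(h_prev @ W + b)
    h = (rng.random(p.shape) < p).astype(float)
    return h, p

# Two stochastic layers on a toy binary input (weights are arbitrary).
x = np.array([1.0, 0.0, 1.0])
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)
h1, p1 = sample_layer(x, W1, b1)
h2, p2 = sample_layer(h1, W2, b2)
```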
Related papers
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs)
Our approach develops within a recently introduced framework aimed at learning neural-network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards the practical use of machine-learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z) - Discovering Symmetry Breaking in Physical Systems with Relaxed Group Convolution [21.034937143252314]
We focus on learning asymmetries of data using relaxed group convolutions.
We uncover various symmetry-breaking factors that are interpretable and physically meaningful in different physical systems.
arXiv Detail & Related papers (2023-10-03T14:03:21Z) - On discrete symmetries of robotics systems: A group-theoretic and
data-driven analysis [38.92081817503126]
We study discrete morphological symmetries of dynamical systems.
These symmetries arise from the presence of one or more planes/axis of symmetry in the system's morphology.
We exploit these symmetries using data augmentation and $G$-equivariant neural networks.
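The data-augmentation side of exploiting a morphological symmetry can be sketched minimally. The example below assumes a hypothetical planar robot state (x, y, theta) with a mirror-symmetry plane at x = 0; reflecting a state across the plane yields another physically valid state, so each sample can be mirrored to enlarge the training set.

```python
import numpy as np

def reflect_state(state):
    """Apply the discrete symmetry g: mirror the state across the plane x = 0.

    Position x and heading theta flip sign; y is unchanged. Applying g
    twice recovers the original state (g is an involution).
    """
    x, y, theta = state
    return np.array([-x, y, -theta])

def augment(dataset):
    """Return the symmetry-augmented dataset: originals plus mirrored copies."""
    mirrored = [reflect_state(s) for s in dataset]
    return list(dataset) + mirrored

data = [np.array([1.0, 2.0, 0.5]), np.array([-0.3, 0.7, -0.1])]
aug = augment(data)   # twice as many training samples, at no collection cost
```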
arXiv Detail & Related papers (2023-02-21T04:10:16Z) - Identifiability and Asymptotics in Learning Homogeneous Linear ODE Systems from Discrete Observations [114.17826109037048]
Ordinary Differential Equations (ODEs) have recently gained a lot of attention in machine learning.
However, theoretical aspects, e.g., identifiability and properties of statistical estimation, are still obscure.
This paper derives a sufficient condition for the identifiability of homogeneous linear ODE systems from a sequence of equally-spaced error-free observations sampled from a single trajectory.
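The identifiability claim can be illustrated in the simplest homogeneous linear case, a scalar ODE x' = a*x (an assumption for illustration; the paper treats general systems). Equally-spaced, error-free samples from a single trajectory satisfy x_{k+1}/x_k = exp(a*dt), which pins down the generator a uniquely.

```python
import numpy as np

# Ground truth for the toy system x' = a*x with initial condition x0.
a_true, dt, x0 = -0.7, 0.5, 2.0
t = np.arange(6) * dt
x = x0 * np.exp(a_true * t)          # noiseless equally-spaced samples

# Each consecutive ratio equals exp(a*dt), so taking logs recovers a.
ratios = x[1:] / x[:-1]
a_hat = np.log(ratios).mean() / dt   # identified generator
```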
arXiv Detail & Related papers (2022-10-12T06:46:38Z) - Equivariant vector field network for many-body system modeling [65.22203086172019]
Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z) - A deep learning driven pseudospectral PCE based FFT homogenization
algorithm for complex microstructures [68.8204255655161]
It is shown that the proposed method is able to predict central moments of interest while being orders of magnitude faster to evaluate than traditional approaches.
arXiv Detail & Related papers (2021-10-26T07:02:14Z) - Sampling asymmetric open quantum systems for artificial neural networks [77.34726150561087]
We present a hybrid sampling strategy which takes asymmetric properties explicitly into account, achieving fast convergence times and high scalability for asymmetric open systems.
We highlight the universal applicability of artificial neural networks to this sampling problem.
arXiv Detail & Related papers (2020-12-20T18:25:29Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
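The min-max structure of this formulation can be sketched on a toy saddle-point problem. This is not the paper's procedure: there both players are neural networks trained by gradient descent, while here each player is a single scalar and we use the extragradient method, a standard scheme that converges on bilinear games where plain simultaneous gradient descent-ascent would spiral away.

```python
def extragradient(x, y, lr=0.1, steps=1000):
    """Solve min_x max_y f(x, y) = x * y; the unique saddle point is (0, 0).

    Each iteration takes a lookahead (extrapolation) step, then updates
    using gradients evaluated at the lookahead point.
    """
    for _ in range(steps):
        # lookahead step with current gradients: df/dx = y, df/dy = x
        x_h, y_h = x - lr * y, y + lr * x
        # update with gradients at the lookahead point
        x, y = x - lr * y_h, y + lr * x_h
    return x, y

x, y = extragradient(1.0, 1.0)   # iterates contract toward (0, 0)
```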
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Higher-order interactions in statistical physics and machine learning: A
model-independent solution to the inverse problem at equilibrium [0.0]
The inverse problem of inferring pair-wise and higher-order interactions in complex systems is fundamental to many fields.
We introduce a universal, model-independent, and fundamentally unbiased estimator of all-order symmetric interactions.
arXiv Detail & Related papers (2020-06-10T18:01:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.