Neural Networks and Quantum Field Theory
- URL: http://arxiv.org/abs/2008.08601v2
- Date: Mon, 15 Mar 2021 16:52:31 GMT
- Title: Neural Networks and Quantum Field Theory
- Authors: James Halverson, Anindita Maiti, and Keegan Stoner
- Abstract summary: We propose a theoretical understanding of neural networks in terms of Wilsonian effective field theory.
The correspondence relies on the fact that many neural networks are drawn from Gaussian processes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a theoretical understanding of neural networks in terms of
Wilsonian effective field theory. The correspondence relies on the fact that
many asymptotic neural networks are drawn from Gaussian processes, the analog
of non-interacting field theories. Moving away from the asymptotic limit yields
a non-Gaussian process and corresponds to turning on particle interactions,
allowing for the computation of correlation functions of neural network outputs
with Feynman diagrams. Minimal non-Gaussian process likelihoods are determined
by the most relevant non-Gaussian terms, according to the flow in their
coefficients induced by the Wilsonian renormalization group. This yields a
direct connection between overparameterization and simplicity of neural network
likelihoods. Whether the coefficients are constants or functions may be
understood in terms of GP limit symmetries, as expected from 't Hooft's
technical naturalness. General theoretical calculations are matched to neural
network experiments in the simplest class of models allowing the
correspondence. Our formalism is valid for any of the many architectures that
become GPs in an asymptotic limit, a property preserved under certain types
of training.
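
The paper's central quantitative claim, that finite width turns on interactions whose strength is controlled by 1/N, can be probed numerically by measuring connected correlation functions of network outputs. Below is a minimal Python sketch, not the authors' code: the ReLU activation, the 1/sqrt(fan-in) parameterization, and all names are illustrative assumptions. It samples single-hidden-layer networks at several widths and estimates the connected four-point function of the output at a fixed input, a quantity that vanishes exactly for a Gaussian process and should shrink roughly like 1/N as the width grows.

```python
import numpy as np

def sample_outputs(n_nets, width, x, sigma_w=1.0, seed=0):
    """Sample scalar outputs f(x) of random single-hidden-layer ReLU networks."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    # GP-limit parameterization assumed here: weight variances scale as 1/fan-in.
    W = rng.normal(0.0, sigma_w / np.sqrt(d), size=(n_nets, width, d))
    b = rng.normal(0.0, 1.0, size=(n_nets, width))
    V = rng.normal(0.0, sigma_w / np.sqrt(width), size=(n_nets, width))
    post = np.maximum(W @ x + b, 0.0)        # hidden-layer ReLU activations
    return np.einsum("nw,nw->n", V, post)    # output f(x) for each sampled network

x = np.ones(3)  # a single fixed input; correlators are evaluated at this point
for width in (4, 16, 64, 256):
    f = sample_outputs(50_000, width, x, seed=width)
    # Connected four-point function at equal arguments (outputs have zero mean):
    # <f^4>_c = <f^4> - 3 <f^2>^2, which is exactly zero for a Gaussian process.
    g2 = np.mean(f**2)
    g4c = np.mean(f**4) - 3.0 * g2**2
    print(f"width={width:4d}   |G4_connected| / G2^2 = {abs(g4c) / g2**2:.4f}")
```

Increasing the number of sampled networks reduces the Monte Carlo error; the normalized connected four-point function is expected to decrease roughly in proportion to 1/width.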
Related papers
- A Unified Theory of Quantum Neural Network Loss Landscapes [0.0]
Quantum neural networks (QNNs) are known not to behave as Gaussian processes when randomly initialized.
We show that QNNs and their first two derivatives generally form what we call "Wishart processes".
Our unified framework suggests a simple operational definition for the "trainability" of a given QNN model.
arXiv Detail & Related papers (2024-08-21T18:00:08Z)
- Novel Kernel Models and Exact Representor Theory for Neural Networks Beyond the Over-Parameterized Regime [52.00917519626559]
This paper presents two models of neural networks and their training, applicable to networks of arbitrary width, depth, and topology.
We also present an exact novel representor theory for layer-wise neural network training with unregularized gradient descent in terms of a local-extrinsic neural kernel (LeNK).
This representor theory gives insight into the role of higher-order statistics in neural network training and the effect of kernel evolution in neural-network kernel models.
arXiv Detail & Related papers (2024-05-24T06:30:36Z)
- Neural Network Field Theories: Non-Gaussianity, Actions, and Locality [0.0]
Both the path integral measure in field theory and ensembles of neural networks describe distributions over functions.
An expansion in $1/N$ corresponds to interactions in the field theory, but other expansions, such as a small breaking of the statistical independence of network parameters, can also lead to interacting theories.
arXiv Detail & Related papers (2023-07-06T18:00:01Z)
- Renormalization in the neural network-quantum field theory correspondence [0.0]
A statistical ensemble of neural networks can be described in terms of a quantum field theory.
A major outcome is that changing the standard deviation of the neural network weight distribution corresponds to a renormalization flow in the space of networks.
arXiv Detail & Related papers (2022-12-22T15:41:13Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Neural Operator: Learning Maps Between Function Spaces [75.93843876663128]
We propose a generalization of neural networks to learn operators, termed neural operators, that map between infinite dimensional function spaces.
We prove a universal approximation theorem for our proposed neural operator, showing that it can approximate any given nonlinear continuous operator.
An important application for neural operators is learning surrogate maps for the solution operators of partial differential equations.
arXiv Detail & Related papers (2021-08-19T03:56:49Z)
- Nonperturbative renormalization for the neural network-QFT correspondence [0.0]
We study the concepts of locality and power-counting in this context.
We provide an analysis in terms of the nonperturbative renormalization group using the Wetterich-Morris equation.
Our aim is to provide a useful formalism to investigate neural networks behavior beyond the large-width limit.
arXiv Detail & Related papers (2021-08-03T10:36:04Z)
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can make two well-separated classes of data linearly separable with high probability (a minimal illustrative sketch appears after this list).
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
- Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics [50.83356836818667]
We introduce a new theoretical framework to analyze deep learning optimization with connection to its generalization error.
Existing frameworks for neural network optimization analysis, such as mean field theory and neural tangent kernel theory, typically require taking the infinite-width limit of the network to show its global convergence.
arXiv Detail & Related papers (2020-07-11T18:19:50Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
- Banach Space Representer Theorems for Neural Networks and Ridge Splines [17.12783792226575]
We develop a variational framework to understand the properties of the functions learned by neural networks fit to data.
We derive a representer theorem showing that finite-width, single-hidden layer neural networks are solutions to inverse problems.
arXiv Detail & Related papers (2020-06-10T02:57:37Z)
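
As a concrete illustration of the random-feature separation result summarized in The Separation Capacity of Random Neural Networks above, here is a small sketch under stated assumptions: two concentric-circle classes that are not linearly separable in the plane, an untrained two-layer ReLU map with Gaussian weights and uniformly distributed biases, and an ordinary least-squares linear readout. The dataset, width, and bias range are illustrative choices, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two classes that are NOT linearly separable in the plane: concentric circles.
def make_circle(n, radius, noise=0.05):
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    pts = radius * np.stack([np.cos(theta), np.sin(theta)], axis=1)
    return pts + noise * rng.normal(size=pts.shape)

X = np.vstack([make_circle(200, 1.0), make_circle(200, 3.0)])
y = np.hstack([-np.ones(200), np.ones(200)])

# Random two-layer ReLU feature map: Gaussian weights, uniform biases (untrained).
width = 500
W = rng.normal(size=(2, width))
b = rng.uniform(-3.0, 3.0, size=width)
features = np.maximum(X @ W + b, 0.0)

# Linear readout fit by least squares on the random features.
Phi = np.hstack([features, np.ones((len(X), 1))])       # append a bias column
coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
acc_features = np.mean(np.sign(Phi @ coef) == y)
print(f"linear readout on random ReLU features, training accuracy: {acc_features:.3f}")

# For contrast: a linear classifier directly on the raw 2-D inputs fails.
Phi0 = np.hstack([X, np.ones((len(X), 1))])
coef0, *_ = np.linalg.lstsq(Phi0, y, rcond=None)
acc_raw = np.mean(np.sign(Phi0 @ coef0) == y)
print(f"linear classifier on raw inputs, training accuracy:        {acc_raw:.3f}")
```

After the random ReLU feature map, a plain linear readout separates the two rings, whereas a linear classifier on the raw inputs stays near chance level.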