Nonperturbative renormalization for the neural network-QFT
correspondence
- URL: http://arxiv.org/abs/2108.01403v1
- Date: Tue, 3 Aug 2021 10:36:04 GMT
- Title: Nonperturbative renormalization for the neural network-QFT
correspondence
- Authors: Harold Erbin, Vincent Lahoche and Dine Ousmane Samary
- Abstract summary: We study the concepts of locality and power-counting in this context.
We provide an analysis in terms of the nonperturbative renormalization group using the Wetterich-Morris equation.
Our aim is to provide a useful formalism to investigate neural network behavior beyond the large-width limit.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a recent work arXiv:2008.08601, Halverson, Maiti and Stoner proposed a
description of neural networks in terms of a Wilsonian effective field theory.
The infinite-width limit is mapped to a free field theory, while finite $N$
corrections are taken into account by interactions (non-Gaussian terms in the
action). In this paper, we study two related aspects of this correspondence.
First, we comment on the concepts of locality and power-counting in this
context. Indeed, these usual space-time notions may not hold for neural
networks (since inputs can be arbitrary); however, the renormalization group
provides natural notions of locality and scaling. Moreover, we comment on
several subtleties, for example, that data components may not have a
permutation symmetry: in that case, we argue that random tensor field theories
could provide a natural generalization. Second, we improve the perturbative
Wilsonian renormalization from arXiv:2008.08601 by providing an analysis in
terms of the nonperturbative renormalization group using the Wetterich-Morris
equation. An important difference with usual nonperturbative RG analysis is
that only the effective (IR) 2-point function is known, which requires setting
the problem with care. Our aim is to provide a useful formalism to investigate
neural network behavior beyond the large-width limit (i.e. far from the
Gaussian limit) in a nonperturbative fashion. A major result of our analysis is that
changing the standard deviation of the neural network weight distribution can
be interpreted as a renormalization flow in the space of networks. We focus on
translation-invariant kernels and provide preliminary numerical results.
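- Background sketch: as a reminder of the standard ingredients (textbook forms, not equations quoted from this paper), the Wetterich-Morris equation is the exact flow equation for the scale-dependent effective action $\Gamma_k$, namely $\partial_k \Gamma_k = \frac{1}{2}\mathrm{Tr}\big[\partial_k R_k\,(\Gamma_k^{(2)} + R_k)^{-1}\big]$, where $R_k$ is the infrared regulator and $\Gamma_k^{(2)}$ the second functional derivative of $\Gamma_k$. For the free-field (infinite-width) limit, a single-hidden-layer network $f(x) = b_0 + \sum_{i=1}^{N} W_i\,\sigma(w_i\cdot x + b_i)$ with $W_i \sim \mathcal{N}(0,\sigma_W^2/N)$ has Gaussian statistics with covariance kernel $K(x,x') = \sigma_b^2 + \sigma_W^2\,\mathbb{E}_{w,b}[\sigma(w\cdot x+b)\,\sigma(w\cdot x'+b)]$, so varying $\sigma_W$ rescales the kernel, while finite-$N$ effects appear as connected higher-point functions (non-Gaussian interaction terms). The notation ($\sigma_W$, $\sigma_b$, $N$) is an illustrative convention assumed here, not necessarily the paper's.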
Related papers
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Renormalization in the neural network-quantum field theory
correspondence [0.0]
A statistical ensemble of neural networks can be described in terms of a quantum field theory.
A major outcome is that changing the standard deviation of the neural network weight distribution corresponds to a renormalization flow in the space of networks.
arXiv Detail & Related papers (2022-12-22T15:41:13Z) - Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z) - A connection between probability, physics and neural networks [0.0]
We illustrate an approach that can be exploited for constructing neural networks that a priori obey physical laws.
We start with a simple single-layer neural network (NN) but do not yet fix the activation functions.
The activation functions constructed in this way guarantee that the NN a priori obeys the physics, up to the approximation error of finite network width.
arXiv Detail & Related papers (2022-09-26T14:40:09Z) - ResNorm: Tackling Long-tailed Degree Distribution Issue in Graph Neural
Networks via Normalization [80.90206641975375]
This paper focuses on improving the performance of GNNs via normalization.
By studying the long-tailed distribution of node degrees in the graph, we propose a novel normalization method for GNNs.
The scale operation of ResNorm reshapes the node-wise standard deviation (NStd) distribution so as to improve the accuracy of tail nodes.
arXiv Detail & Related papers (2022-06-16T13:49:09Z) - On the Effective Number of Linear Regions in Shallow Univariate ReLU
Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z) - On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural tangent kernel (NTK).
In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z) - A global convergence theory for deep ReLU implicit networks via
over-parameterization [26.19122384935622]
Implicit deep learning has received increasing attention recently.
This paper analyzes the gradient flow of Rectified Linear Unit (ReLU) activated implicit neural networks.
arXiv Detail & Related papers (2021-10-11T23:22:50Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Neural Networks and Quantum Field Theory [0.0]
We propose a theoretical understanding of neural networks in terms of Wilsonian effective field theory.
The correspondence relies on the fact that many neural networks are drawn from Gaussian processes.
arXiv Detail & Related papers (2020-08-19T18:00:06Z) - Characteristics of Monte Carlo Dropout in Wide Neural Networks [16.639005039546745]
Monte Carlo (MC) dropout is one of the state-of-the-art approaches for uncertainty estimation in neural networks (NNs).
We study the limiting distribution of wide untrained NNs under dropout more rigorously and prove that they, too, converge to Gaussian processes for fixed sets of weights and biases.
We investigate how (strongly) correlated pre-activations can induce non-Gaussian behavior in NNs with strongly correlated weights.
arXiv Detail & Related papers (2020-07-10T15:14:43Z)