Do ideas have shape? Idea registration as the continuous limit of
artificial neural networks
- URL: http://arxiv.org/abs/2008.03920v3
- Date: Thu, 27 Oct 2022 18:01:18 GMT
- Title: Do ideas have shape? Idea registration as the continuous limit of
artificial neural networks
- Authors: Houman Owhadi
- Abstract summary: We show that ResNets converge, in the infinite depth limit, to a generalization of image registration variational algorithms.
We present the first rigorous proof of convergence of ResNets with trained weights and biases towards a Hamiltonian-dynamics-driven flow.
- Score: 0.609170287691728
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a Gaussian process (GP) generalization of ResNets (including ResNets as a
particular case). We show that ResNets (and their GP generalization) converge,
in the infinite depth limit, to a generalization of image registration
variational algorithms. Whereas computational anatomy aligns images via warping
of the material space, this generalization aligns ideas (or abstract shapes as
in Plato's theory of forms) via the warping of the RKHS of functions mapping
the input space to the output space. While the Hamiltonian interpretation of
ResNets is not new, it was based on an Ansatz. We do not rely on this Ansatz
and present the first rigorous proof of convergence of ResNets with trained
weights and biases towards a Hamiltonian-dynamics-driven flow. Our constructive
proof reveals several remarkable properties of ResNets and their GP
generalization. ResNet regressors are kernel regressors with data-dependent
warping kernels. Minimizers of $L^2$-regularized ResNets satisfy a discrete
least action principle implying the near preservation of the norm of weights
and biases across layers. The trained weights of ResNets with $L^2$
regularization can be identified by solving an autonomous Hamiltonian system.
The trained ResNet parameters are unique up to the initial momentum whose
representation is generally sparse. The kernel regularization strategy provides
a provably robust alternative to Dropout for ANNs. We introduce a functional
generalization of GPs leading to error estimates for ResNets. We identify the
(EPDiff) mean-field limit of trained ResNet parameters. We show that the
composition of warping regression blocks with reduced equivariant multichannel
kernels (introduced here) recovers and generalizes CNNs to arbitrary spaces and
groups of transformations.
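To make the abstract's claim that ResNet regressors act as kernel regressors with data-dependent warping kernels more concrete, here is a minimal sketch (not the paper's algorithm): residual blocks with step size 1/depth warp the inputs, approximating a depth-continuous flow, and kernel ridge regression is then run on the warped points. The function names (`residual_step`, `warp`, `krr_fit_predict`) and all hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_step(x, W, b, h):
    # One residual block: x_{l+1} = x_l + h * tanh(x_l W + b).
    return x + h * np.tanh(x @ W + b)

def warp(x, params, depth):
    # Compose `depth` residual blocks with step size h = 1/depth, so the
    # composition approximates a continuous-time flow as depth grows.
    h = 1.0 / depth
    for W, b in params:
        x = residual_step(x, W, b, h)
    return x

def rbf_kernel(a, b, gamma=1.0):
    # Gaussian kernel between two point clouds.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def krr_fit_predict(x_train, y_train, x_test, params, depth, lam=1e-3):
    # Kernel ridge regression with the "warped" kernel K(phi(x), phi(x')).
    z_train, z_test = warp(x_train, params, depth), warp(x_test, params, depth)
    K = rbf_kernel(z_train, z_train)
    alpha = np.linalg.solve(K + lam * np.eye(len(K)), y_train)
    return rbf_kernel(z_test, z_train) @ alpha

# Toy usage: random residual blocks stand in for trained weights.
d, depth = 3, 32
params = [(0.1 * rng.standard_normal((d, d)), np.zeros(d)) for _ in range(depth)]
x_tr, x_te = rng.standard_normal((20, d)), rng.standard_normal((5, d))
y_tr = np.sin(x_tr).sum(axis=1)
print(krr_fit_predict(x_tr, y_tr, x_te, params, depth))
```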
Related papers
- Hamiltonian Mechanics of Feature Learning: Bottleneck Structure in Leaky ResNets [58.460298576330835]
We study Leaky ResNets, which interpolate between ResNets and fully-connected nets as a function of an effective-depth parameter $\tilde{L}$.
In the infinite depth limit, we study 'representation geodesics' $A_p$: continuous paths in representation space (similar to NeuralODEs).
We leverage this intuition to explain the emergence of a bottleneck structure, as observed in previous work.
arXiv Detail & Related papers (2024-05-27T18:15:05Z)
- Generalization of Scaled Deep ResNets in the Mean-Field Regime [55.77054255101667]
We investigate scaled ResNets in the limit of infinitely deep and wide neural networks.
Our results offer new insights into the generalization ability of deep ResNet beyond the lazy training regime.
arXiv Detail & Related papers (2024-03-14T21:48:00Z)
- Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z)
- On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
- Global convergence of ResNets: From finite to infinite width using linear parameterization [0.0]
We study Residual Networks (ResNets) in which the residual block has linear parametrization while still being nonlinear.
In this limit, we prove a local Polyak-Lojasiewicz inequality, retrieving the lazy regime.
Our analysis leads to a practical and quantified recipe.
arXiv Detail & Related papers (2021-12-10T13:38:08Z)
- Approximation properties of Residual Neural Networks for Kolmogorov PDEs [0.0]
We show that ResNets are able to approximate Kolmogorov partial differential equations with constant diffusion and possibly nonlinear gradient coefficients.
In contrast to FNNs, the Euler-Maruyama approximation structure of ResNets simplifies the construction of the approximating ResNets substantially.
arXiv Detail & Related papers (2021-10-30T09:28:49Z)
- Edge Rewiring Goes Neural: Boosting Network Resilience via Policy Gradient [62.660451283548724]
ResiNet is a reinforcement learning framework to discover resilient network topologies against various disasters and attacks.
We show that ResiNet achieves a near-optimal resilience gain on multiple graphs while balancing utility, outperforming existing approaches by a large margin.
arXiv Detail & Related papers (2021-10-18T06:14:28Z)
- The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization [18.613475245655806]
We study ReLU ResNets in the infinite-depth-and-width limit, where both depth and width tend to infinity as their ratio, $d/n$, remains constant.
Using Monte Carlo simulations, we demonstrate that even basic properties of standard ResNet architectures are poorly captured by the Gaussian limit.
arXiv Detail & Related papers (2021-06-07T23:47:37Z)
- Momentum Residual Neural Networks [22.32840998053339]
We propose to change the forward rule of a ResNet by adding a momentum term (a minimal sketch of such an update appears after this list).
MomentumNets can be used as a drop-in replacement for any existing ResNet block.
We show that MomentumNets have the same accuracy as ResNets, while having a much smaller memory footprint.
arXiv Detail & Related papers (2021-02-15T22:24:52Z)
- On Random Kernels of Residual Architectures [93.94469470368988]
We derive finite width and depth corrections for the Neural Tangent Kernel (NTK) of ResNets and DenseNets.
Our findings show that in ResNets, convergence to the NTK may occur when depth and width simultaneously tend to infinity.
In DenseNets, however, convergence of the NTK to its limit as the width tends to infinity is guaranteed.
arXiv Detail & Related papers (2020-01-28T16:47:53Z)
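As a companion to the Momentum Residual Neural Networks entry above, the sketch below shows one common way to write a momentum-style residual update and why it can reduce memory: the update is exactly invertible, so activations can be recomputed during the backward pass instead of stored. The specific parameterization (the `gamma` value and the tanh block) is an illustrative assumption, not the paper's exact architecture.

```python
import numpy as np

def momentum_step(x, v, f, gamma=0.9):
    # Momentum residual update: v <- gamma * v + (1 - gamma) * f(x); x <- x + v.
    v = gamma * v + (1.0 - gamma) * f(x)
    return x + v, v

def momentum_step_inverse(x_next, v_next, f, gamma=0.9):
    # The update is exactly invertible, which is what allows activations to be
    # recomputed rather than stored during backpropagation.
    x = x_next - v_next
    v = (v_next - (1.0 - gamma) * f(x)) / gamma
    return x, v

# Toy check that the inverse recovers the input state of a block.
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((4, 4))
f = lambda x: np.tanh(x @ W)
x0, v0 = rng.standard_normal(4), np.zeros(4)
x1, v1 = momentum_step(x0, v0, f)
x0_rec, v0_rec = momentum_step_inverse(x1, v1, f)
print(np.allclose(x0, x0_rec), np.allclose(v0, v0_rec))
```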