Related papers: Geometry Perspective Of Estimating Learning Capability Of Neural Networks

Geometry Perspective Of Estimating Learning Capability Of Neural Networks

URL: http://arxiv.org/abs/2011.04588v2
Date: Tue, 1 Dec 2020 07:32:38 GMT
Title: Geometry Perspective Of Estimating Learning Capability Of Neural Networks
Authors: Ankan Dutta and Arnab Rakshit
Abstract summary: The paper considers a broad class of neural networks with generalized architecture performing simple least square regression with gradient descent (SGD) The relationship between the generalization capability with the stability of the neural network has also been discussed. By correlating the principles of high-energy physics with the learning theory of neural networks, the paper establishes a variant of the Complexity-Action conjecture from an artificial neural network perspective.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The paper uses statistical and differential geometric motivation to acquire prior information about the learning capability of an artificial neural network on a given dataset. The paper considers a broad class of neural networks with generalized architecture performing simple least square regression with stochastic gradient descent (SGD). The system characteristics at two critical epochs in the learning trajectory are analyzed. During some epochs of the training phase, the system reaches equilibrium with the generalization capability attaining a maximum. The system can also be coherent with localized, non-equilibrium states, which is characterized by the stabilization of the Hessian matrix. The paper proves that neural networks with higher generalization capability will have a slower convergence rate. The relationship between the generalization capability with the stability of the neural network has also been discussed. By correlating the principles of high-energy physics with the learning theory of neural networks, the paper establishes a variant of the Complexity-Action conjecture from an artificial neural network perspective.

Related papers

Towards a Statistical Understanding of Neural Networks: Beyond the Neural Tangent Kernel Theories [13.949362600389088]
A primary advantage of neural networks lies in their feature learning characteristics. We propose a new paradigm for studying feature learning and the resulting benefits in generalizability.
arXiv Detail & Related papers (2024-12-25T03:03:58Z)
Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks. We show that the networks acquire strong, data-dependent features. Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
Nonlinear classification of neural manifolds with contextual information [6.292933471495322]
manifold capacity has emerged as a promising framework linking population geometry to the separability of neural manifold. We propose a theoretical framework that overcomes this limitation by leveraging contextual input information. Our framework's increased expressivity captures representation untanglement in deep networks at early stages of the layer hierarchy, previously inaccessible to analysis.
arXiv Detail & Related papers (2024-05-10T23:37:31Z)
Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters. Our approach enables a single model to encode neural computational graphs with diverse architectures. We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
Connecting NTK and NNGP: A Unified Theoretical Framework for Wide Neural Network Learning Dynamics [6.349503549199403]
We provide a comprehensive framework for the learning process of deep wide neural networks. By characterizing the diffusive phase, our work sheds light on representational drift in the brain.
arXiv Detail & Related papers (2023-09-08T18:00:01Z)
Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence. We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers. This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights. We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z)
Consistency of Neural Networks with Regularization [0.0]
This paper proposes the general framework of neural networks with regularization and prove its consistency. Two types of activation functions: hyperbolic function(Tanh) and rectified linear unit(ReLU) have been taken into consideration.
arXiv Detail & Related papers (2022-06-22T23:33:39Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
Learn Like The Pro: Norms from Theory to Size Neural Computation [3.848947060636351]
We investigate how dynamical systems with nonlinearities can inform the design of neural systems that seek to emulate them. We propose a Learnability metric and quantify its associated features to the near-equilibrium behavior of learning dynamics. It reveals exact sizing for a class of neural networks with multiplicative nodes that mimic continuous- or discrete-time dynamics.
arXiv Detail & Related papers (2021-06-21T20:58:27Z)
Explainable artificial intelligence for mechanics: physics-informing neural networks for constitutive models [0.0]
In mechanics, the new and active field of physics-informed neural networks attempts to mitigate this disadvantage by designing deep neural networks on the basis of mechanical knowledge. We propose a first step towards a physics-forming-in approach, which explains neural networks trained on mechanical data a posteriori. Therein, the principal component analysis decorrelates the distributed representations in cell states of RNNs and allows the comparison to known and fundamental functions.
arXiv Detail & Related papers (2021-04-20T18:38:52Z)
Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis. By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner. This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
Mastering high-dimensional dynamics with Hamiltonian neural networks [0.0]
A map building perspective elucidates the superiority of Hamiltonian neural networks over conventional neural networks. The results clarify the critical relation between data, dimension, and neural network learning performance.
arXiv Detail & Related papers (2020-07-28T21:14:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.