Mean-Field and Kinetic Descriptions of Neural Differential Equations
- URL: http://arxiv.org/abs/2001.04294v4
- Date: Mon, 8 Nov 2021 20:56:05 GMT
- Title: Mean-Field and Kinetic Descriptions of Neural Differential Equations
- Authors: M. Herty, T. Trimborn, G. Visconti
- Abstract summary: In this work we focus on a particular class of neural networks, i.e. the residual neural networks.
We analyze steady states and sensitivity with respect to the parameters of the network, namely the weights and the bias.
A modification of the microscopic dynamics, inspired by stochastic residual neural networks, leads to a Fokker-Planck formulation of the network.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, neural networks are widely used in many applications as artificial
intelligence models for learning tasks. Since typically neural networks process
a very large amount of data, it is convenient to formulate them within the
mean-field and kinetic theory. In this work we focus on a particular class of
neural networks, i.e. the residual neural networks, assuming that each layer is
characterized by the same number of neurons $N$, which is fixed by the
dimension of the data. This assumption allows us to interpret the residual neural
network as a time-discretized ordinary differential equation, in analogy with
neural differential equations. The mean-field description is then obtained in
the limit of infinitely many input data. This leads to a Vlasov-type partial
differential equation which describes the evolution of the distribution of the
input data. We analyze steady states and sensitivity with respect to the
parameters of the network, namely the weights and the bias. In the simple
setting of a linear activation function and one-dimensional input data, the
study of the moments provides insights on the choice of the parameters of the
network. Furthermore, a modification of the microscopic dynamics, inspired by
stochastic residual neural networks, leads to a Fokker-Planck formulation of
the network, in which the concept of network training is replaced by the task
of fitting distributions. The performed analysis is validated by artificial
numerical simulations. In particular, results on classification and regression
problems are presented.
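
As a minimal sketch of the ResNet-as-ODE viewpoint summarized above (the step size, activation, and all variable names are illustrative assumptions, not the authors' code), each residual layer x_{k+1} = x_k + h*sigma(W_k x_k + b_k) can be read as one explicit Euler step of the ODE dx/dt = sigma(W x + b); pushing a large ensemble of input samples through the layers then approximates the evolution of the data distribution described by the Vlasov-type equation.

```python
import numpy as np

def residual_step(x, W, b, h=0.1):
    """One residual layer = one explicit Euler step of dx/dt = tanh(W x + b)."""
    return x + h * np.tanh(x @ W.T + b)

def forward(x0, layers, h=0.1):
    """Propagate an ensemble of input samples through all residual layers."""
    x = x0
    for W, b in layers:
        x = residual_step(x, W, b, h)
    return x

# Illustrative setup: M input samples of fixed dimension N, L layers.
rng = np.random.default_rng(0)
N, M, L = 2, 10_000, 50
layers = [(0.1 * rng.standard_normal((N, N)), np.zeros(N)) for _ in range(L)]
x0 = rng.standard_normal((M, N))   # empirical distribution of the input data

xT = forward(x0, layers)
# As M grows, the empirical moments of xT approximate the moments of the
# mean-field (Vlasov-type) solution evaluated at the final "time" L * h.
print(xT.mean(axis=0))
print(np.cov(xT.T))
```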
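The stochastic modification mentioned at the end of the abstract can be sketched in the same spirit. The variant below uses an additive-noise term whose placement and amplitude `sigma` are assumptions for illustration, not taken from the paper: it is an Euler-Maruyama step whose ensemble density formally obeys a Fokker-Planck equation, and in the one-dimensional linear-activation case the mean and variance satisfy closed ODEs with an explicit steady state.

```python
import numpy as np

def stochastic_residual_step(x, w, b, h, sigma, rng):
    """Euler-Maruyama step of dX = (w*X + b) dt + sigma dB_t (linear activation, 1-D)."""
    return x + h * (w * x + b) + np.sqrt(h) * sigma * rng.standard_normal(x.shape)

rng = np.random.default_rng(1)
w, b, h, sigma = -0.5, 0.3, 0.05, 0.2
x = rng.standard_normal(100_000)          # 1-D input data ensemble

for _ in range(400):                      # integrate up to time T = 400 * h = 20
    x = stochastic_residual_step(x, w, b, h, sigma, rng)

# For this linear SDE the Fokker-Planck moments obey m' = w*m + b and
# v' = 2*w*v + sigma**2, giving the steady state m* = -b/w, v* = -sigma**2/(2*w)
# (finite only for w < 0); the ensemble statistics should approximately match them.
print(x.mean(), x.var())                  # simulated moments
print(-b / w, -sigma**2 / (2 * w))        # analytical steady state
```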
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Instance-wise Linearization of Neural Network for Model Interpretation [13.583425552511704]
The challenge lies in the non-linear behavior of the neural network.
For a neural network model, the non-linear behavior is often caused by non-linear activation units of a model.
We propose an instance-wise linearization approach that reformulates the forward computation process of a neural network prediction.
arXiv Detail & Related papers (2023-10-25T02:07:39Z) - Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z) - Embedding stochastic differential equations into neural networks via dual processes [0.0]
We propose a new approach to constructing a neural network for predicting expectations of stochastic differential equations.
The proposed method does not need data sets of inputs and outputs.
As a demonstration, we construct neural networks for the Ornstein-Uhlenbeck process and the noisy van der Pol system.
arXiv Detail & Related papers (2023-06-08T00:50:16Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Stochastic Neural Networks with Infinite Width are Deterministic [7.07065078444922]
We study stochastic neural networks, a main type of neural network in use.
We prove that as the width of an optimized neural network tends to infinity, its predictive variance on the training set decreases to zero.
arXiv Detail & Related papers (2022-01-30T04:52:31Z) - Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z) - Conditional physics informed neural networks [85.48030573849712]
We introduce conditional PINNs (physics informed neural networks) for estimating the solution of classes of eigenvalue problems.
We show that a single deep neural network can learn the solution of partial differential equations for an entire class of problems.
arXiv Detail & Related papers (2021-04-06T18:29:14Z) - Understanding and mitigating gradient pathologies in physics-informed neural networks [2.1485350418225244]
This work focuses on the effectiveness of physics-informed neural networks in predicting outcomes of physical systems and discovering hidden physics from noisy data.
We present a learning rate annealing algorithm that utilizes gradient statistics during model training to balance the interplay between different terms in composite loss functions.
We also propose a novel neural network architecture that is more resilient to such gradient pathologies.
arXiv Detail & Related papers (2020-01-13T21:23:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.