A deep learning theory for neural networks grounded in physics
- URL: http://arxiv.org/abs/2103.09985v1
- Date: Thu, 18 Mar 2021 02:12:48 GMT
- Title: A deep learning theory for neural networks grounded in physics
- Authors: Benjamin Scellier
- Abstract summary: We argue that building large, fast and efficient neural networks on neuromorphic architectures requires rethinking the algorithms to implement and train them.
Our framework applies to a very broad class of models, namely systems whose state or dynamics are described by variational equations.
- Score: 2.132096006921048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the last decade, deep learning has become a major component of artificial
intelligence, leading to a series of breakthroughs across a wide variety of
domains. The workhorse of deep learning is the optimization of loss functions
by stochastic gradient descent (SGD). Traditionally in deep learning, neural
networks are differentiable mathematical functions, and the loss gradients
required for SGD are computed with the backpropagation algorithm. However, the
computer architectures on which these neural networks are implemented and
trained suffer from speed and energy inefficiency issues, due to the separation
of memory and processing in these architectures. To solve these problems, the
field of neuromorphic computing aims at implementing neural networks on
hardware architectures that merge memory and processing, just like brains do.
In this thesis, we argue that building large, fast and efficient neural
networks on neuromorphic architectures requires rethinking the algorithms to
implement and train them. To this purpose, we present an alternative
mathematical framework, also compatible with SGD, which offers the possibility
to design neural networks in substrates that directly exploit the laws of
physics. Our framework applies to a very broad class of models, namely systems
whose state or dynamics are described by variational equations. The procedure
to compute the loss gradients in such systems -- which in many practical
situations requires solely locally available information for each trainable
parameter -- is called equilibrium propagation (EqProp). Since many systems in
physics and engineering can be described by variational principles, our
framework has the potential to be applied to a broad variety of physical
systems, whose applications extend to various fields of engineering, beyond
neuromorphic computing.
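To make the two-phase procedure concrete, below is a minimal sketch of one EqProp gradient step on a toy Hopfield-style energy model. The energy function, the gradient-descent `settle` routine standing in for the physical dynamics, and all hyperparameters are illustrative assumptions, not the models developed in the thesis.

```python
import jax
import jax.numpy as jnp

def energy(theta, s, x):
    # Toy Hopfield-style energy: the input x drives the state s
    # through trainable weights W and biases b (illustrative choice).
    W, b = theta
    return 0.5 * jnp.sum(s ** 2) - jnp.dot(s, jnp.tanh(W @ x + b))

def cost(s, y):
    # Loss attached to the equilibrium state: squared error to target y.
    return 0.5 * jnp.sum((s - y) ** 2)

def settle(theta, x, y, beta, s, steps=500, lr=0.05):
    # Relax the state to a minimum of the total energy E + beta*C by
    # gradient descent -- a stand-in for the physical dynamics.
    grad_s = jax.grad(lambda s: energy(theta, s, x) + beta * cost(s, y))
    for _ in range(steps):
        s = s - lr * grad_s(s)
    return s

def eqprop_gradient(theta, x, y, beta=0.1):
    # Phase 1 (free): settle with no nudging (beta = 0).
    s_free = settle(theta, x, y, 0.0, jnp.zeros_like(y))
    # Phase 2 (nudged): settle again with the output weakly pushed
    # toward the target, starting from the free equilibrium.
    s_nudged = settle(theta, x, y, beta, s_free)
    # EqProp estimate: (dE/dtheta at nudged - dE/dtheta at free) / beta.
    dE = jax.grad(energy, argnums=0)
    g_free, g_nudged = dE(theta, s_free, x), dE(theta, s_nudged, x)
    return jax.tree_util.tree_map(lambda a, b: (b - a) / beta,
                                  g_free, g_nudged)

# One SGD step on a toy problem.
W = 0.1 * jax.random.normal(jax.random.PRNGKey(0), (3, 4))
b = jnp.zeros(3)
x, y = jnp.ones(4), jnp.array([0.5, -0.2, 0.1])
gW, gb = eqprop_gradient((W, b), x, y)
W, b = W - 0.1 * gW, b - 0.1 * gb
```

The estimate compares dE/dtheta at the two equilibria; for energies whose parameter derivatives are local, as in Hopfield networks, each parameter's update uses only locally available quantities, which is the property the abstract emphasizes.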
Related papers
- Mechanistic Neural Networks for Scientific Machine Learning [58.99592521721158]
We present Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences.
It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations.
Central to our approach is a novel Relaxed Linear Programming solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs.
arXiv Detail & Related papers (2024-02-20T15:23:24Z)
- An Analysis of Physics-Informed Neural Networks [0.0]
We present a new approach to approximating the solutions of physical systems: physics-informed neural networks.
The concept of artificial neural networks is introduced, the objective function is defined, and optimisation strategies are discussed.
The partial differential equation is then included as a constraint in the loss function for the problem, giving the network access to knowledge of the dynamics of the physical system it is modelling.
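As an illustration of the constrained-loss idea, here is a minimal, hypothetical PINN sketch for the 1D Poisson problem u''(x) = -pi^2 sin(pi x) with zero boundary values; the architecture, test problem and penalty weights are assumptions for illustration, not the paper's setup.

```python
import jax
import jax.numpy as jnp

def mlp(params, x):
    # Small fully connected network u_theta(x) with tanh activations.
    h = jnp.atleast_1d(x)
    for W, b in params[:-1]:
        h = jnp.tanh(W @ h + b)
    W, b = params[-1]
    return (W @ h + b)[0]

def pinn_loss(params, xs):
    # Physics residual for u''(x) = -pi^2 sin(pi x) on (0, 1),
    # whose exact solution is u(x) = sin(pi x).
    u = lambda x: mlp(params, x)
    u_xx = jax.vmap(jax.grad(jax.grad(u)))(xs)
    f = -jnp.pi ** 2 * jnp.sin(jnp.pi * xs)
    residual = jnp.mean((u_xx - f) ** 2)
    # Boundary conditions u(0) = u(1) = 0 enter as penalty terms.
    return residual + u(0.0) ** 2 + u(1.0) ** 2

def init_params(key, sizes=(1, 16, 16, 1)):
    params = []
    for m, n in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        params.append((0.5 * jax.random.normal(sub, (n, m)), jnp.zeros(n)))
    return params

# Train on the physics residual alone -- no labelled data needed.
params = init_params(jax.random.PRNGKey(0))
xs = jnp.linspace(0.0, 1.0, 64)
loss_grad = jax.jit(jax.grad(pinn_loss))
for _ in range(2000):
    params = jax.tree_util.tree_map(lambda p, g: p - 1e-3 * g,
                                    params, loss_grad(params, xs))
```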
arXiv Detail & Related papers (2023-03-06T04:45:53Z)
- NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with Spatial-temporal Decomposition [67.46012350241969]
This paper proposes a general acceleration methodology called NeuralStagger.
It decomposes the original learning tasks into several coarser-resolution subtasks.
We demonstrate the successful application of NeuralStagger on 2D and 3D fluid dynamics simulations.
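For intuition, here is a sketch of one way such a spatial decomposition can work: interleave a fine grid into several coarse sub-grids that independent coarse solvers can advance in parallel. The functions below are hypothetical illustrations of the staggering idea, not the paper's implementation (which also staggers in time).

```python
import numpy as np

def stagger_decompose(field, s=2):
    # Interleave a fine 2D field into s*s coarse sub-fields; each one
    # becomes a cheaper learning task at 1/s the resolution per axis.
    return [field[i::s, j::s] for i in range(s) for j in range(s)]

def stagger_compose(subfields, s=2):
    # Inverse: re-interleave the coarse solutions onto the fine grid.
    h, w = subfields[0].shape
    field = np.empty((h * s, w * s), dtype=subfields[0].dtype)
    for k, sub in enumerate(subfields):
        i, j = divmod(k, s)
        field[i::s, j::s] = sub
    return field

fine = np.arange(64, dtype=float).reshape(8, 8)
parts = stagger_decompose(fine)           # four 4x4 coarse tasks
assert np.array_equal(stagger_compose(parts), fine)  # lossless
```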
arXiv Detail & Related papers (2023-02-20T19:36:52Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
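The temporal sparsity mentioned above comes from neurons that communicate only through occasional spikes. Below is a minimal leaky integrate-and-fire neuron, a standard spiking model included for background; it is not the regression framework the paper proposes.

```python
import numpy as np

def lif_neuron(current, v_th=1.0, tau=20.0, dt=1.0):
    # Leaky integrate-and-fire: the membrane potential v leaks toward
    # zero, integrates the input current, and emits a spike (then
    # resets) whenever it crosses the threshold v_th.
    v, spikes = 0.0, []
    for i_t in current:
        v = v + dt * (-v / tau + i_t)
        spikes.append(1 if v >= v_th else 0)
        if v >= v_th:
            v = 0.0
    return np.array(spikes)

rng = np.random.default_rng(0)
out = lif_neuron(rng.uniform(0.0, 0.2, size=100))
print(out.mean())  # most time steps carry no spike: temporal sparsity
```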
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Unsupervised Legendre-Galerkin Neural Network for Stiff Partial Differential Equations [9.659504024299896]
We propose an unsupervised machine learning algorithm based on the Legendre-Galerkin neural network to find an accurate approximation to the solution of different types of PDEs.
The proposed neural network is applied to the general 1D and 2D PDEs as well as singularly perturbed PDEs that possess boundary layer behavior.
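To show the flavour of the underlying idea, the sketch below represents the solution in a Legendre basis that satisfies the boundary conditions by construction, and minimizes an unsupervised PDE residual by least squares. This is a linear stand-in for illustration only; the paper's actual network and Galerkin formulation differ.

```python
import numpy as np
from numpy.polynomial import legendre as L

def basis_second_derivs(x, n_modes):
    # phi_k = P_k - P_{k+2} vanishes at x = -1 and x = 1, so homogeneous
    # boundary conditions are built into the basis itself.
    cols = []
    for k in range(n_modes):
        c = np.zeros(n_modes + 2)
        c[k], c[k + 2] = 1.0, -1.0
        cols.append(L.legval(x, L.legder(c, 2)))  # phi_k''(x)
    return np.stack(cols, axis=1)

# Unsupervised residual fit for u'' = f on [-1, 1], u(-1) = u(1) = 0.
x = np.linspace(-1.0, 1.0, 200)
f = -np.pi ** 2 * np.sin(np.pi * x)      # exact solution: sin(pi x)
A = basis_second_derivs(x, n_modes=12)
coef, *_ = np.linalg.lstsq(A, f, rcond=None)

# Reconstruct u from the fitted coefficients and check the error.
c = np.zeros(14)
for k, ck in enumerate(coef):
    c[k] += ck
    c[k + 2] -= ck
print(np.max(np.abs(L.legval(x, c) - np.sin(np.pi * x))))  # tiny
```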
arXiv Detail & Related papers (2022-07-21T00:47:47Z)
- Physics-informed ConvNet: Learning Physical Field from a Shallow Neural Network [0.180476943513092]
Modelling and forecasting multi-physical systems remain a challenge due to unavoidable data scarcity and noise.
A new framework named the physics-informed convolutional network (PICN) is proposed from a CNN perspective.
PICN may become an alternative neural network solver in physics-informed machine learning.
arXiv Detail & Related papers (2022-01-26T14:35:58Z)
- Physics informed neural networks for continuum micromechanics [68.8204255655161]
Recently, physics informed neural networks have successfully been applied to a broad variety of problems in applied mathematics and engineering.
Due to their global approximation, physics-informed neural networks have difficulty resolving localized effects and strongly nonlinear solutions by optimization.
It is shown that the domain decomposition approach is able to accurately resolve nonlinear stress, displacement and energy fields in heterogeneous microstructures obtained from real-world $\mu$CT scans.
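A hedged sketch of what a domain-decomposition loss can look like: one network per subdomain, each trained on its own PDE residual, glued by continuity penalties at the interface. The toy ODE u' = u and all details below are assumptions for illustration, not the paper's micromechanics setup.

```python
import jax
import jax.numpy as jnp

def net(params, x):
    # One small tanh network per subdomain.
    h = jnp.atleast_1d(x)
    for W, b in params[:-1]:
        h = jnp.tanh(W @ h + b)
    W, b = params[-1]
    return (W @ h + b)[0]

def dd_loss(both, xs_left, xs_right, x_if=0.5):
    # Each network fits the PDE residual only on its own subdomain;
    # penalty terms glue value and slope at the interface x_if.
    # Toy problem: u'(x) = u(x) on (0, 1) with u(0) = 1.
    pa, pb = both
    res = lambda p, x: (jax.grad(lambda z: net(p, z))(x) - net(p, x)) ** 2
    loss = jnp.mean(jax.vmap(lambda x: res(pa, x))(xs_left))
    loss += jnp.mean(jax.vmap(lambda x: res(pb, x))(xs_right))
    loss += (net(pa, 0.0) - 1.0) ** 2                    # boundary
    dua = jax.grad(lambda z: net(pa, z))(x_if)
    dub = jax.grad(lambda z: net(pb, z))(x_if)
    return (loss + (net(pa, x_if) - net(pb, x_if)) ** 2  # C0 continuity
                 + (dua - dub) ** 2)                     # C1 continuity

def init(key, sizes=(1, 12, 1)):
    ps = []
    for m, n in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        ps.append((0.3 * jax.random.normal(sub, (n, m)), jnp.zeros(n)))
    return ps

k1, k2 = jax.random.split(jax.random.PRNGKey(0))
both = (init(k1), init(k2))
xs_l, xs_r = jnp.linspace(0.0, 0.5, 32), jnp.linspace(0.5, 1.0, 32)
grads = jax.grad(dd_loss)(both, xs_l, xs_r)  # feed to any SGD variant
```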
arXiv Detail & Related papers (2021-10-14T14:05:19Z)
- Partial Differential Equations is All You Need for Generating Neural Architectures -- A Theory for Physical Artificial Intelligence Systems [40.20472268839781]
We generalize the reaction-diffusion equation in statistical physics, the Schrödinger equation in quantum mechanics, and the Helmholtz equation in paraxial optics.
We use the finite difference method to discretize the NPDE and find numerical solutions.
Basic building blocks of deep neural network architectures, including the multi-layer perceptron, convolutional neural networks and recurrent neural networks, are generated.
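A familiar instance of this discretization-to-architecture link (a standard observation, not necessarily the paper's exact construction): one explicit finite-difference step of the 1D heat equation is a convolution with a fixed kernel, so stacked steps resemble convolutional layers.

```python
import numpy as np

def heat_step(u, r=0.2):
    # One explicit finite-difference step of u_t = u_xx:
    #   u[i] <- u[i] + r * (u[i-1] - 2*u[i] + u[i+1])
    # i.e. convolution with the fixed kernel [r, 1 - 2r, r]
    # (stable for r <= 1/2; boundaries zero-padded here).
    return np.convolve(u, [r, 1.0 - 2.0 * r, r], mode="same")

# Stacking T steps is a T-layer convolutional network with tied,
# hand-fixed weights; making the kernel learnable recovers an
# ordinary convolutional layer.
u = np.zeros(64)
u[32] = 1.0
for _ in range(50):
    u = heat_step(u)
print(u.max())  # the initial spike diffuses, as the PDE dictates
```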
arXiv Detail & Related papers (2021-03-10T00:05:46Z)
- Learning Contact Dynamics using Physically Structured Neural Networks [81.73947303886753]
We use connections between deep neural networks and differential equations to design a family of deep network architectures for representing contact dynamics between objects.
We show that these networks can learn discontinuous contact events in a data-efficient manner from noisy observations.
Our results indicate that an idealised form of touch feedback is a key component of making this learning problem tractable.
arXiv Detail & Related papers (2021-02-22T17:33:51Z)
- Spiking Neural Networks Hardware Implementations and Challenges: a Survey [53.429871539789445]
Spiking Neural Networks are cognitive algorithms that mimic the operational principles of neurons and synapses.
We present the state of the art of hardware implementations of spiking neural networks.
We discuss the strategies employed to leverage the characteristics of these event-driven algorithms at the hardware level.
arXiv Detail & Related papers (2020-05-04T13:24:00Z)
- Understanding and mitigating gradient pathologies in physics-informed neural networks [2.1485350418225244]
This work focuses on the effectiveness of physics-informed neural networks in predicting outcomes of physical systems and discovering hidden physics from noisy data.
We present a learning rate annealing algorithm that utilizes gradient statistics during model training to balance the interplay between different terms in composite loss functions.
We also propose a novel neural network architecture that is more resilient to such gradient pathologies.
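A sketch of the general rebalancing idea: compare gradient statistics of the composite loss terms and anneal the weight of the weaker term with a moving average. The ratio-of-gradients statistic and the toy losses below are common-variant assumptions; the paper's exact algorithm may differ.

```python
import jax
import jax.numpy as jnp

def grad_magnitudes(loss_fn, params):
    # Flatten the gradient pytree into one vector of magnitudes.
    leaves = jax.tree_util.tree_leaves(jax.grad(loss_fn)(params))
    return jnp.concatenate([jnp.abs(g).ravel() for g in leaves])

def anneal_weight(params, residual_fn, boundary_fn, lam, alpha=0.9):
    # Ratio of gradient statistics: if the residual term's gradients
    # dwarf the boundary term's, raise the boundary weight; smooth
    # the estimate with a moving average.
    g_r = grad_magnitudes(residual_fn, params)
    g_b = grad_magnitudes(boundary_fn, params)
    lam_hat = jnp.max(g_r) / (jnp.mean(g_b) + 1e-8)
    return alpha * lam + (1.0 - alpha) * lam_hat

# Toy demo with quadratic stand-ins for the two loss terms; in a real
# PINN loop one would then minimize residual_fn + lam * boundary_fn.
params = {"w": jnp.ones(3)}
residual_fn = lambda p: jnp.sum((10.0 * p["w"]) ** 2)
boundary_fn = lambda p: jnp.sum((0.1 * p["w"]) ** 2)
print(anneal_weight(params, residual_fn, boundary_fn, lam=1.0))
```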
arXiv Detail & Related papers (2020-01-13T21:23:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.