A deep learning theory for neural networks grounded in physics
- URL: http://arxiv.org/abs/2103.09985v1
- Date: Thu, 18 Mar 2021 02:12:48 GMT
- Title: A deep learning theory for neural networks grounded in physics
- Authors: Benjamin Scellier
- Abstract summary: We argue that building large, fast and efficient neural networks on neuromorphic architectures requires rethinking the algorithms to implement and train them.
Our framework applies to a very broad class of models, namely systems whose state or dynamics are described by variational equations.
- Score: 2.132096006921048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the last decade, deep learning has become a major component of artificial
intelligence, leading to a series of breakthroughs across a wide variety of
domains. The workhorse of deep learning is the optimization of loss functions
by stochastic gradient descent (SGD). Traditionally in deep learning, neural
networks are differentiable mathematical functions, and the loss gradients
required for SGD are computed with the backpropagation algorithm. However, the
computer architectures on which these neural networks are implemented and
trained suffer from speed and energy inefficiency issues, due to the separation
of memory and processing in these architectures. To solve these problems, the
field of neuromorphic computing aims at implementing neural networks on
hardware architectures that merge memory and processing, just like brains do.
In this thesis, we argue that building large, fast and efficient neural
networks on neuromorphic architectures requires rethinking the algorithms to
implement and train them. To this purpose, we present an alternative
mathematical framework, also compatible with SGD, which offers the possibility
to design neural networks in substrates that directly exploit the laws of
physics. Our framework applies to a very broad class of models, namely systems
whose state or dynamics are described by variational equations. The procedure
to compute the loss gradients in such systems -- which in many practical
situations requires solely locally available information for each trainable
parameter -- is called equilibrium propagation (EqProp). Since many systems in
physics and engineering can be described by variational principles, our
framework has the potential to be applied to a broad variety of physical
systems, whose applications extend to various fields of engineering, beyond
neuromorphic computing.
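To make the two-phase procedure concrete, below is a minimal sketch of one EqProp gradient step on a toy Hopfield-style energy model. The energy function, the gradient-descent `settle` routine standing in for the physical dynamics, and all hyperparameters are illustrative assumptions, not the models developed in the thesis.

```python
import jax
import jax.numpy as jnp

def energy(theta, s, x):
    # Toy Hopfield-style energy: the input x drives the state s
    # through trainable weights W and biases b (illustrative choice).
    W, b = theta
    return 0.5 * jnp.sum(s ** 2) - jnp.dot(s, jnp.tanh(W @ x + b))

def cost(s, y):
    # Loss attached to the equilibrium state: squared error to target y.
    return 0.5 * jnp.sum((s - y) ** 2)

def settle(theta, x, y, beta, s, steps=500, lr=0.05):
    # Relax the state to a minimum of the total energy E + beta*C by
    # gradient descent -- a stand-in for the physical dynamics.
    grad_s = jax.grad(lambda s: energy(theta, s, x) + beta * cost(s, y))
    for _ in range(steps):
        s = s - lr * grad_s(s)
    return s

def eqprop_gradient(theta, x, y, beta=0.1):
    # Phase 1 (free): settle with no nudging (beta = 0).
    s_free = settle(theta, x, y, 0.0, jnp.zeros_like(y))
    # Phase 2 (nudged): settle again with the output weakly pushed
    # toward the target, starting from the free equilibrium.
    s_nudged = settle(theta, x, y, beta, s_free)
    # EqProp estimate: (dE/dtheta at nudged - dE/dtheta at free) / beta.
    dE = jax.grad(energy, argnums=0)
    g_free, g_nudged = dE(theta, s_free, x), dE(theta, s_nudged, x)
    return jax.tree_util.tree_map(lambda a, b: (b - a) / beta,
                                  g_free, g_nudged)

# One SGD step on a toy problem.
W = 0.1 * jax.random.normal(jax.random.PRNGKey(0), (3, 4))
b = jnp.zeros(3)
x, y = jnp.ones(4), jnp.array([0.5, -0.2, 0.1])
gW, gb = eqprop_gradient((W, b), x, y)
W, b = W - 0.1 * gW, b - 0.1 * gb
```

The estimate compares dE/dtheta at the two equilibria; for energies whose parameter derivatives are local, as in Hopfield networks, each parameter's update uses only locally available quantities, which is the property the abstract emphasizes.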
Related papers
- Mechanistic Neural Networks for Scientific Machine Learning [58.99592521721158]
We present Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences.
It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations.
Central to our approach is a novel Relaxed Linear Programming solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs.
arXiv Detail & Related papers (2024-02-20T15:23:24Z)
- An Analysis of Physics-Informed Neural Networks [0.0]
We present a new approach to approximating the solutions of physical systems: physics-informed neural networks.
The concept of artificial neural networks is introduced, the objective function is defined, and optimisation strategies are discussed.
The partial differential equation is then included as a constraint in the loss function for the problem, giving the network access to knowledge of the dynamics of the physical system it is modelling.
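As an illustration of the constrained-loss idea, here is a minimal, hypothetical PINN sketch for the 1D Poisson problem u''(x) = -pi^2 sin(pi x) with zero boundary values; the architecture, test problem and penalty weights are assumptions for illustration, not the paper's setup.

```python
import jax
import jax.numpy as jnp

def mlp(params, x):
    # Small fully connected network u_theta(x) with tanh activations.
    h = jnp.atleast_1d(x)
    for W, b in params[:-1]:
        h = jnp.tanh(W @ h + b)
    W, b = params[-1]
    return (W @ h + b)[0]

def pinn_loss(params, xs):
    # Physics residual for u''(x) = -pi^2 sin(pi x) on (0, 1),
    # whose exact solution is u(x) = sin(pi x).
    u = lambda x: mlp(params, x)
    u_xx = jax.vmap(jax.grad(jax.grad(u)))(xs)
    f = -jnp.pi ** 2 * jnp.sin(jnp.pi * xs)
    residual = jnp.mean((u_xx - f) ** 2)
    # Boundary conditions u(0) = u(1) = 0 enter as penalty terms.
    return residual + u(0.0) ** 2 + u(1.0) ** 2

def init_params(key, sizes=(1, 16, 16, 1)):
    params = []
    for m, n in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        params.append((0.5 * jax.random.normal(sub, (n, m)), jnp.zeros(n)))
    return params

# Train on the physics residual alone -- no labelled data needed.
params = init_params(jax.random.PRNGKey(0))
xs = jnp.linspace(0.0, 1.0, 64)
loss_grad = jax.jit(jax.grad(pinn_loss))
for _ in range(2000):
    params = jax.tree_util.tree_map(lambda p, g: p - 1e-3 * g,
                                    params, loss_grad(params, xs))
```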
arXiv Detail & Related papers (2023-03-06T04:45:53Z)
- NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with Spatial-temporal Decomposition [67.46012350241969]
This paper proposes a general acceleration methodology called NeuralStagger.
It decomposes the original learning tasks into several coarser-resolution subtasks.
We demonstrate the successful application of NeuralStagger on 2D and 3D fluid dynamics simulations.
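For intuition, here is a sketch of one way such a spatial decomposition can work: interleave a fine grid into several coarse sub-grids that independent coarse solvers can advance in parallel. The functions below are hypothetical illustrations of the staggering idea, not the paper's implementation (which also staggers in time).

```python
import numpy as np

def stagger_decompose(field, s=2):
    # Interleave a fine 2D field into s*s coarse sub-fields; each one
    # becomes a cheaper learning task at 1/s the resolution per axis.
    return [field[i::s, j::s] for i in range(s) for j in range(s)]

def stagger_compose(subfields, s=2):
    # Inverse: re-interleave the coarse solutions onto the fine grid.
    h, w = subfields[0].shape
    field = np.empty((h * s, w * s), dtype=subfields[0].dtype)
    for k, sub in enumerate(subfields):
        i, j = divmod(k, s)
        field[i::s, j::s] = sub
    return field

fine = np.arange(64, dtype=float).reshape(8, 8)
parts = stagger_decompose(fine)           # four 4x4 coarse tasks
assert np.array_equal(stagger_compose(parts), fine)  # lossless
```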
arXiv Detail & Related papers (2023-02-20T19:36:52Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
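The temporal sparsity mentioned above comes from neurons that communicate only through occasional spikes. Below is a minimal leaky integrate-and-fire neuron, a standard spiking model included for background; it is not the regression framework the paper proposes.

```python
import numpy as np

def lif_neuron(current, v_th=1.0, tau=20.0, dt=1.0):
    # Leaky integrate-and-fire: the membrane potential v leaks toward
    # zero, integrates the input current, and emits a spike (then
    # resets) whenever it crosses the threshold v_th.
    v, spikes = 0.0, []
    for i_t in current:
        v = v + dt * (-v / tau + i_t)
        spikes.append(1 if v >= v_th else 0)
        if v >= v_th:
            v = 0.0
    return np.array(spikes)

rng = np.random.default_rng(0)
out = lif_neuron(rng.uniform(0.0, 0.2, size=100))
print(out.mean())  # most time steps carry no spike: temporal sparsity
```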
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Unsupervised Legendre-Galerkin Neural Network for Stiff Partial Differential Equations [9.659504024299896]
We propose an unsupervised machine learning algorithm based on the Legendre-Galerkin neural network to find an accurate approximation to the solution of different types of PDEs.
The proposed neural network is applied to the general 1D and 2D PDEs as well as singularly perturbed PDEs that possess boundary layer behavior.
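To show the flavour of the underlying idea, the sketch below represents the solution in a Legendre basis that satisfies the boundary conditions by construction, and minimizes an unsupervised PDE residual by least squares. This is a linear stand-in for illustration only; the paper's actual network and Galerkin formulation differ.

```python
import numpy as np
from numpy.polynomial import legendre as L

def basis_second_derivs(x, n_modes):
    # phi_k = P_k - P_{k+2} vanishes at x = -1 and x = 1, so homogeneous
    # boundary conditions are built into the basis itself.
    cols = []
    for k in range(n_modes):
        c = np.zeros(n_modes + 2)
        c[k], c[k + 2] = 1.0, -1.0
        cols.append(L.legval(x, L.legder(c, 2)))  # phi_k''(x)
    return np.stack(cols, axis=1)

# Unsupervised residual fit for u'' = f on [-1, 1], u(-1) = u(1) = 0.
x = np.linspace(-1.0, 1.0, 200)
f = -np.pi ** 2 * np.sin(np.pi * x)      # exact solution: sin(pi x)
A = basis_second_derivs(x, n_modes=12)
coef, *_ = np.linalg.lstsq(A, f, rcond=None)

# Reconstruct u from the fitted coefficients and check the error.
c = np.zeros(14)
for k, ck in enumerate(coef):
    c[k] += ck
    c[k + 2] -= ck
print(np.max(np.abs(L.legval(x, c) - np.sin(np.pi * x))))  # tiny
```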
arXiv Detail & Related papers (2022-07-21T00:47:47Z)
- Physics-informed ConvNet: Learning Physical Field from a Shallow Neural Network [0.180476943513092]
Modelling and forecasting multi-physical systems remain a challenge due to unavoidable data scarcity and noise.
A new framework named the physics-informed convolutional network (PICN) is proposed from a CNN perspective.
PICN may become an alternative neural network solver in physics-informed machine learning.
arXiv Detail & Related papers (2022-01-26T14:35:58Z)
- Physics informed neural networks for continuum micromechanics [68.8204255655161]
Recently, physics informed neural networks have successfully been applied to a broad variety of problems in applied mathematics and engineering.
Due to their global approximation, physics-informed neural networks have difficulty resolving localized effects and strongly nonlinear solutions by optimization.
It is shown that the domain decomposition approach is able to accurately resolve nonlinear stress, displacement and energy fields in heterogeneous microstructures obtained from real-world $\mu$CT scans.
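A hedged sketch of what a domain-decomposition loss can look like: one network per subdomain, each trained on its own PDE residual, glued by continuity penalties at the interface. The toy ODE u' = u and all details below are assumptions for illustration, not the paper's micromechanics setup.

```python
import jax
import jax.numpy as jnp

def net(params, x):
    # One small tanh network per subdomain.
    h = jnp.atleast_1d(x)
    for W, b in params[:-1]:
        h = jnp.tanh(W @ h + b)
    W, b = params[-1]
    return (W @ h + b)[0]

def dd_loss(both, xs_left, xs_right, x_if=0.5):
    # Each network fits the PDE residual only on its own subdomain;
    # penalty terms glue value and slope at the interface x_if.
    # Toy problem: u'(x) = u(x) on (0, 1) with u(0) = 1.
    pa, pb = both
    res = lambda p, x: (jax.grad(lambda z: net(p, z))(x) - net(p, x)) ** 2
    loss = jnp.mean(jax.vmap(lambda x: res(pa, x))(xs_left))
    loss += jnp.mean(jax.vmap(lambda x: res(pb, x))(xs_right))
    loss += (net(pa, 0.0) - 1.0) ** 2                    # boundary
    dua = jax.grad(lambda z: net(pa, z))(x_if)
    dub = jax.grad(lambda z: net(pb, z))(x_if)
    return (loss + (net(pa, x_if) - net(pb, x_if)) ** 2  # C0 continuity
                 + (dua - dub) ** 2)                     # C1 continuity

def init(key, sizes=(1, 12, 1)):
    ps = []
    for m, n in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        ps.append((0.3 * jax.random.normal(sub, (n, m)), jnp.zeros(n)))
    return ps

k1, k2 = jax.random.split(jax.random.PRNGKey(0))
both = (init(k1), init(k2))
xs_l, xs_r = jnp.linspace(0.0, 0.5, 32), jnp.linspace(0.5, 1.0, 32)
grads = jax.grad(dd_loss)(both, xs_l, xs_r)  # feed to any SGD variant
```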
arXiv Detail & Related papers (2021-10-14T14:05:19Z)
- Partial Differential Equations is All You Need for Generating Neural Architectures -- A Theory for Physical Artificial Intelligence Systems [40.20472268839781]
We generalize the reaction-diffusion equation in statistical physics, the Schrödinger equation in quantum mechanics, and the Helmholtz equation in paraxial optics.
We use the finite difference method to discretize the NPDE and find numerical solutions.
Basic building blocks of deep neural network architectures, including the multi-layer perceptron, convolutional neural networks and recurrent neural networks, are generated.
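A familiar instance of this discretization-to-architecture link (a standard observation, not necessarily the paper's exact construction): one explicit finite-difference step of the 1D heat equation is a convolution with a fixed kernel, so stacked steps resemble convolutional layers.

```python
import numpy as np

def heat_step(u, r=0.2):
    # One explicit finite-difference step of u_t = u_xx:
    #   u[i] <- u[i] + r * (u[i-1] - 2*u[i] + u[i+1])
    # i.e. convolution with the fixed kernel [r, 1 - 2r, r]
    # (stable for r <= 1/2; boundaries zero-padded here).
    return np.convolve(u, [r, 1.0 - 2.0 * r, r], mode="same")

# Stacking T steps is a T-layer convolutional network with tied,
# hand-fixed weights; making the kernel learnable recovers an
# ordinary convolutional layer.
u = np.zeros(64)
u[32] = 1.0
for _ in range(50):
    u = heat_step(u)
print(u.max())  # the initial spike diffuses, as the PDE dictates
```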
arXiv Detail & Related papers (2021-03-10T00:05:46Z)
- Learning Contact Dynamics using Physically Structured Neural Networks [81.73947303886753]
We use connections between deep neural networks and differential equations to design a family of deep network architectures for representing contact dynamics between objects.
We show that these networks can learn discontinuous contact events in a data-efficient manner from noisy observations.
Our results indicate that an idealised form of touch feedback is a key component of making this learning problem tractable.
arXiv Detail & Related papers (2021-02-22T17:33:51Z)
- Spiking Neural Networks Hardware Implementations and Challenges: a Survey [53.429871539789445]
Spiking Neural Networks are cognitive algorithms that mimic the operational principles of neurons and synapses.
We present the state of the art of hardware implementations of spiking neural networks.
We discuss the strategies employed to leverage the characteristics of these event-driven algorithms at the hardware level.
arXiv Detail & Related papers (2020-05-04T13:24:00Z)
- Understanding and mitigating gradient pathologies in physics-informed neural networks [2.1485350418225244]
This work focuses on the effectiveness of physics-informed neural networks in predicting outcomes of physical systems and discovering hidden physics from noisy data.
We present a learning rate annealing algorithm that utilizes gradient statistics during model training to balance the interplay between different terms in composite loss functions.
We also propose a novel neural network architecture that is more resilient to such gradient pathologies.
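A sketch of the general rebalancing idea: compare gradient statistics of the composite loss terms and anneal the weight of the weaker term with a moving average. The ratio-of-gradients statistic and the toy losses below are common-variant assumptions; the paper's exact algorithm may differ.

```python
import jax
import jax.numpy as jnp

def grad_magnitudes(loss_fn, params):
    # Flatten the gradient pytree into one vector of magnitudes.
    leaves = jax.tree_util.tree_leaves(jax.grad(loss_fn)(params))
    return jnp.concatenate([jnp.abs(g).ravel() for g in leaves])

def anneal_weight(params, residual_fn, boundary_fn, lam, alpha=0.9):
    # Ratio of gradient statistics: if the residual term's gradients
    # dwarf the boundary term's, raise the boundary weight; smooth
    # the estimate with a moving average.
    g_r = grad_magnitudes(residual_fn, params)
    g_b = grad_magnitudes(boundary_fn, params)
    lam_hat = jnp.max(g_r) / (jnp.mean(g_b) + 1e-8)
    return alpha * lam + (1.0 - alpha) * lam_hat

# Toy demo with quadratic stand-ins for the two loss terms; in a real
# PINN loop one would then minimize residual_fn + lam * boundary_fn.
params = {"w": jnp.ones(3)}
residual_fn = lambda p: jnp.sum((10.0 * p["w"]) ** 2)
boundary_fn = lambda p: jnp.sum((0.1 * p["w"]) ** 2)
print(anneal_weight(params, residual_fn, boundary_fn, lam=1.0))
```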
arXiv Detail & Related papers (2020-01-13T21:23:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.