Connections between Numerical Algorithms for PDEs and Neural Networks
- URL: http://arxiv.org/abs/2107.14742v1
- Date: Fri, 30 Jul 2021 16:42:45 GMT
- Title: Connections between Numerical Algorithms for PDEs and Neural Networks
- Authors: Tobias Alt, Karl Schrader, Matthias Augustin, Pascal Peter, Joachim Weickert
- Abstract summary: We investigate numerous structural connections between numerical algorithms for partial differential equations (PDEs) and neural networks.
Our goal is to transfer the rich set of mathematical foundations from the world of PDEs to neural networks.
- Score: 8.660429288575369
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate numerous structural connections between numerical algorithms
for partial differential equations (PDEs) and neural architectures. Our goal is
to transfer the rich set of mathematical foundations from the world of PDEs to
neural networks. Besides structural insights, we provide concrete examples and
experimental evaluations of the resulting architectures. Using the example of
generalised nonlinear diffusion in 1D, we consider explicit schemes,
acceleration strategies thereof, implicit schemes, and multigrid approaches. We
connect these concepts to residual networks, recurrent neural networks, and
U-net architectures. Our findings inspire a symmetric residual network design
with provable stability guarantees and justify the effectiveness of skip
connections in neural networks from a numerical perspective. Moreover, we
present U-net architectures that implement multigrid techniques for learning
efficient solutions of partial differential equation models, and motivate
uncommon design choices such as trainable nonmonotone activation functions.
Experimental evaluations show that the proposed architectures save half of the
trainable parameters and can thus outperform standard ones with the same model
complexity. Our considerations serve as a basis for explaining the success of
popular neural architectures and provide a blueprint for developing new
mathematically well-founded neural building blocks.
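To make the abstract's central correspondence concrete, the following is a minimal sketch (not the authors' code) of how one explicit step of 1D nonlinear diffusion, u^{k+1} = u^k - tau * K^T Phi(K u^k), already has the shape of a residual block whose second weight matrix is the negated transpose of the first; this weight tying is the symmetric design with the stability guarantee mentioned above. The discretisation, the Perona-Malik flux, and all names (forward_difference_matrix, perona_malik_flux, tau, lam) are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch, assuming a standard forward-difference discretisation
# and a Perona-Malik flux; names and parameter values are illustrative,
# not taken from the paper.
import numpy as np

def forward_difference_matrix(n, h=1.0):
    """Discrete 1D derivative K; the zeroed last row models a
    reflecting (zero-flux) boundary."""
    K = (np.diag(-np.ones(n)) + np.diag(np.ones(n - 1), 1)) / h
    K[-1, :] = 0.0
    return K

def perona_malik_flux(s, lam=1.0):
    """Flux Phi(s) = g(s^2) * s with the Perona-Malik diffusivity
    g(s^2) = 1 / (1 + s^2 / lam^2); note that Phi is nonmonotone."""
    return s / (1.0 + (s / lam) ** 2)

def explicit_diffusion_step(u, K, tau=0.2):
    """One explicit step u <- u - tau * K^T Phi(K u).

    Residual-network reading: inner weights K, 'activation' Phi,
    outer weights -K^T, plus a skip connection carrying u.
    """
    return u - tau * K.T @ perona_malik_flux(K @ u)

# Usage: smooth a noisy 1D step signal while preserving its edge.
n = 64
u = np.concatenate([np.zeros(n // 2), np.ones(n // 2)])
u += 0.1 * np.random.default_rng(0).standard_normal(n)
K = forward_difference_matrix(n)
for _ in range(20):
    u = explicit_diffusion_step(u, K)
```

Tying the outer weights to the negated transpose of the inner ones is what halves the trainable parameter count relative to an unconstrained residual block, and the nonmonotone Perona-Malik flux corresponds to the trainable nonmonotone activation functions the abstract mentions.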
Related papers
- Enhancing Convolutional Neural Networks with Higher-Order Numerical Difference Methods [6.26650196870495]
Convolutional Neural Networks (CNNs) have been able to assist humans in solving many real-world problems.
This paper proposes a stacking scheme based on the linear multi-step method to enhance the performance of CNNs; a generic sketch of such multi-step stacking appears after this list.
arXiv Detail & Related papers (2024-09-08T05:13:58Z)
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards the practical use of machine-learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- Systematic construction of continuous-time neural networks for linear dynamical systems [0.0]
We discuss a systematic approach to constructing neural architectures for modeling a subclass of dynamical systems.
We use a variant of continuous-time neural networks in which the output of each neuron evolves continuously as the solution of a first-order or second-order ordinary differential equation (ODE).
Instead of deriving the network architecture and parameters from data, we propose a gradient-free algorithm to compute a sparse architecture and network parameters directly from the given linear time-invariant (LTI) system.
arXiv Detail & Related papers (2024-03-24T16:16:41Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- A predictive physics-aware hybrid reduced order model for reacting flows [65.73506571113623]
A new hybrid predictive Reduced Order Model (ROM) is proposed to solve reacting flow problems.
The number of degrees of freedom is reduced from thousands of temporal points to a few proper orthogonal decomposition (POD) modes with their corresponding temporal coefficients.
Two different deep learning architectures have been tested to predict the temporal coefficients.
arXiv Detail & Related papers (2023-01-24T08:39:20Z)
- Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design.
Thanks to their invertibility and the tractability of their Jacobians, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
arXiv Detail & Related papers (2022-04-15T10:45:26Z)
- Polynomial-Spline Neural Networks with Exact Integrals [0.0]
We develop a novel neural network architecture that combines a mixture-of-experts model with free knot B1-spline basis functions.
Our architecture exhibits both $h$- and $p$-refinement for regression problems at the convergence rates expected from approximation theory.
We demonstrate the success of our network on a range of regression and variational problems that illustrate the consistency and exact integrability of our network architecture.
arXiv Detail & Related papers (2021-10-26T22:12:37Z)
- Translating Numerical Concepts for PDEs into Neural Architectures [9.460896836770534]
We investigate what can be learned from translating numerical algorithms into neural networks.
On the numerical side, we consider explicit, accelerated explicit, and implicit schemes for a general higher-order nonlinear diffusion equation in 1D.
On the neural network side, we identify corresponding concepts in terms of residual networks (ResNets), recurrent networks, and U-nets.
arXiv Detail & Related papers (2021-03-29T08:31:51Z)
- SPINN: Sparse, Physics-based, and Interpretable Neural Networks for PDEs [0.0]
We introduce a class of Sparse, Physics-based, and Interpretable Neural Networks (SPINN) for solving ordinary and partial differential equations.
By reinterpreting a traditional meshless representation of solutions of PDEs as a special sparse deep neural network, we develop a class of sparse neural network architectures that are interpretable.
arXiv Detail & Related papers (2021-02-25T17:45:50Z)
- NAS-DIP: Learning Deep Image Prior with Neural Architecture Search [65.79109790446257]
Recent work has shown that the structure of deep convolutional neural networks can be used as a structured image prior.
We propose to search for neural architectures that capture stronger image priors.
We search for an improved network by leveraging an existing neural architecture search algorithm.
arXiv Detail & Related papers (2020-08-26T17:59:36Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
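As referenced in the first related-paper entry above, here is a hedged sketch of linear multi-step stacking. The summary names the technique but not its exact construction, so this shows one standard instance, residual blocks stacked as a two-step Adams-Bashforth scheme instead of plain forward-Euler (ResNet) updates; the branch residual_branch, the step size tau, and the toy dimensions are placeholders, not that paper's choices.

```python
# A generic two-step Adams-Bashforth stacking of residual blocks;
# an assumed illustration, not the cited paper's exact scheme.
import numpy as np

def residual_branch(x, W):
    """Toy residual branch f(x) = tanh(W x); stands in for a conv block."""
    return np.tanh(W @ x)

def adams_bashforth2_stack(x0, weights, tau=0.1):
    """Stack blocks as x_{k+1} = x_k + tau * (3/2 f_k - 1/2 f_{k-1}),
    bootstrapped with one forward-Euler step (a plain ResNet block)."""
    f_prev = residual_branch(x0, weights[0])
    x = x0 + tau * f_prev
    for W in weights[1:]:
        f_curr = residual_branch(x, W)
        x = x + tau * (1.5 * f_curr - 0.5 * f_prev)
        f_prev = f_curr
    return x

# Usage on random toy data.
rng = np.random.default_rng(0)
x0 = rng.standard_normal(8)
weights = [rng.standard_normal((8, 8)) / 8 for _ in range(4)]
y = adams_bashforth2_stack(x0, weights)
```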
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.