Understanding and mitigating gradient pathologies in physics-informed neural networks
- URL: http://arxiv.org/abs/2001.04536v1
- Date: Mon, 13 Jan 2020 21:23:49 GMT
- Title: Understanding and mitigating gradient pathologies in physics-informed neural networks
- Authors: Sifan Wang, Yujun Teng, Paris Perdikaris
- Abstract summary: This work focuses on the effectiveness of physics-informed neural networks in predicting outcomes of physical systems and discovering hidden physics from noisy data.
We present a learning rate annealing algorithm that utilizes gradient statistics during model training to balance the interplay between different terms in composite loss functions.
We also propose a novel neural network architecture that is more resilient to such gradient pathologies.
- Score: 2.1485350418225244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The widespread use of neural networks across different scientific domains
often involves constraining them to satisfy certain symmetries, conservation
laws, or other domain knowledge. Such constraints are often imposed as soft
penalties during model training and effectively act as domain-specific
regularizers of the empirical risk loss. Physics-informed neural networks are an
example of this philosophy, in which the outputs of deep neural networks are
constrained to approximately satisfy a given set of partial differential
equations. In this work we review recent advances in scientific machine
learning with a specific focus on the effectiveness of physics-informed neural
networks in predicting outcomes of physical systems and discovering hidden
physics from noisy data. We also identify and analyze a fundamental mode
of failure of such approaches that is related to numerical stiffness leading to
unbalanced back-propagated gradients during model training. To address this
limitation we present a learning rate annealing algorithm that utilizes
gradient statistics during model training to balance the interplay between
different terms in composite loss functions. We also propose a novel neural
network architecture that is more resilient to such gradient pathologies. Taken
together, our developments provide new insights into the training of
constrained neural networks and consistently improve the predictive accuracy of
physics-informed neural networks by a factor of 50-100 across a range of
problems in computational physics. All code and data accompanying this
manuscript are publicly available at
\url{https://github.com/PredictiveIntelligenceLab/GradientPathologiesPINNs}.
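To make the abstract's two ideas concrete, the following is a minimal, self-contained PyTorch sketch of a composite PINN loss (PDE residual plus boundary penalty) whose terms are balanced using back-propagated gradient statistics, in the spirit of the learning rate annealing scheme described above. The 1D Poisson problem, network size, moving-average rate alpha, and learning rate are illustrative assumptions rather than the authors' settings; their actual implementation lives in the repository linked above.

```python
# Hedged sketch: composite PINN loss balanced by gradient statistics.
# Problem, architecture, alpha, and learning rate are assumptions,
# not the paper's exact configuration.
import torch

torch.manual_seed(0)

# u_theta(x): small fully-connected network approximating the PDE solution.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 50), torch.nn.Tanh(),
    torch.nn.Linear(50, 50), torch.nn.Tanh(),
    torch.nn.Linear(50, 1),
)

def pde_residual(x):
    # Residual of u''(x) = -pi^2 sin(pi x) on [0, 1]; exact solution sin(pi x).
    x = x.requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    return d2u + torch.pi ** 2 * torch.sin(torch.pi * x)

def grad_abs(loss):
    # Absolute values of d(loss)/d(theta), flattened over all parameters.
    grads = torch.autograd.grad(loss, list(net.parameters()), retain_graph=True)
    return torch.cat([g.detach().abs().flatten() for g in grads])

lam = torch.tensor(1.0)   # adaptive weight on the boundary term
alpha = 0.9               # assumed moving-average rate for the weight update
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(5000):
    x_r = torch.rand(128, 1)            # interior collocation points
    x_b = torch.tensor([[0.0], [1.0]])  # boundary points, u(0) = u(1) = 0
    loss_r = pde_residual(x_r).pow(2).mean()  # PDE residual term
    loss_b = net(x_b).pow(2).mean()           # boundary condition term

    # Balance the terms from back-propagated gradient statistics:
    # lam tracks max|grad loss_r| / mean|grad loss_b|, smoothed over steps,
    # so the boundary term is not overwhelmed by stiff residual gradients.
    lam_hat = grad_abs(loss_r).max() / (grad_abs(loss_b).mean() + 1e-8)
    lam = (1 - alpha) * lam + alpha * lam_hat

    opt.zero_grad()
    (loss_r + lam * loss_b).backward()
    opt.step()
```

A larger alpha makes the weight follow the instantaneous gradient ratio more closely, while a smaller one yields a smoother schedule; either way the boundary term is rescaled so that its gradients are not drowned out by the stiff residual gradients the abstract identifies as the failure mode.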
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Can physical information aid the generalization ability of Neural Networks for hydraulic modeling? [0.0]
The application of neural networks to river hydraulics is still in its infancy, and the field suffers from data scarcity.
We propose to mitigate this problem by introducing physical information into the training phase.
We show that incorporating such soft physical information can improve predictive capabilities.
arXiv Detail & Related papers (2024-03-13T14:51:16Z) - Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z) - An Analysis of Physics-Informed Neural Networks [0.0]
We present a new approach to approximating the solutions of physical systems: physics-informed neural networks.
The concept of artificial neural networks is introduced, the objective function is defined, and optimisation strategies are discussed.
The partial differential equation is then included as a constraint in the loss function for the problem, giving the network access to knowledge of the dynamics of the physical system it is modelling.
arXiv Detail & Related papers (2023-03-06T04:45:53Z) - Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Physics-informed ConvNet: Learning Physical Field from a Shallow Neural Network [0.180476943513092]
Modelling and forecasting multi-physical systems remain a challenge due to unavoidable data scarcity and noise.
A new framework, the physics-informed convolutional network (PICN), is proposed from a CNN perspective.
PICN may become an alternative neural network solver in physics-informed machine learning.
arXiv Detail & Related papers (2022-01-26T14:35:58Z) - Physics informed neural networks for continuum micromechanics [68.8204255655161]
Recently, physics informed neural networks have successfully been applied to a broad variety of problems in applied mathematics and engineering.
Because they rely on global approximation, physics-informed neural networks have difficulty resolving localized effects and strongly nonlinear solutions via optimization.
It is shown that the domain decomposition approach is able to accurately resolve nonlinear stress, displacement and energy fields in heterogeneous microstructures obtained from real-world $\mu$CT scans.
arXiv Detail & Related papers (2021-10-14T14:05:19Z) - Conditional physics informed neural networks [85.48030573849712]
We introduce conditional PINNs (physics informed neural networks) for estimating the solution of classes of eigenvalue problems.
We show that a single deep neural network can learn the solution of partial differential equations for an entire class of problems.
arXiv Detail & Related papers (2021-04-06T18:29:14Z) - A deep learning theory for neural networks grounded in physics [2.132096006921048]
We argue that building large, fast and efficient neural networks on neuromorphic architectures requires rethinking the algorithms to implement and train them.
Our framework applies to a very broad class of models, namely systems whose state or dynamics are described by variational equations.
arXiv Detail & Related papers (2021-03-18T02:12:48Z) - Mean-Field and Kinetic Descriptions of Neural Differential Equations [0.0]
In this work we focus on a particular class of neural networks: residual neural networks.
We analyze steady states and sensitivity with respect to the parameters of the network, namely the weights and the biases.
A modification of the microscopic dynamics, inspired by residual neural networks, leads to a Fokker-Planck formulation of the network.
arXiv Detail & Related papers (2020-01-07T13:41:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.