Momentum Diminishes the Effect of Spectral Bias in Physics-Informed
Neural Networks
- URL: http://arxiv.org/abs/2206.14862v1
- Date: Wed, 29 Jun 2022 19:03:10 GMT
- Title: Momentum Diminishes the Effect of Spectral Bias in Physics-Informed
Neural Networks
- Authors: Ghazal Farhani, Alexander Kazachek, Boyu Wang
- Abstract summary: Physics-informed neural network (PINN) algorithms have shown promising results in solving a wide range of problems involving partial differential equations (PDEs).
They often fail to converge to desirable solutions when the target function contains high-frequency features, due to a phenomenon known as spectral bias.
In the present work, we exploit neural tangent kernels (NTKs) to investigate the training dynamics of PINNs evolving under stochastic gradient descent with momentum (SGDM).
- Score: 72.09574528342732
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Physics-informed neural network (PINN) algorithms have shown promising
results in solving a wide range of problems involving partial differential
equations (PDEs). However, they often fail to converge to desirable solutions
when the target function contains high-frequency features, due to a phenomenon
known as spectral bias. In the present work, we exploit neural tangent kernels
(NTKs) to investigate the training dynamics of PINNs evolving under stochastic
gradient descent with momentum (SGDM). This demonstrates that SGDM significantly
reduces the effect of spectral bias. We have also examined why training a model
via the Adam optimizer can accelerate the convergence while reducing the
spectral bias. Moreover, our numerical experiments have confirmed that
wide-enough networks using SGDM still converge to desirable solutions, even in
the presence of high-frequency features. In fact, we show that the width of a
network plays a critical role in convergence.
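As a quick illustration of the claim, the following minimal sketch (not the authors' code; the architecture, initialization, and step sizes are illustrative) fits a two-frequency target with a small tanh MLP under full-batch gradient descent, with and without heavy-ball momentum, and reads the per-frequency residual off an FFT. Spectral bias predicts the k=20 residual decays far more slowly than the k=2 residual; the paper's result predicts momentum should narrow that gap.

```python
# Minimal sketch (not the authors' code): fit a two-frequency target with a
# small tanh MLP by full-batch gradient descent, with and without heavy-ball
# momentum, and read the per-frequency residual off an FFT.
import numpy as np

rng = np.random.default_rng(0)
n, width = 128, 128
x = np.linspace(0, 1, n, endpoint=False)[:, None]
y = np.sin(2 * np.pi * 2 * x) + np.sin(2 * np.pi * 20 * x)  # low + high frequency

def init():
    return {"W1": rng.normal(0, 1, (1, width)), "b1": np.zeros(width),
            "W2": rng.normal(0, 1 / np.sqrt(width), (width, 1)), "b2": np.zeros(1)}

def forward(p):
    h = np.tanh(x @ p["W1"] + p["b1"])
    return h @ p["W2"] + p["b2"], h

def grads(p):
    pred, h = forward(p)
    r = (pred - y) / n                             # d(0.5*MSE)/d(pred)
    dh = (r @ p["W2"].T) * (1 - h ** 2)            # backprop through tanh
    return {"W1": x.T @ dh, "b1": dh.sum(0), "W2": h.T @ r, "b2": r.sum(0)}

def residual_spectrum(p):
    spec = np.abs(np.fft.rfft((forward(p)[0] - y).ravel())) / n
    return spec[2], spec[20]                       # residual amplitude at k=2, k=20

def train(beta, lr=1e-2, steps=5000):
    p = init()
    v = {k: np.zeros_like(w) for k, w in p.items()}
    for _ in range(steps):
        g = grads(p)
        for k in p:
            v[k] = beta * v[k] + g[k]              # heavy-ball momentum buffer
            p[k] -= lr * v[k]
    return residual_spectrum(p)

for beta in (0.0, 0.9):
    lo, hi = train(beta)
    print(f"beta={beta}: residual @k=2 = {lo:.4f}, @k=20 = {hi:.4f}")
```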
Related papers
- Understanding the dynamics of the frequency bias in neural networks [0.0]
Recent works have shown that traditional Neural Network (NN) architectures display a marked frequency bias in the learning process.
We develop a partial differential equation (PDE) that unravels the frequency dynamics of the error for a 2-layer NN.
We empirically show that the same principle extends to multi-layer NNs.
arXiv Detail & Related papers (2024-05-23T18:09:16Z)
- Understanding and Mitigating Extrapolation Failures in Physics-Informed Neural Networks [1.1510009152620668]
We study the extrapolation behavior of PINNs on a representative set of PDEs of different types.
We find that failure to extrapolate is not caused by high frequencies in the solution function, but rather by shifts in the support of the Fourier spectrum over time.
arXiv Detail & Related papers (2023-06-15T20:08:42Z)
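The failure mode identified in the entry above is easy to visualize. A hypothetical diagnostic sketch (the toy solution and its frequencies are invented for illustration): track the dominant spatial frequency of a solution over time and check whether the spectral support at test times was ever seen during training.

```python
# Hypothetical diagnostic sketch: detect a shift in the support of the spatial
# Fourier spectrum over time, the failure mode the abstract identifies.
import numpy as np

x = np.linspace(0, 1, 256, endpoint=False)

def u(x, t):
    # Toy solution whose dominant spatial frequency drifts upward with t.
    return np.sin(2 * np.pi * (4 + 12 * t) * x)

def dominant_freq(signal):
    spec = np.abs(np.fft.rfft(signal))
    return int(np.argmax(spec))                    # index = cycles per unit length

for t in (0.0, 0.25, 0.5, 1.0):
    print(f"t={t:.2f}: dominant spatial frequency k = {dominant_freq(u(x, t))}")
# A model trained on t in [0, 0.5] sees support only up to k ~ 10; extrapolating
# to t = 1.0 requires k = 16, outside the spectrum seen during training.
```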
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
However, PINNs can be trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
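For context, the implicit update behind ISGD solves theta_new = theta - lr * grad(theta_new) for theta_new, rather than evaluating the gradient at the current iterate. A minimal sketch on a quadratic loss, where the implicit equation is linear and can be solved exactly (an assumed textbook form, not the paper's PINN implementation):

```python
# Assumed textbook form of the implicit update, not the paper's PINN code:
# theta_new solves  theta_new = theta - lr * grad(theta_new).  For the
# quadratic loss 0.5*||A theta - b||^2 this equation is linear, so one
# implicit step is (I + lr * A^T A)^{-1} (theta + lr * A^T b).
import numpy as np

rng = np.random.default_rng(1)
A, b = rng.normal(size=(50, 10)), rng.normal(size=50)
H = A.T @ A

def grad(theta):
    return A.T @ (A @ theta - b)

def explicit_step(theta, lr):
    return theta - lr * grad(theta)

def implicit_step(theta, lr):
    return np.linalg.solve(np.eye(10) + lr * H, theta + lr * A.T @ b)

lr = 2.5 / np.linalg.eigvalsh(H).max()             # above the explicit limit 2/lambda_max
te = ti = np.zeros(10)
for _ in range(100):
    te, ti = explicit_step(te, lr), implicit_step(ti, lr)
print("explicit ||grad||:", np.linalg.norm(grad(te)))   # diverges at this step size
print("implicit ||grad||:", np.linalg.norm(grad(ti)))   # remains stable and converges
```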
- Investigations on convergence behaviour of Physics Informed Neural Networks across spectral ranges and derivative orders [0.0]
An important inference from Neural Tangent Kernel (NTK) theory is the existence of spectral bias (SB).
SB refers to the low-frequency components of the target function of a fully connected Artificial Neural Network (ANN) being learnt significantly faster than the higher frequencies during training.
This is established for Mean Square Error (MSE) loss functions with very low learning rate parameters.
It is firmly established that under normalized conditions, PINNs do exhibit strong spectral bias, and this increases with the order of the differential equation.
arXiv Detail & Related papers (2023-01-07T06:31:28Z)
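The NTK quantities behind this kind of analysis are straightforward to compute for toy models. A hedged sketch (one hidden layer, 1D inputs; not the paper's setup): build the Jacobian J of the network outputs with respect to all parameters and inspect the eigenvalues of the empirical NTK Theta = J J^T, whose rapid decay is the NTK-theory signature of spectral bias.

```python
# Hedged sketch: empirical NTK of a one-hidden-layer tanh network on 1D inputs.
# Theta = J J^T, where J is the Jacobian of outputs w.r.t. all parameters.
# Rapid eigenvalue decay means the components of the target aligned with the
# top (low-frequency) eigenvectors are learned fastest under gradient descent.
import numpy as np

rng = np.random.default_rng(0)
m = 512                                            # hidden width
x = np.linspace(-1, 1, 100)                        # 1D input grid
w, c, a = rng.normal(size=m), rng.normal(size=m), rng.normal(size=m)

pre = np.outer(x, w) + c                           # (n, m) pre-activations
h, dh = np.tanh(pre), 1 - np.tanh(pre) ** 2

# Jacobian blocks for f(x) = a . tanh(w x + c) / sqrt(m)
J = np.concatenate([h, (a * dh) * x[:, None], a * dh], axis=1) / np.sqrt(m)

eig = np.linalg.eigvalsh(J @ J.T)[::-1]            # NTK eigenvalues, descending
print("top 5 eigenvalues:", np.round(eig[:5], 3))
print("ratio lambda_10 / lambda_1:", eig[9] / eig[0])
print("ratio lambda_50 / lambda_1:", eig[49] / eig[0])
```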
- Neural Operator with Regularity Structure for Modeling Dynamics Driven by SPDEs [70.51212431290611]
Stochastic partial differential equations (SPDEs) are significant tools for modeling dynamics in many areas including atmospheric sciences and physics.
We propose the Neural Operator with Regularity Structure (NORS), which incorporates feature vectors derived from regularity structures for modeling dynamics driven by SPDEs.
We conduct experiments on various SPDEs including the dynamic $\Phi^4_1$ model and the 2d Navier-Stokes equation.
arXiv Detail & Related papers (2022-04-13T08:53:41Z)
- The Spectral Bias of Polynomial Neural Networks [63.27903166253743]
Polynomial neural networks (PNNs) have been shown to be particularly effective at image generation and face recognition, where high-frequency information is critical.
Previous studies have revealed that neural networks demonstrate a spectral bias towards low-frequency functions, which yields faster learning of low-frequency components during training.
Inspired by such studies, we conduct a spectral analysis of the Neural Tangent Kernel (NTK) of PNNs.
We find that the $\Pi$-Net family, i.e., a recently proposed parametrization of PNNs, speeds up the learning of the higher frequencies.
arXiv Detail & Related papers (2022-02-27T23:12:43Z)
- On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks [0.0]
We show that physics-informed neural networks (PINNs) struggle in cases where the target functions to be approximated exhibit high-frequency or multi-scale features.
We construct novel architectures that employ multi-scale random Fourier features and justify how such coordinate embedding layers can lead to robust and accurate PINN models.
arXiv Detail & Related papers (2020-12-18T04:19:30Z)
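The coordinate-embedding idea can be sketched in a few lines (the scales, widths, and ridge-regression head below are illustrative assumptions, not the paper's architecture): map inputs to random Fourier features [sin(2*pi*B*x), cos(2*pi*B*x)] with the frequency matrix B drawn at several scales, so that both slow and fast components of a multi-scale target become linearly representable.

```python
# Hedged sketch of a multi-scale random Fourier feature embedding (names and
# scales are illustrative, not the paper's exact architecture).
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 400)[:, None]
y = np.sin(2 * np.pi * x) + 0.5 * np.sin(2 * np.pi * 50 * x)  # multi-scale target

def fourier_features(x, scales=(1.0, 10.0, 50.0), per_scale=64):
    # Stack embeddings [sin(2 pi x B), cos(2 pi x B)] with the frequency
    # matrix B drawn at several standard deviations ("scales").
    blocks = []
    for s in scales:
        B = rng.normal(0, s, (x.shape[1], per_scale))
        blocks += [np.sin(2 * np.pi * x @ B), np.cos(2 * np.pi * x @ B)]
    return np.concatenate(blocks, axis=1)

def ridge_rmse(phi):
    # Ridge regression on the embedding stands in for the trained network head.
    w = np.linalg.solve(phi.T @ phi + 1e-6 * np.eye(phi.shape[1]), phi.T @ y)
    return float(np.sqrt(np.mean((phi @ w - y) ** 2)))

print("RMSE, multi-scale features:", ridge_rmse(fourier_features(x)))
print("RMSE, low scale only      :", ridge_rmse(fourier_features(x, scales=(1.0,))))
```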
- Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv Detail & Related papers (2020-02-20T15:43:02Z)
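The kernel-to-rich transition can be reproduced in a toy diagonal model that is standard in this literature (not necessarily the paper's exact setup): parametrize w = u*u - v*v and run gradient descent on an underdetermined sparse regression problem. A large initialization scale alpha keeps training in the kernel (RKHS-like, dense) regime, while a small alpha yields a rich, sparse solution.

```python
# Toy diagonal model, standard in this literature (not necessarily the paper's
# exact setup): w = u*u - v*v trained by gradient descent on an underdetermined
# sparse regression problem. The initialization scale alpha controls the regime.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100
X = rng.normal(size=(n, d)) / np.sqrt(n)
w_star = np.zeros(d); w_star[:3] = (2.0, -1.5, 1.0)  # 3-sparse ground truth
y = X @ w_star

def train(alpha, lr=5e-3, steps=100_000):
    u = np.full(d, alpha)
    v = np.full(d, alpha)
    for _ in range(steps):
        r = X.T @ (X @ (u * u - v * v) - y)        # gradient w.r.t. w
        u, v = u - lr * 2 * u * r, v + lr * 2 * v * r  # chain rule through w = u^2 - v^2
    return u * u - v * v

for alpha in (1.0, 0.01):
    w = train(alpha)
    print(f"alpha={alpha}: ||w||_1 = {np.abs(w).sum():.2f}, "
          f"fraction of mass on true support = {np.abs(w[:3]).sum() / np.abs(w).sum():.2%}")
```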
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.