Implicit Stochastic Gradient Descent for Training Physics-informed
Neural Networks
- URL: http://arxiv.org/abs/2303.01767v1
- Date: Fri, 3 Mar 2023 08:17:47 GMT
- Title: Implicit Stochastic Gradient Descent for Training Physics-informed
Neural Networks
- Authors: Ye Li, Song-Can Chen, Sheng-Jun Huang
- Abstract summary: Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems.
PINNs are trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ implicit gradient descent (ISGD) method to train PINNs for improving the stability of training process.
- Score: 51.92362217307946
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Physics-informed neural networks (PINNs) have effectively been demonstrated
in solving forward and inverse differential equation problems, but they are
still trapped in training failures when the target functions to be approximated
exhibit high-frequency or multi-scale features. In this paper, we propose to
employ implicit stochastic gradient descent (ISGD) method to train PINNs for
improving the stability of training process. We heuristically analyze how ISGD
overcome stiffness in the gradient flow dynamics of PINNs, especially for
problems with multi-scale solutions. We theoretically prove that for two-layer
fully connected neural networks with large hidden nodes, randomly initialized
ISGD converges to a globally optimal solution for the quadratic loss function.
Empirical results demonstrate that ISGD works well in practice and compares
favorably to other gradient-based optimization methods such as SGD and Adam,
while can also effectively address the numerical stiffness in training dynamics
via gradient descent.
Related papers
- Dual Cone Gradient Descent for Training Physics-Informed Neural Networks [0.0]
Physics-informed dual neural networks (PINNs) have emerged as a prominent approach for solving partial differential equations.
We propose a novel framework, Dual Cone Gradient Descent (DCGD), which adjusts the direction of the updated gradient to ensure it falls within a cone region.
arXiv Detail & Related papers (2024-09-27T03:27:46Z) - Convergence of Implicit Gradient Descent for Training Two-Layer Physics-Informed Neural Networks [3.680127959836384]
implicit gradient descent (IGD) outperforms the common gradient descent (GD) in handling certain multi-scale problems.
We show that IGD converges a globally optimal solution at a linear convergence rate.
arXiv Detail & Related papers (2024-07-03T06:10:41Z) - Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $mathcalO( ln(T) / T 1 - frac1alpha ).
arXiv Detail & Related papers (2024-03-11T09:10:37Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - An Adaptive and Stability-Promoting Layerwise Training Approach for Sparse Deep Neural Network Architecture [0.0]
This work presents a two-stage adaptive framework for developing deep neural network (DNN) architectures that generalize well for a given training data set.
In the first stage, a layerwise training approach is adopted where a new layer is added each time and trained independently by freezing parameters in the previous layers.
We introduce a epsilon-delta stability-promoting concept as a desirable property for a learning algorithm and show that employing manifold regularization yields a epsilon-delta stability-promoting algorithm.
arXiv Detail & Related papers (2022-11-13T09:51:16Z) - Momentum Diminishes the Effect of Spectral Bias in Physics-Informed
Neural Networks [72.09574528342732]
Physics-informed neural network (PINN) algorithms have shown promising results in solving a wide range of problems involving partial differential equations (PDEs)
They often fail to converge to desirable solutions when the target function contains high-frequency features, due to a phenomenon known as spectral bias.
In the present work, we exploit neural tangent kernels (NTKs) to investigate the training dynamics of PINNs evolving under gradient descent with momentum (SGDM)
arXiv Detail & Related papers (2022-06-29T19:03:10Z) - Cogradient Descent for Dependable Learning [64.02052988844301]
We propose a dependable learning based on Cogradient Descent (CoGD) algorithm to address the bilinear optimization problem.
CoGD is introduced to solve bilinear problems when one variable is with sparsity constraint.
It can also be used to decompose the association of features and weights, which further generalizes our method to better train convolutional neural networks (CNNs)
arXiv Detail & Related papers (2021-06-20T04:28:20Z) - Efficient training of physics-informed neural networks via importance
sampling [2.9005223064604078]
Physics-In Neural Networks (PINNs) are a class of deep neural networks that are trained to compute systems governed by partial differential equations (PDEs)
We show that an importance sampling approach will improve the convergence behavior of PINNs training.
arXiv Detail & Related papers (2021-04-26T02:45:10Z) - Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z) - Semi-Implicit Back Propagation [1.5533842336139065]
We propose a semi-implicit back propagation method for neural network training.
The difference on the neurons are propagated in a backward fashion and the parameters are updated with proximal mapping.
Experiments on both MNIST and CIFAR-10 demonstrate that the proposed algorithm leads to better performance in terms of both loss decreasing and training/validation accuracy.
arXiv Detail & Related papers (2020-02-10T03:26:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.