The Challenges of the Nonlinear Regime for Physics-Informed Neural Networks
- URL: http://arxiv.org/abs/2402.03864v3
- Date: Thu, 31 Oct 2024 10:59:05 GMT
- Title: The Challenges of the Nonlinear Regime for Physics-Informed Neural Networks
- Authors: Andrea Bonfanti, Giuseppe Bruno, Cristina Cipriani
- Abstract summary: We show how the NTK perspective falls short in the nonlinear scenario.
We explore the convergence guarantees of such methods in both linear and nonlinear cases.
- Abstract: The Neural Tangent Kernel (NTK) viewpoint is widely employed to analyze the training dynamics of overparameterized Physics-Informed Neural Networks (PINNs). However, unlike the case of linear Partial Differential Equations (PDEs), we show how the NTK perspective falls short in the nonlinear scenario. Specifically, we establish that the NTK yields a random matrix at initialization that is not constant during training, contrary to conventional belief. Another significant difference from the linear regime is that, even in the idealistic infinite-width limit, the Hessian does not vanish and hence it cannot be disregarded during training. This motivates the adoption of second-order optimization methods. We explore the convergence guarantees of such methods in both linear and nonlinear cases, addressing challenges such as spectral bias and slow convergence. Every theoretical result is supported by numerical examples with both linear and nonlinear PDEs, and we highlight the benefits of second-order methods in benchmark test cases.
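To make the abstract's central claim concrete, here is a minimal, hypothetical sketch (JAX; a toy nonlinear PDE u'' + u^2 = f with illustrative network sizes and step counts, not the authors' code) that measures how much the empirical NTK of a PINN's residual drifts during training. For a linear PDE in the infinite-width limit this drift would be expected to vanish; for a nonlinear PDE the paper argues it does not.

```python
# Probe NTK drift of a PINN residual for a nonlinear PDE (illustrative sketch).
import jax
import jax.numpy as jnp

def init_params(key, widths=(1, 32, 32, 1)):
    keys = jax.random.split(key, len(widths) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, m, n in zip(keys, widths[:-1], widths[1:])]

def u(params, x):
    # Scalar network u_theta(x); x is a scalar.
    h = jnp.atleast_1d(x)
    for W, b in params[:-1]:
        h = jnp.tanh(h @ W + b)
    W, b = params[-1]
    return (h @ W + b)[0]

def residual(params, x):
    # Nonlinear PDE residual r = u'' + u^2 - f, with f chosen so that
    # u*(x) = sin(x) is an exact solution (illustrative choice).
    u_xx = jax.grad(jax.grad(u, argnums=1), argnums=1)(params, x)
    f = -jnp.sin(x) + jnp.sin(x) ** 2
    return u_xx + u(params, x) ** 2 - f

def flatten(p):
    return jnp.concatenate([w.ravel() for layer in p for w in layer])

def ntk(params, xs):
    # Empirical NTK of the residual: K_ij = <dr(x_i)/dtheta, dr(x_j)/dtheta>.
    J = jax.vmap(lambda x: flatten(jax.grad(residual)(params, x)))(xs)
    return J @ J.T

xs = jnp.linspace(0.1, jnp.pi - 0.1, 16)
params = init_params(jax.random.PRNGKey(0))
K0 = ntk(params, xs)

loss = lambda p: jnp.mean(jax.vmap(lambda x: residual(p, x) ** 2)(xs))
grad_loss = jax.jit(jax.grad(loss))
for _ in range(200):  # plain gradient descent on the residual loss
    g = grad_loss(params)
    params = [(W - 1e-3 * gW, b - 1e-3 * gb)
              for (W, b), (gW, gb) in zip(params, g)]

drift = jnp.linalg.norm(ntk(params, xs) - K0) / jnp.linalg.norm(K0)
print("relative NTK drift after 200 steps:", drift)
```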
Related papers
- Controlled Learning of Pointwise Nonlinearities in Neural-Network-Like Architectures [14.93489065234423]
We present a general variational framework for the training of freeform nonlinearities in layered computational architectures.
The slope constraints allow us to impose properties such as 1-Lipschitz stability, firm non-expansiveness, and monotonicity/invertibility.
We show how to solve the function-optimization problem numerically by representing the nonlinearities in a suitable (nonuniform) B-spline basis.
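As a rough illustration of the idea (a minimal sketch under simplifying assumptions, not the paper's variational framework; the uniform grid and the clipping-based projection are my choices): a pointwise nonlinearity expanded in a uniform degree-1 B-spline basis is exactly piecewise-linear interpolation, and projecting its slopes into [0, 1] enforces monotonicity and 1-Lipschitz stability.

```python
# Learnable pointwise nonlinearity in a degree-1 B-spline basis (toy sketch).
import jax.numpy as jnp

GRID = jnp.linspace(-3.0, 3.0, 61)   # uniform knots (illustrative range)
STEP = GRID[1] - GRID[0]

def project_slopes(values):
    # Clip the finite-difference slopes into [0, 1] (monotone, 1-Lipschitz),
    # then re-accumulate them into knot values.
    slopes = jnp.clip(jnp.diff(values) / STEP, 0.0, 1.0)
    return values[0] + jnp.concatenate([jnp.zeros(1), jnp.cumsum(slopes) * STEP])

def spline_activation(values, x):
    # Piecewise-linear interpolation = expansion in linear B-splines.
    return jnp.interp(x, GRID, project_slopes(values))

relu_like = jnp.maximum(GRID, 0.0)   # initialize near a ReLU
y = spline_activation(relu_like, jnp.array([-1.5, 0.2, 2.7]))
```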
arXiv Detail & Related papers (2024-08-23T14:39:27Z)
- Quantum Algorithms for Nonlinear Dynamics: Revisiting Carleman Linearization with No Dissipative Conditions [0.7373617024876725]
We explore the embedding of nonlinear dynamical systems into linear ordinary differential equations (ODEs) via the Carleman linearization method.
Our analysis extends prior results by exploring error bounds beyond the traditional dissipative condition.
We prove how a resonance condition leads to linear convergence with respect to the truncation level $N$ in Carleman linearization.
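As a concrete toy instance of the construction (an illustrative sketch, not the paper's algorithm or error analysis; values of a, b, N, x0 are arbitrary): the scalar nonlinear ODE x' = a·x + b·x² lifts to linear dynamics on the monomials y_k = x^k via y_k' = k·a·y_k + k·b·y_{k+1}, truncated by setting y_{N+1} = 0.

```python
# Toy Carleman linearization of x' = a*x + b*x^2, truncated at level N.
import jax.numpy as jnp
from jax.scipy.linalg import expm

a, b, x0, t, N = -1.0, 0.1, 0.5, 2.0, 6

# Truncated Carleman matrix C with y' = C y on the monomials y_k = x^k.
C = jnp.zeros((N, N))
for k in range(1, N + 1):
    C = C.at[k - 1, k - 1].set(k * a)
    if k < N:
        C = C.at[k - 1, k].set(k * b)

y0 = jnp.array([x0 ** k for k in range(1, N + 1)])
x_carleman = (expm(t * C) @ y0)[0]   # first component approximates x(t)

# Closed-form solution via the Bernoulli substitution z = 1/x.
z = (1.0 / x0 + b / a) * jnp.exp(-a * t) - b / a
print(x_carleman, 1.0 / z)
```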
arXiv Detail & Related papers (2024-05-21T12:09:34Z)
- Learning Discretized Neural Networks under Ricci Flow [51.36292559262042]
We study Discretized Neural Networks (DNNs) composed of low-precision weights and activations.
During training, DNNs suffer from either infinite or zero gradients because the discretization function is non-differentiable.
arXiv Detail & Related papers (2023-02-07T10:51:53Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- Neural Ordinary Differential Equations for Nonlinear System Identification [0.9864260997723973]
We present a study comparing NODE's performance against neural state-space models and classical linear system identification methods.
Experiments show that NODEs can consistently improve the prediction accuracy by an order of magnitude compared to benchmark methods.
arXiv Detail & Related papers (2022-02-28T22:25:53Z)
- On Convergence of Training Loss Without Reaching Stationary Points [62.41370821014218]
We show that Neural Network weight variables do not converge to stationary points where the gradient of the loss function vanishes.
We propose a new perspective based on the ergodic theory of dynamical systems.
arXiv Detail & Related papers (2021-10-12T18:12:23Z)
- Learning Nonlinear Waves in Plasmon-induced Transparency [0.0]
We consider a recurrent neural network (RNN) approach to predict the complex propagation of nonlinear solitons in plasmon-induced transparency metamaterial systems.
We demonstrate close agreement between the simulations and the predictions of long short-term memory (LSTM) artificial neural networks.
arXiv Detail & Related papers (2021-07-31T21:21:44Z)
- Inverse Problem of Nonlinear Schrödinger Equation as Learning of Convolutional Neural Network [5.676923179244324]
It is shown that one can obtain a relatively accurate estimate of the considered parameters using the proposed method.
The approach provides a natural framework for solving inverse problems of partial differential equations with deep learning.
arXiv Detail & Related papers (2021-07-19T02:54:37Z)
- LQF: Linear Quadratic Fine-Tuning [114.3840147070712]
We present the first method for linearizing a pre-trained model that achieves comparable performance to non-linear fine-tuning.
LQF consists of simple modifications to the architecture, loss function and optimization typically used for classification.
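For orientation, the following is a hedged sketch of the linearization step behind this style of fine-tuning (not the authors' implementation, which also modifies the architecture and loss; `f` and `w0` are placeholders): replace a pretrained model f(w, x) by its first-order Taylor expansion around the pretrained weights w0, so the fine-tuned model becomes linear in the weights.

```python
# First-order Taylor linearization of a pretrained model, via a JVP.
import jax
import jax.numpy as jnp

def linearize(f, w0):
    def f_lin(w, x):
        # f_lin(w, x) = f(w0, x) + J_w f(w0, x) (w - w0)
        dw = jax.tree_util.tree_map(jnp.subtract, w, w0)
        y0, dy = jax.jvp(lambda v: f(v, x), (w0,), (dw,))
        return y0 + dy
    return f_lin

# Toy usage: a one-layer "pretrained" model.
f = lambda w, x: jnp.tanh(x @ w)
w0 = jnp.ones((3, 2))
g = linearize(f, w0)
print(g(w0 + 0.01, jnp.ones((1, 3))))  # close to f(w0 + 0.01, ...) for small steps
```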
arXiv Detail & Related papers (2020-12-21T06:40:20Z)
- Learning Fast Approximations of Sparse Nonlinear Regression [50.00693981886832]
In this work, we bridge the gap by introducing the Nonlinear Learned Iterative Shrinkage-Thresholding Algorithm (NLISTA).
Experiments on synthetic data corroborate our theoretical results and show our method outperforms state-of-the-art methods.
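For context, here is the classical ISTA iteration for linear sparse regression, the backbone that learned variants such as (N)LISTA unroll and parameterize (an illustrative sketch, not the paper's NLISTA; the nonlinear-regression extension is what the paper contributes):

```python
# Classical ISTA: minimize 0.5*||Ax - y||^2 + lam*||x||_1 by proximal gradient steps.
import jax.numpy as jnp

def soft_threshold(z, tau):
    # Proximal operator of the l1 norm.
    return jnp.sign(z) * jnp.maximum(jnp.abs(z) - tau, 0.0)

def ista(A, y, lam, n_iter=200):
    L = jnp.linalg.norm(A, 2) ** 2   # gradient Lipschitz constant, step size 1/L
    x = jnp.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - A.T @ (A @ x - y) / L, lam / L)
    return x
```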
arXiv Detail & Related papers (2020-10-26T11:31:08Z)
- A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks [87.23360438947114]
We show that noisy gradient descent with weight decay can still exhibit a "kernel-like" behavior.
This implies that the training loss converges linearly up to a certain accuracy.
We also establish a novel generalization error bound for two-layer neural networks trained by noisy gradient descent with weight decay.
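A hedged toy experiment in the spirit of this setting (not the paper's exact regime or proof; widths, rates, and noise scale are illustrative): train a wide two-layer network with noisy gradient descent plus weight decay and watch the training loss decay roughly geometrically before flooring at a noise-determined accuracy.

```python
# Noisy gradient descent with weight decay on a two-layer network (toy sketch).
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (64, 4))
y_true = jnp.sin(X @ jnp.ones(4))

def f(params, X):
    W1, a = params
    return jnp.tanh(X @ W1) @ a   # two-layer network, width 512

loss = lambda p: jnp.mean((f(p, X) - y_true) ** 2)
grad_loss = jax.jit(jax.grad(loss))

k1, k2 = jax.random.split(jax.random.PRNGKey(1))
params = [jax.random.normal(k1, (4, 512)) / 2.0,
          jax.random.normal(k2, (512,)) / jnp.sqrt(512.0)]
eta, lam, sigma = 0.1, 1e-4, 1e-4   # step size, weight decay, noise scale
for t in range(501):
    key, sub = jax.random.split(key)
    grads = grad_loss(params)
    params = [p - eta * (g + lam * p)
              + sigma * jnp.sqrt(eta) * jax.random.normal(k, p.shape)
              for p, g, k in zip(params, grads, jax.random.split(sub, 2))]
    if t % 100 == 0:
        print(t, loss(params))
```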
arXiv Detail & Related papers (2020-02-10T18:56:15Z)