Understanding the Difficulty of Solving Cauchy Problems with PINNs
- URL: http://arxiv.org/abs/2405.02561v2
- Date: Tue, 18 Jun 2024 07:33:29 GMT
- Title: Understanding the Difficulty of Solving Cauchy Problems with PINNs
- Authors: Tao Wang, Bo Zhao, Sicun Gao, Rose Yu
- Abstract summary: PINNs often fail to achieve the same level of accuracy as classical methods in solving differential equations.
We show that minimizing the sum of $L^2$ residual and initial condition error is not sufficient to guarantee the true solution.
We demonstrate that when the global minimum does not exist, machine precision becomes the predominant source of achievable error in practice.
- Score: 31.98081858215356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Physics-Informed Neural Networks (PINNs) have gained popularity in scientific computing in recent years. However, they often fail to achieve the same level of accuracy as classical methods in solving differential equations. In this paper, we identify two sources of this issue in the case of Cauchy problems: the use of $L^2$ residuals as objective functions and the approximation gap of neural networks. We show that minimizing the sum of $L^2$ residual and initial condition error is not sufficient to guarantee the true solution, as this loss function does not capture the underlying dynamics. Additionally, neural networks are not capable of capturing singularities in the solutions due to the non-compactness of their image sets. This, in turn, influences the existence of global minima and the regularity of the network. We demonstrate that when the global minimum does not exist, machine precision becomes the predominant source of achievable error in practice. We also present numerical experiments in support of our theoretical claims.
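To make the objective discussed in the abstract concrete, below is a minimal PyTorch sketch (not from the paper) of the composite loss for a toy Cauchy problem, the inviscid Burgers equation $u_t + u u_x = 0$ with $u(x, 0) = \sin(\pi x)$. The network architecture, sample counts, and the choice of PDE and initial condition are all illustrative assumptions.
```python
import torch

# Small fully connected network u(x, t); architecture is an arbitrary choice.
u_net = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

def pinn_loss(n_res=256, n_ic=64):
    # Interior collocation points (x, t) in [0, 1]^2 for the residual term.
    xt = torch.rand(n_res, 2, requires_grad=True)
    u = u_net(xt)
    # Per-sample gradients of u w.r.t. (x, t); create_graph=True keeps the
    # residual differentiable w.r.t. the network parameters.
    du = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
    u_x, u_t = du[:, 0:1], du[:, 1:2]
    residual = u_t + u * u_x  # PDE residual of u_t + u u_x = 0
    # Initial-condition points (x, 0) and the target u(x, 0) = sin(pi x).
    x0 = torch.rand(n_ic, 1)
    u0 = u_net(torch.cat([x0, torch.zeros_like(x0)], dim=1))
    g = torch.sin(torch.pi * x0)
    # The composite objective from the abstract: squared L^2 residual plus
    # initial condition error, both approximated by Monte Carlo means.
    return residual.pow(2).mean() + (u0 - g).pow(2).mean()

opt = torch.optim.Adam(u_net.parameters(), lr=1e-3)
for step in range(2000):
    opt.zero_grad()
    pinn_loss().backward()
    opt.step()
```
The paper's point is that driving this quantity to zero does not by itself force the network toward the true solution, since the $L^2$ residual does not capture the underlying dynamics.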
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - Dimension-independent learning rates for high-dimensional classification problems [53.622581586464634]
We show that every $RBV^2$ function can be approximated by a neural network with bounded weights.
We then prove the existence of a neural network with bounded weights approximating a classification function.
arXiv Detail & Related papers (2024-09-26T16:02:13Z) - The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks [53.95175206863992]
We study the type of solutions to which gradient descent converges when used to train a single hidden-layer multivariate ReLU network with the quadratic loss.
We prove that although shallow ReLU networks are universal approximators, stable shallow networks are not.
arXiv Detail & Related papers (2023-06-30T09:17:39Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Physics-Informed Neural Networks for Quantum Eigenvalue Problems [1.2891210250935146]
Eigenvalue problems are critical to several fields of science and engineering.
We use unsupervised neural networks for discovering eigenfunctions and eigenvalues for differential eigenvalue problems.
The network optimization is data-free and depends solely on the predictions of the neural network.
arXiv Detail & Related papers (2022-02-24T18:29:39Z) - On the Omnipresence of Spurious Local Minima in Certain Neural Network Training Problems [0.0]
We study the loss landscape of training problems for deep artificial neural networks with a one-dimensional real output.
It is shown that such problems possess a continuum of spurious (i.e., not globally optimal) local minima for all target functions that are not affine.
arXiv Detail & Related papers (2022-02-23T14:41:54Z) - Near-Minimax Optimal Estimation With Shallow ReLU Neural Networks [19.216784367141972]
We study the problem of estimating an unknown function from noisy data using shallow (single-hidden layer) ReLU neural networks.
We quantify the performance of these neural network estimators when the data-generating function belongs to the space of functions of second-order bounded variation in the Radon domain.
arXiv Detail & Related papers (2021-09-18T05:56:06Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Achieving Small Test Error in Mildly Overparameterized Neural Networks [30.664282759625948]
We show an algorithm which finds one of these points in polynomial time.
In addition, we prove that for a fully connected neural net, with an additional assumption on the data distribution, there is a polynomial-time algorithm.
arXiv Detail & Related papers (2021-04-24T06:47:20Z) - Conditional physics informed neural networks [85.48030573849712]
We introduce conditional PINNs (physics informed neural networks) for estimating the solution of classes of eigenvalue problems.
We show that a single deep neural network can learn the solution of partial differential equations for an entire class of problems.
arXiv Detail & Related papers (2021-04-06T18:29:14Z) - Error Estimation and Correction from within Neural Network Differential Equation Solvers [3.04585143845864]
We describe a strategy for constructing error estimates and corrections for Neural Network Differential Equation solvers.
Our methods do not require advance knowledge of the true solutions and obtain explicit relationships between loss functions and the error associated with solution estimates.
arXiv Detail & Related papers (2020-07-09T11:01:44Z)