Error Analysis of Three-Layer Neural Network Trained with PGD for Deep Ritz Method
- URL: http://arxiv.org/abs/2405.11451v1
- Date: Sun, 19 May 2024 05:07:09 GMT
- Title: Error Analysis of Three-Layer Neural Network Trained with PGD for Deep Ritz Method
- Authors: Yuling Jiao, Yanming Lai, Yang Wang,
- Abstract summary: We employ a three-layer tanh neural network within the framework of the deep Ritz method to solve second-order elliptic equations.
We perform projected gradient descent to train the three-layer network and we establish its global convergence.
We present an error bound in terms of the sample size $n$, and our work provides guidance on how to set the network depth, width, step size, and number of iterations for the projected gradient descent algorithm.
- Score: 7.723218675113336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning is a rapidly advancing field with diverse applications across various domains. One prominent area of research is the utilization of deep learning techniques for solving partial differential equations (PDEs). In this work, we specifically focus on employing a three-layer tanh neural network within the framework of the deep Ritz method (DRM) to solve second-order elliptic equations with three different types of boundary conditions. We perform projected gradient descent (PGD) to train the three-layer network and establish its global convergence. To the best of our knowledge, we are the first to provide a comprehensive error analysis of using overparameterized networks to solve PDE problems, as our analysis simultaneously includes estimates for the approximation error, generalization error, and optimization error. We present an error bound in terms of the sample size $n$, and our work provides guidance on how to set the network depth, width, step size, and number of iterations for the projected gradient descent algorithm. Importantly, our assumptions in this work are classical, and we do not require any additional assumptions on the solution of the equation. This ensures the broad applicability and generality of our results.
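To make the setup concrete, the following is a minimal sketch (assumptions, not the authors' code) of the ingredients the abstract names: a three-layer tanh network, a Monte Carlo estimate of the Deep Ritz energy for the model problem $-\Delta u + u = f$ on $(0,1)^d$ with homogeneous Neumann boundary conditions, and projected gradient descent that pulls all parameters back into an $\ell_2$ ball after each step. The model problem, the projection set, and every hyperparameter (width, sample size, radius, step size, iteration count) are illustrative placeholders rather than the paper's prescriptions, which tie these quantities to the sample size $n$.

```python
# Minimal sketch (assumptions, not the authors' implementation):
# Deep Ritz energy for -Laplace(u) + u = f on (0,1)^d with homogeneous
# Neumann BC, a three-layer tanh network, and projected gradient descent
# onto an l2 ball of radius R.
import jax
import jax.numpy as jnp

d, m, n = 2, 64, 1024                 # input dimension, hidden width, sample size
R, lr, num_iters = 10.0, 1e-3, 2000   # projection radius, step size, PGD iterations

def init_params(key):
    k1, k2, k3 = jax.random.split(key, 3)
    return {
        "W1": jax.random.normal(k1, (m, d)) / jnp.sqrt(d),
        "W2": jax.random.normal(k2, (m, m)) / jnp.sqrt(m),
        "a":  jax.random.normal(k3, (m,)) / jnp.sqrt(m),
    }

def u(params, x):
    # three-layer tanh network: two tanh hidden layers, linear output
    h1 = jnp.tanh(params["W1"] @ x)
    h2 = jnp.tanh(params["W2"] @ h1)
    return params["a"] @ h2

def f(x):
    # example right-hand side (an assumption, not the paper's test case)
    return jnp.prod(jnp.cos(jnp.pi * x))

def ritz_energy(params, xs):
    # Monte Carlo estimate of  E(u) = mean( |grad u|^2 / 2 + u^2 / 2 - f * u )
    grad_u = jax.vmap(jax.grad(u, argnums=1), in_axes=(None, 0))(params, xs)
    u_vals = jax.vmap(u, in_axes=(None, 0))(params, xs)
    f_vals = jax.vmap(f)(xs)
    return jnp.mean(0.5 * jnp.sum(grad_u**2, axis=1) + 0.5 * u_vals**2 - f_vals * u_vals)

def project(params):
    # project all parameters, jointly, back into the l2 ball of radius R
    leaves, treedef = jax.tree_util.tree_flatten(params)
    norm = jnp.sqrt(sum(jnp.sum(leaf**2) for leaf in leaves))
    scale = jnp.minimum(1.0, R / norm)
    return jax.tree_util.tree_unflatten(treedef, [leaf * scale for leaf in leaves])

@jax.jit
def pgd_step(params, xs):
    energy, grads = jax.value_and_grad(ritz_energy)(params, xs)
    params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
    return project(params), energy

params = init_params(jax.random.PRNGKey(0))
xs = jax.random.uniform(jax.random.PRNGKey(1), (n, d))   # interior Monte Carlo samples
for _ in range(num_iters):
    params, energy = pgd_step(params, xs)
```

The projection step is what makes this "projected" gradient descent: the parameters remain in a bounded set throughout training, which is the kind of constraint the paper's optimization and generalization analysis is built around.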
Related papers
- General-Kindred Physics-Informed Neural Network to the Solutions of Singularly Perturbed Differential Equations [11.121415128908566]
We propose the General-Kindred Physics-Informed Neural Network (GKPINN) for solving Singular Perturbation Differential Equations (SPDEs)
This approach utilizes prior knowledge of the boundary layer from the equation and establishes a novel network to assist PINN in approximating the boundary layer.
The research findings underscore the exceptional performance of the novel GKPINN approach, which reduces the $L_2$ error by two to four orders of magnitude compared to the established PINN methodology.
arXiv Detail & Related papers (2024-08-27T02:03:22Z) - A forward differential deep learning-based algorithm for solving high-dimensional nonlinear backward stochastic differential equations [0.6040014326756179]
We present a novel forward differential deep learning-based algorithm for solving high-dimensional nonlinear backward stochastic differential equations (BSDEs)
Motivated by the fact that differential deep learning can efficiently approximate the labels and their derivatives with respect to inputs, we transform the BSDE problem into a differential deep learning problem.
The main idea of our algorithm is to discretize the integrals using the Euler-Maruyama method and approximate the unknown discrete solution triple using three deep neural networks.
arXiv Detail & Related papers (2024-08-10T19:34:03Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape
of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of Deep Neural Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max- and Min-Cut, Max-$k$-CSP, Maximum-Weight-Bipartite-Matching, and the Traveling Salesman Problem.
As a byproduct of our analysis, we introduce a novel regularization process over vanilla gradient descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - Implicit Stochastic Gradient Descent for Training Physics-informed
Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
However, PINNs are prone to training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ an implicit stochastic gradient descent (ISGD) method to train PINNs in order to improve the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z) - Neural Basis Functions for Accelerating Solutions to High Mach Euler
Equations [63.8376359764052]
We propose an approach to solving partial differential equations (PDEs) using a set of neural networks.
We regress a set of neural networks onto a reduced order Proper Orthogonal Decomposition (POD) basis.
These networks are then used in combination with a branch network that ingests the parameters of the prescribed PDE to compute a reduced-order approximation to the PDE solution.
arXiv Detail & Related papers (2022-08-02T18:27:13Z) - Solving parametric partial differential equations with deep rectified
quadratic unit neural networks [38.16617079681564]
In this study, we investigate the expressive power of deep rectified quadratic unit (ReQU) neural networks for approximating the solution maps of parametric PDEs.
We derive an upper bound $\mathcal{O}\left(d^3 \log_2 q \log_2(1/\epsilon)\right)$ on the size of the deep ReQU neural network required to achieve accuracy $\epsilon$.
arXiv Detail & Related papers (2022-03-14T10:15:29Z) - Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via gradient descent.
We show that SGD is biased towards a simple solution.
We also provide empirical evidence that knots at locations distinct from the data points might occur.
arXiv Detail & Related papers (2021-11-03T15:14:20Z) - Proxy Convexity: A Unified Framework for the Analysis of Neural Networks
Trained by Gradient Descent [95.94432031144716]
We propose a unified non-convex optimization framework for the analysis of neural network training.
We show that existing guarantees for networks trained by gradient descent can be unified through this framework.
arXiv Detail & Related papers (2021-06-25T17:45:00Z) - Solving PDEs on Unknown Manifolds with Machine Learning [8.220217498103315]
This paper presents a mesh-free computational framework and machine learning theory for solving elliptic PDEs on unknown manifolds.
We show that the proposed NN solver can robustly generalize the PDE solution to new data points, with errors that are almost identical to those on the training data points.
arXiv Detail & Related papers (2021-06-12T03:55:15Z) - ODEN: A Framework to Solve Ordinary Differential Equations using
Artificial Neural Networks [0.0]
We propose a specific loss function, which does not require knowledge of the exact solution, to evaluate a neural network's performance; a minimal sketch of such a residual loss appears after this list.
Neural networks are shown to be proficient at approximating continuous solutions within their training domains.
A user-friendly and adaptable open-source code (ODE$\mathcal{N}$) is provided on GitHub.
arXiv Detail & Related papers (2020-05-28T15:34:10Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study a distributed stochastic algorithm for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires a much smaller number of communication rounds in theory.
Our experiments on several datasets show the effectiveness of our method and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
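As flagged in the ODEN entry above, here is a minimal sketch of a residual-type loss that evaluates an ODE-solving network without access to the exact solution, using only the differential equation and its initial condition. The test problem $u'(t) = -u(t)$, $u(0) = 1$, the small tanh network, and the collocation grid are illustrative assumptions; this is the generic residual formulation and is not claimed to match ODEN's exact loss.

```python
# Sketch of a residual loss for an ODE-solving network (assumed form,
# in the spirit of the ODEN entry): no exact solution is needed, only
# the ODE residual at collocation points plus the initial condition.
import jax
import jax.numpy as jnp

def u_theta(params, t):
    # small tanh network mapping a scalar time to a scalar state
    h = jnp.tanh(params["W1"] * t + params["b1"])
    return params["a"] @ h + params["c"]

def rhs(t, u):
    # right-hand side of the test ODE u'(t) = -u(t)
    return -u

def residual_loss(params, ts, t0=0.0, u0=1.0):
    du_dt = jax.vmap(jax.grad(u_theta, argnums=1), in_axes=(None, 0))(params, ts)
    u_vals = jax.vmap(u_theta, in_axes=(None, 0))(params, ts)
    ode_term = jnp.mean((du_dt - rhs(ts, u_vals)) ** 2)   # ODE residual
    ic_term = (u_theta(params, t0) - u0) ** 2             # initial condition
    return ode_term + ic_term

hidden = 32
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
params = {
    "W1": jax.random.normal(k1, (hidden,)),
    "b1": jnp.zeros(hidden),
    "a":  jax.random.normal(k2, (hidden,)) / jnp.sqrt(hidden),
    "c":  jnp.array(0.0),
}
ts = jnp.linspace(0.0, 2.0, 128)   # collocation points in the training domain
loss, grads = jax.value_and_grad(residual_loss)(params, ts)
```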
This list is automatically generated from the titles and abstracts of the papers on this site.