Solving multiscale elliptic problems by sparse radial basis function
neural networks
- URL: http://arxiv.org/abs/2309.03107v1
- Date: Fri, 1 Sep 2023 15:11:34 GMT
- Title: Solving multiscale elliptic problems by sparse radial basis function
neural networks
- Authors: Zhiwen Wang, Minxin Chen, Jingrun Chen
- Abstract summary: We propose a sparse radial basis function neural network method to solve elliptic partial differential equations (PDEs) with multiscale coefficients.
Inspired by the deep mixed residual method, we rewrite the second-order problem into a first-order system and employ multiple radial basis function neural networks (RBFNNs) to approximate unknown functions in the system.
The accuracy and effectiveness of the proposed method are demonstrated through a collection of multiscale problems with scale separation, discontinuity and multiple scales from one to three dimensions.
- Score: 3.5297361401370044
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning has been successfully applied to various fields of
scientific computing in recent years. In this work, we propose a sparse radial
basis function neural network method to solve elliptic partial differential
equations (PDEs) with multiscale coefficients. Inspired by the deep mixed
residual method, we rewrite the second-order problem into a first-order system
and employ multiple radial basis function neural networks (RBFNNs) to
approximate unknown functions in the system. To avoid overfitting due to
the simplicity of RBFNN, an additional regularization is introduced in the loss
function. Thus the loss function contains two parts: the $L_2$ loss for the
residual of the first-order system and boundary conditions, and the $\ell_1$
regularization term for the weights of radial basis functions (RBFs). An
algorithm for optimizing the specific loss function is introduced to accelerate
the training process. The accuracy and effectiveness of the proposed method are
demonstrated through a collection of multiscale problems with scale separation,
discontinuity and multiple scales from one to three dimensions. Notably, the
$\ell_1$ regularization can achieve the goal of representing the solution by
fewer RBFs. As a consequence, the total number of RBFs scales like
$\mathcal{O}(\varepsilon^{-n\tau})$, where $\varepsilon$ is the smallest scale,
$n$ is the dimensionality, and $\tau$ is typically smaller than $1$. It is
worth mentioning that the proposed method not only converges numerically, thus
providing a reliable numerical solution in three dimensions where a classical
method is typically not affordable, but also outperforms most
other available machine learning methods in terms of accuracy and robustness.
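The main ingredients described in the abstract (a first-order rewrite of the elliptic problem, RBF network approximants, and an $\ell_1$-regularized least-squares loss) can be illustrated with a short sketch. The following is a minimal one-dimensional illustration, not the authors' implementation: the model problem $-(a(x)u')' = f$ on $(0,1)$ with $u(0)=u(1)=0$, the coefficient $a$, the source $f$, the network sizes, the penalty weight, and the Adam optimizer are all illustrative assumptions. The second-order equation is rewritten as the first-order system $p = a u'$, $-p' = f$, each unknown is approximated by a Gaussian RBF network, and the loss combines the $L_2$ residuals of the system, the boundary term, and an $\ell_1$ penalty on the RBF weights.
```python
# A minimal sketch, assuming the 1D model problem -(a(x) u')' = f on (0, 1)
# with u(0) = u(1) = 0; NOT the authors' code. Coefficient, source, network
# sizes, optimizer and penalty weight are illustrative assumptions.
import torch

torch.manual_seed(0)

class GaussianRBFNet(torch.nn.Module):
    """u(x) ~= sum_j w_j exp(-(x - c_j)^2 / (2 s_j^2)) with trainable c, s, w."""
    def __init__(self, n_rbf=100):
        super().__init__()
        self.c = torch.nn.Parameter(torch.rand(n_rbf))                # centers
        self.log_s = torch.nn.Parameter(torch.full((n_rbf,), -2.0))   # log widths
        self.w = torch.nn.Parameter(0.01 * torch.randn(n_rbf))        # RBF weights

    def forward(self, x):                                   # x: (N, 1) -> (N,)
        phi = torch.exp(-(x - self.c) ** 2 / (2 * torch.exp(self.log_s) ** 2))
        return phi @ self.w

eps = 0.1
a = lambda x: 2.0 + torch.sin(2 * torch.pi * x / eps)       # multiscale coefficient
f = lambda x: torch.ones_like(x)                            # source term

u_net, p_net = GaussianRBFNet(), GaussianRBFNet()           # u and the flux p = a u'
opt = torch.optim.Adam(list(u_net.parameters()) + list(p_net.parameters()), lr=1e-3)
lam = 1e-4                                                  # l1 weight on RBF coefficients
xb = torch.tensor([[0.0], [1.0]])                           # boundary points

for step in range(5000):
    x = torch.rand(256, 1, requires_grad=True)              # interior collocation points
    u, p = u_net(x), p_net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0].squeeze(-1)
    dp = torch.autograd.grad(p.sum(), x, create_graph=True)[0].squeeze(-1)

    res_flux = p - a(x).squeeze(-1) * du                    # p - a(x) u' = 0
    res_cons = -dp - f(x).squeeze(-1)                       # -p' - f = 0
    loss = (res_flux ** 2).mean() + (res_cons ** 2).mean() \
         + (u_net(xb) ** 2).mean() \
         + lam * (u_net.w.abs().sum() + p_net.w.abs().sum())

    opt.zero_grad()
    loss.backward()
    opt.step()
```
After training, RBFs whose weights are driven to (near) zero by the $\ell_1$ term can be pruned, which is how the sparse representation with fewer RBFs mentioned in the abstract arises in this sketch.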
Related papers
- Bayesian Inference with Deep Weakly Nonlinear Networks [57.95116787699412]
We show at a physics level of rigor that Bayesian inference with a fully connected neural network is solvable.
We provide techniques to compute the model evidence and posterior to arbitrary order in $1/N$ and at arbitrary temperature.
arXiv Detail & Related papers (2024-05-26T17:08:04Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of over-parameterized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z) - A backward differential deep learning-based algorithm for solving high-dimensional nonlinear backward stochastic differential equations [0.6040014326756179]
We propose a novel backward differential deep learning-based algorithm for solving high-dimensional nonlinear backward stochastic differential equations.
The deep neural network (DNN) models are trained not only on the inputs and labels but also on the differentials of the corresponding labels.
arXiv Detail & Related papers (2024-04-12T13:05:35Z) - spred: Solving $L_1$ Penalty with SGD [6.2255027793924285]
We propose to minimize a differentiable objective with an $L_1$ penalty using a simple reparametrization (a minimal sketch of this reparametrization is given after this list).
We prove that the reparametrization trick is completely "benign" for a generic differentiable nonconvex function.
arXiv Detail & Related papers (2022-10-03T20:07:51Z) - Minimax Optimal Quantization of Linear Models: Information-Theoretic
Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper-bounds can be extended for two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z) - A Neural Network Ensemble Approach to System Identification [0.6445605125467573]
We present a new algorithm for learning unknown governing equations from trajectory data.
We approximate the function $f$ using an ensemble of neural networks.
arXiv Detail & Related papers (2021-10-15T21:45:48Z) - Fundamental tradeoffs between memorization and robustness in random
features and neural tangent regimes [15.76663241036412]
We prove for a large class of activation functions that, if the model memorizes even a fraction of the training data, then its Sobolev seminorm is lower-bounded.
Experiments reveal, for the first time, a multiple-descent phenomenon in the robustness of the min-norm interpolator.
arXiv Detail & Related papers (2021-06-04T17:52:50Z) - Beyond Lazy Training for Over-parameterized Tensor Decomposition [69.4699995828506]
We show that gradient descent on over-parametrized objective could go beyond the lazy training regime and utilize certain low-rank structure in the data.
arXiv Detail & Related papers (2020-10-22T00:32:12Z) - Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
arXiv Detail & Related papers (2020-02-10T23:23:04Z) - A Corrective View of Neural Networks: Representation, Memorization and
Learning [26.87238691716307]
We develop a corrective mechanism for neural network approximation.
We show that two-layer neural networks in the random features regime (RF) can memorize arbitrary labels.
We also consider three-layer neural networks and show that the corrective mechanism yields faster representation rates for smooth radial functions.
arXiv Detail & Related papers (2020-02-01T20:51:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.