Solving multiscale elliptic problems by sparse radial basis function
neural networks
- URL: http://arxiv.org/abs/2309.03107v1
- Date: Fri, 1 Sep 2023 15:11:34 GMT
- Title: Solving multiscale elliptic problems by sparse radial basis function
neural networks
- Authors: Zhiwen Wang, Minxin Chen, Jingrun Chen
- Abstract summary: We propose a sparse radial basis function neural network method to solve elliptic partial differential equations (PDEs) with multiscale coefficients.
Inspired by the deep mixed residual method, we rewrite the second-order problem into a first-order system and employ multiple radial basis function neural networks (RBFNNs) to approximate unknown functions in the system.
The accuracy and effectiveness of the proposed method are demonstrated through a collection of multiscale problems with scale separation, discontinuity and multiple scales from one to three dimensions.
- Score: 3.5297361401370044
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning has been successfully applied to various fields of
scientific computing in recent years. In this work, we propose a sparse radial
basis function neural network method to solve elliptic partial differential
equations (PDEs) with multiscale coefficients. Inspired by the deep mixed
residual method, we rewrite the second-order problem into a first-order system
and employ multiple radial basis function neural networks (RBFNNs) to
approximate unknown functions in the system. To avoid overfitting due to
the simplicity of RBFNN, an additional regularization is introduced in the loss
function. Thus the loss function contains two parts: the $L_2$ loss for the
residual of the first-order system and boundary conditions, and the $\ell_1$
regularization term for the weights of radial basis functions (RBFs). An
algorithm for optimizing the specific loss function is introduced to accelerate
the training process. The accuracy and effectiveness of the proposed method are
demonstrated through a collection of multiscale problems with scale separation,
discontinuity and multiple scales from one to three dimensions. Notably, the
$\ell_1$ regularization can achieve the goal of representing the solution by
fewer RBFs. As a consequence, the total number of RBFs scales like
$\mathcal{O}(\varepsilon^{-n\tau})$, where $\varepsilon$ is the smallest scale,
$n$ is the dimensionality, and $\tau$ is typically smaller than $1$. It is
worth mentioning that the proposed method not only converges numerically, thus
providing a reliable numerical solution in three dimensions where a classical
method is typically not affordable, but also outperforms most
other available machine learning methods in terms of accuracy and robustness.
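The main ingredients described in the abstract (a first-order rewrite of the elliptic problem, RBF network approximants, and an $\ell_1$-regularized least-squares loss) can be illustrated with a short sketch. The following is a minimal one-dimensional illustration, not the authors' implementation: the model problem $-(a(x)u')' = f$ on $(0,1)$ with $u(0)=u(1)=0$, the coefficient $a$, the source $f$, the network sizes, the penalty weight, and the Adam optimizer are all illustrative assumptions. The second-order equation is rewritten as the first-order system $p = a u'$, $-p' = f$, each unknown is approximated by a Gaussian RBF network, and the loss combines the $L_2$ residuals of the system, the boundary term, and an $\ell_1$ penalty on the RBF weights.
```python
# A minimal sketch, assuming the 1D model problem -(a(x) u')' = f on (0, 1)
# with u(0) = u(1) = 0; NOT the authors' code. Coefficient, source, network
# sizes, optimizer and penalty weight are illustrative assumptions.
import torch

torch.manual_seed(0)

class GaussianRBFNet(torch.nn.Module):
    """u(x) ~= sum_j w_j exp(-(x - c_j)^2 / (2 s_j^2)) with trainable c, s, w."""
    def __init__(self, n_rbf=100):
        super().__init__()
        self.c = torch.nn.Parameter(torch.rand(n_rbf))                # centers
        self.log_s = torch.nn.Parameter(torch.full((n_rbf,), -2.0))   # log widths
        self.w = torch.nn.Parameter(0.01 * torch.randn(n_rbf))        # RBF weights

    def forward(self, x):                                   # x: (N, 1) -> (N,)
        phi = torch.exp(-(x - self.c) ** 2 / (2 * torch.exp(self.log_s) ** 2))
        return phi @ self.w

eps = 0.1
a = lambda x: 2.0 + torch.sin(2 * torch.pi * x / eps)       # multiscale coefficient
f = lambda x: torch.ones_like(x)                            # source term

u_net, p_net = GaussianRBFNet(), GaussianRBFNet()           # u and the flux p = a u'
opt = torch.optim.Adam(list(u_net.parameters()) + list(p_net.parameters()), lr=1e-3)
lam = 1e-4                                                  # l1 weight on RBF coefficients
xb = torch.tensor([[0.0], [1.0]])                           # boundary points

for step in range(5000):
    x = torch.rand(256, 1, requires_grad=True)              # interior collocation points
    u, p = u_net(x), p_net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0].squeeze(-1)
    dp = torch.autograd.grad(p.sum(), x, create_graph=True)[0].squeeze(-1)

    res_flux = p - a(x).squeeze(-1) * du                    # p - a(x) u' = 0
    res_cons = -dp - f(x).squeeze(-1)                       # -p' - f = 0
    loss = (res_flux ** 2).mean() + (res_cons ** 2).mean() \
         + (u_net(xb) ** 2).mean() \
         + lam * (u_net.w.abs().sum() + p_net.w.abs().sum())

    opt.zero_grad()
    loss.backward()
    opt.step()
```
After training, RBFs whose weights are driven to (near) zero by the $\ell_1$ term can be pruned, which is how the sparse representation with fewer RBFs mentioned in the abstract arises in this sketch.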
Related papers
- Bayesian Inference with Deep Weakly Nonlinear Networks [57.95116787699412]
We show at a physics level of rigor that Bayesian inference with a fully connected neural network is solvable.
We provide techniques to compute the model evidence and posterior to arbitrary order in $1/N$ and at arbitrary temperature.
arXiv Detail & Related papers (2024-05-26T17:08:04Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of over-parameterized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z) - A backward differential deep learning-based algorithm for solving high-dimensional nonlinear backward stochastic differential equations [0.6040014326756179]
We propose a novel backward differential deep learning-based algorithm for solving high-dimensional nonlinear backward stochastic differential equations.
The deep neural network (DNN) models are trained not only on the inputs and labels but also on the differentials of the corresponding labels.
arXiv Detail & Related papers (2024-04-12T13:05:35Z) - spred: Solving $L_1$ Penalty with SGD [6.2255027793924285]
We propose to minimize a differentiable objective with an $L_1$ penalty using a simple reparametrization (a minimal sketch of this reparametrization is given after this list).
We prove that the reparametrization trick is completely "benign" for a generic differentiable nonconvex function.
arXiv Detail & Related papers (2022-10-03T20:07:51Z) - Minimax Optimal Quantization of Linear Models: Information-Theoretic
Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper-bounds can be extended for two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z) - A Neural Network Ensemble Approach to System Identification [0.6445605125467573]
We present a new algorithm for learning unknown governing equations from trajectory data.
We approximate the function $f$ using an ensemble of neural networks.
arXiv Detail & Related papers (2021-10-15T21:45:48Z) - Fundamental tradeoffs between memorization and robustness in random
features and neural tangent regimes [15.76663241036412]
We prove for a large class of activation functions that, if the model memorizes even a fraction of the training data, then its Sobolev seminorm is lower-bounded.
Experiments reveal, for the first time, a multiple-descent phenomenon in the robustness of the min-norm interpolator.
arXiv Detail & Related papers (2021-06-04T17:52:50Z) - Beyond Lazy Training for Over-parameterized Tensor Decomposition [69.4699995828506]
We show that gradient descent on over-parametrized objective could go beyond the lazy training regime and utilize certain low-rank structure in the data.
arXiv Detail & Related papers (2020-10-22T00:32:12Z) - Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
arXiv Detail & Related papers (2020-02-10T23:23:04Z) - A Corrective View of Neural Networks: Representation, Memorization and
Learning [26.87238691716307]
We develop a corrective mechanism for neural network approximation.
We show that two-layer neural networks in the random features regime (RF) can memorize arbitrary labels.
We also consider three-layer neural networks and show that the corrective mechanism yields faster representation rates for smooth radial functions.
arXiv Detail & Related papers (2020-02-01T20:51:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.