Advantage of Deep Neural Networks for Estimating Functions with
Singularity on Hypersurfaces
- URL: http://arxiv.org/abs/2011.02256v2
- Date: Tue, 8 Feb 2022 17:38:08 GMT
- Title: Advantage of Deep Neural Networks for Estimating Functions with
Singularity on Hypersurfaces
- Authors: Masaaki Imaizumi, Kenji Fukumizu
- Abstract summary: We develop a minimax rate analysis to describe the reason that deep neural networks (DNNs) perform better than other standard methods.
This study tries to fill this gap by considering the estimation for a class of non-smooth functions that have singularities on hypersurfaces.
- Score: 23.21591478556582
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a minimax rate analysis to describe the reason that deep neural
networks (DNNs) perform better than other standard methods. For nonparametric
regression problems, it is well known that many standard methods attain the
minimax optimal rate of estimation errors for smooth functions, and thus, it is
not straightforward to identify the theoretical advantages of DNNs. This study
tries to fill this gap by considering the estimation for a class of non-smooth
functions that have singularities on hypersurfaces. Our findings are as
follows: (i) We derive the generalization error of a DNN estimator and prove
that its convergence rate is almost optimal. (ii) We elucidate a phase diagram
of estimation problems, which describes the situations where the DNNs
outperform a general class of estimators, including kernel methods, Gaussian
process methods, and others. We additionally show that DNNs outperform
harmonic-analysis-based estimators. This advantage of DNNs comes from the fact
that the shape of a singularity can be successfully handled by their
multi-layered structure.
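To make the function class concrete, here is a hedged NumPy sketch (illustrative only, not the authors' construction): a target that is smooth away from the hyperplane {x : a·x = b} but jumps across it, plus a two-ReLU unit that approximates the half-space indicator, which hints at why a multi-layered ReLU structure can capture the shape of such singularities.

```python
import numpy as np

# Hedged illustration (not from the paper): a piecewise-smooth target with a
# jump singularity on the hyperplane {x : a.x = b}, the kind of non-smooth
# function the abstract studies, plus a two-ReLU unit approximating the
# half-space indicator. All names here are hypothetical.

def relu(z):
    return np.maximum(z, 0.0)

def halfspace_indicator_relu(x, a, b=0.0, t=1000.0):
    """Approximate 1{a.x - b > 0} by the ramp relu(t*u) - relu(t*u - 1)."""
    u = x @ a - b
    return relu(t * u) - relu(t * u - 1.0)

def target(x, a, b=0.0):
    """Smooth pieces (sin, cos) glued along the hyperplane a.x = b."""
    jump = (x @ a - b > 0).astype(float)
    return jump * np.sin(x[:, 0]) + (1.0 - jump) * np.cos(x[:, 1])

a = np.array([1.0, 1.0])
X = np.array([[0.5, 0.2], [-0.3, -0.4], [0.1, -0.6], [-0.2, 0.9]])
approx = halfspace_indicator_relu(X, a)
exact = (X @ a > 0).astype(float)
print(np.max(np.abs(approx - exact)))  # exact away from a width-1/t band around the plane
y = target(X, a)
```

Loosely, the width-1/t ramp is expressible with a single hidden ReLU layer and can be sharpened arbitrarily, whereas a fixed-bandwidth kernel smoother necessarily blurs the jump, which is the intuition behind the phase diagram described above.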
Related papers
- Efficient kernel surrogates for neural network-based regression [0.8030359871216615]
We study the performance of the Conjugate Kernel (CK), an efficient approximation to the Neural Tangent Kernel (NTK).
We show that the CK performance is only marginally worse than that of the NTK and, in certain cases, can even be superior.
In addition to providing a theoretical grounding for using CKs instead of NTKs, our framework suggests a recipe for improving DNN accuracy inexpensively.
arXiv Detail & Related papers (2023-10-28T06:41:47Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
However, PINNs can be trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ an implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Non-Asymptotic Performance Guarantees for Neural Estimation of $\mathsf{f}$-Divergences [22.496696555768846]
Statistical distances quantify the dissimilarity between probability distributions.
A modern method for estimating such distances from data relies on parametrizing a variational form by a neural network (NN) and optimizing it.
This paper explores this tradeoff by means of non-asymptotic error bounds, focusing on three popular choices of statistical distances (SDs).
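For context on the variational method described in the summary above, one standard instance (well known in the literature, not specific to this paper's bounds) is the Donsker–Varadhan representation of the KL divergence; the neural estimator replaces the supremum over all test functions $T$ with a maximization over a neural network class:

```latex
D_{\mathrm{KL}}(P \,\|\, Q)
  \;=\; \sup_{T:\,\mathcal{X}\to\mathbb{R}}
  \Big( \mathbb{E}_{P}\!\left[T(X)\right] \;-\; \log \mathbb{E}_{Q}\!\left[e^{T(X)}\right] \Big)
```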
arXiv Detail & Related papers (2021-03-11T19:47:30Z)
- Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
- Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks [60.22494363676747]
It is known that current graph neural networks (GNNs) are difficult to make deep due to the problem known as over-smoothing.
Multi-scale GNNs are a promising approach for mitigating the over-smoothing problem.
We derive the optimization and generalization guarantees of transductive learning algorithms that include multi-scale GNNs.
arXiv Detail & Related papers (2020-06-15T17:06:17Z)
- Nonconvex sparse regularization for deep neural networks and its optimality [1.9798034349981162]
Deep neural network (DNN) estimators can attain optimal convergence rates for regression and classification problems.
We propose a novel penalized estimation method for sparse DNNs.
We prove that the sparse-penalized estimator can adaptively attain minimax convergence rates for various nonparametric regression problems.
arXiv Detail & Related papers (2020-03-26T07:15:28Z)
- Interval Neural Networks: Uncertainty Scores [11.74565957328407]
We propose a fast, non-Bayesian method for producing uncertainty scores in the output of pre-trained deep neural networks (DNNs).
This interval neural network (INN) has interval valued parameters and propagates its input using interval arithmetic.
In numerical experiments on an image reconstruction task, we demonstrate the practical utility of INNs as a proxy for the prediction error.
arXiv Detail & Related papers (2020-03-25T18:03:51Z)
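The interval-arithmetic propagation used by INNs (and by the interval bound propagation mentioned in the reachability entry above) can be sketched for a single affine layer; this is a generic textbook step under stated assumptions, not the papers' code:

```python
import numpy as np

# Hedged sketch: sound elementwise bounds on W x + b over the box lo <= x <= hi,
# computed in center-radius form; |W| absorbs the worst-case deviation.

def affine_interval(W, b, lo, hi):
    """Return (out_lo, out_hi) bounding W x + b for every x in [lo, hi]."""
    center = (lo + hi) / 2.0
    radius = (hi - lo) / 2.0
    mid = W @ center + b
    rad = np.abs(W) @ radius
    return mid - rad, mid + rad

W = np.array([[1.0, -2.0], [0.5, 3.0]])
b = np.array([0.0, 1.0])
lo = np.array([-1.0, 0.0])
hi = np.array([1.0, 1.0])
out_lo, out_hi = affine_interval(W, b, lo, hi)
print(out_lo, out_hi)  # every W x + b with x in the box lies inside these bounds
```

Stacking such steps layer by layer (with monotone activations like ReLU applied endpoint-wise) yields the interval-valued forward pass that both the INN uncertainty scores and interval bound propagation build on.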
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.