Dimension-independent learning rates for high-dimensional classification
problems
- URL: http://arxiv.org/abs/2409.17991v1
- Date: Thu, 26 Sep 2024 16:02:13 GMT
- Title: Dimension-independent learning rates for high-dimensional classification
problems
- Authors: Andres Felipe Lerma-Pineda, Philipp Petersen, Simon Frieder, Thomas
Lukasiewicz
- Abstract summary: We show that every $RBV^2$ function can be approximated by a neural network with bounded weights.
We then prove the existence of a neural network with bounded weights approximating a classification function.
- Score: 53.622581586464634
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of approximating and estimating classification functions
that have their decision boundary in the $RBV^2$ space. Functions of $RBV^2$
type arise naturally as solutions of regularized neural network learning
problems and neural networks can approximate these functions without the curse
of dimensionality. We modify existing results to show that every $RBV^2$
function can be approximated by a neural network with bounded weights.
Thereafter, we prove the existence of a neural network with bounded weights
approximating a classification function, and we leverage these bounds to
quantify the estimation rates. Finally, we present a numerical study that
analyzes the effect of different regularity conditions on the decision
boundaries.
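For context, here is a minimal recap of the $RBV^2$ setting as it is commonly defined in the literature; the notation below is our assumption, and the paper may differ in normalization. The seminorm measures second-order total variation in the Radon domain,
$$ \mathrm{R\,TV}^2(f) = c_d \left\| \partial_t^2 \Lambda^{d-1} \mathcal{R} f \right\|_{\mathcal{M}(\mathbb{S}^{d-1} \times \mathbb{R})}, $$
where $\mathcal{R}$ is the Radon transform, $\Lambda^{d-1}$ is a ramp filter, and $\mathcal{M}$ denotes the space of finite Radon measures. Functions with finite seminorm admit shallow ReLU representations of the form $f(x) = \sum_k v_k \rho(w_k^\top x - b_k) + u^\top x + s$, which is why they arise as solutions of weight-decay-regularized two-layer network training problems.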
Related papers
- A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparametrized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z)
- Optimized classification with neural ODEs via separability [0.0]
Classification of $N$ points becomes a simultaneous control problem when viewed through the lens of neural ordinary differential equations (neural ODEs).
In this study, we focus on estimating the number of neurons required for efficient cluster-based classification.
We propose a new constructive algorithm that simultaneously classifies clusters of $d$ points from any initial configuration.
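As a rough illustration of the neural-ODE view of classification, here is a minimal sketch with forward-Euler dynamics and arbitrary parameters; it is not the paper's constructive algorithm, and all sizes and the readout are illustrative assumptions.

import numpy as np

# Minimal sketch of a neural-ODE classifier via forward-Euler discretization.
# x' = tanh(A(t) x + b(t)); after the flow, a fixed linear readout labels points.
# The controls A_k, b_k are random here; in practice they would be trained
# or constructed to steer each class to a separable configuration.

rng = np.random.default_rng(0)
d, L, h = 2, 10, 0.1                      # state dim, Euler steps, step size
A = rng.normal(size=(L, d, d))            # piecewise-constant controls A_k
b = rng.normal(size=(L, d))               # piecewise-constant controls b_k

def flow(x):
    """Integrate the neural ODE with forward Euler."""
    for k in range(L):
        x = x + h * np.tanh(x @ A[k].T + b[k])
    return x

X = rng.normal(size=(5, d))               # five points to classify
labels = (flow(X) @ np.array([1.0, 0.0]) > 0).astype(int)  # hyperplane readout
print(labels)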
arXiv Detail & Related papers (2023-12-21T12:56:40Z)
- Benign Overfitting for Two-layer ReLU Convolutional Neural Networks [60.19739010031304]
We establish algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise.
We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk.
arXiv Detail & Related papers (2023-03-07T18:59:38Z)
- Optimal learning of high-dimensional classification problems using deep neural networks [0.0]
We study the problem of learning classification functions from noiseless training samples, under the assumption that the decision boundary is of a certain regularity.
For the class of locally Barron-regular decision boundaries, we find that the optimal estimation rates are essentially independent of the underlying dimension.
arXiv Detail & Related papers (2021-12-23T14:15:10Z)
- Sobolev-type embeddings for neural network approximation spaces [5.863264019032882]
We consider neural network approximation spaces that classify functions according to the rate at which they can be approximated.
We prove embedding theorems between these spaces for different values of $p$.
We find that, analogous to the case of classical function spaces, it is possible to trade "smoothness" (i.e., approximation rate) for increased integrability.
arXiv Detail & Related papers (2021-10-28T17:11:38Z)
- Near-Minimax Optimal Estimation With Shallow ReLU Neural Networks [19.216784367141972]
We study the problem of estimating an unknown function from noisy data using shallow (single-hidden layer) ReLU neural networks.
We quantify the performance of these neural network estimators when the data-generating function belongs to the space of functions of second-order bounded variation in the Radon domain.
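A toy version of this estimation setting: noisy 1-D samples fit with a single-hidden-layer ReLU network. For simplicity only the outer weights are trained (a random-features simplification, not the paper's estimator), and the width, target function, and step size are illustrative assumptions.

import numpy as np

# Toy instance of the estimation problem: noisy samples of an unknown 1-D
# function, fit with a single-hidden-layer ReLU network by gradient descent
# on the outer weights only.

rng = np.random.default_rng(1)
n, width = 200, 50
x = rng.uniform(-1, 1, size=(n, 1))
y = np.sin(3 * x) + 0.1 * rng.normal(size=(n, 1))   # noisy observations

W = rng.normal(size=(1, width))
b = rng.normal(size=width)
v = np.zeros((width, 1))

for _ in range(2000):
    H = np.maximum(x @ W + b, 0.0)       # hidden ReLU features, shape (n, width)
    r = H @ v - y                        # residuals
    v -= 0.1 * (H.T @ r) / n             # gradient step on the outer weights
print("train MSE:", float(np.mean((H @ v - y) ** 2)))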
arXiv Detail & Related papers (2021-09-18T05:56:06Z)
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
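The statement invites a quick numerical check: lift two concentric, non-linearly-separable classes through one random ReLU layer with standard Gaussian weights and uniformly distributed biases, then test linear separability with a perceptron. The data, width, and bias range below are illustrative assumptions.

import numpy as np

# Illustrative check of random-feature separation: two concentric circles are
# not linearly separable in the plane, but become separable after a random
# ReLU layer with Gaussian weights and uniform biases.

rng = np.random.default_rng(2)
t = rng.uniform(0, 2 * np.pi, 100)
inner = np.c_[0.5 * np.cos(t), 0.5 * np.sin(t)]   # class 0: small circle
outer = np.c_[1.5 * np.cos(t), 1.5 * np.sin(t)]   # class 1: large circle
X = np.vstack([inner, outer])
y = np.r_[np.zeros(100), np.ones(100)]

m = 500                                            # hidden width
W = rng.normal(size=(2, m))                        # standard Gaussian weights
b = rng.uniform(-2, 2, size=m)                     # uniformly distributed biases
Phi = np.maximum(X @ W + b, 0.0)                   # random ReLU features

# A perceptron on the lifted data converges iff the classes are separable.
w = np.zeros(m)
for _ in range(100):
    for phi, label in zip(Phi, 2 * y - 1):
        if label * (phi @ w) <= 0:
            w += label * phi
errors = np.sum((Phi @ w > 0) != y)
print("misclassified after lifting:", errors)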
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
- The Rate of Convergence of Variation-Constrained Deep Neural Networks [35.393855471751756]
We show that a class of variation-constrained neural networks can achieve near-parametric rate $n^{-1/2+\delta}$ for an arbitrarily small constant $\delta$.
The result indicates that the neural function space needed for approximating smooth functions may not be as large as what is often perceived.
arXiv Detail & Related papers (2021-06-22T21:28:00Z)
- Conditional physics informed neural networks [85.48030573849712]
We introduce conditional PINNs (physics informed neural networks) for estimating the solution of classes of eigenvalue problems.
We show that a single deep neural network can learn the solution of partial differential equations for an entire class of problems.
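A minimal sketch of the conditional-PINN mechanism in PyTorch: one network takes both the coordinate and a problem parameter, so a single model covers a family of equations. The family below (a parametrized 1-D Poisson problem rather than the paper's eigenvalue problems), the architecture, and all hyperparameters are our assumptions.

import torch

# Conditional-PINN sketch: one network u_theta(x, k) trained so that
# u'' = k * sin(pi x) with u(0) = u(1) = 0 holds for a whole range of k.
torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.rand(256, 1, requires_grad=True)      # interior collocation points
    k = torch.rand(256, 1) * 4 - 2                  # condition k in [-2, 2]
    u = net(torch.cat([x, k], dim=1))
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    residual = d2u - k * torch.sin(torch.pi * x)    # PDE residual
    xb = torch.tensor([[0.0], [1.0]]).repeat(128, 1)
    kb = torch.rand(256, 1) * 4 - 2
    ub = net(torch.cat([xb, kb], dim=1))            # boundary values, should be 0
    loss = residual.pow(2).mean() + ub.pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
print("final loss:", float(loss))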
arXiv Detail & Related papers (2021-04-06T18:29:14Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)