Besov Function Approximation and Binary Classification on
Low-Dimensional Manifolds Using Convolutional Residual Networks
- URL: http://arxiv.org/abs/2109.02832v1
- Date: Tue, 7 Sep 2021 02:58:11 GMT
- Title: Besov Function Approximation and Binary Classification on
Low-Dimensional Manifolds Using Convolutional Residual Networks
- Authors: Hao Liu, Minshuo Chen, Tuo Zhao, Wenjing Liao
- Abstract summary: We establish theoretical guarantees of convolutional residual networks (ConvResNet) in terms of function approximation and statistical estimation for binary classification.
Our results demonstrate that ConvResNets are adaptive to low-dimensional structures of data sets.
- Score: 42.43493635899849
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most existing statistical theories on deep neural networks have sample
complexities cursed by the data dimension and therefore cannot fully explain the
empirical success of deep learning on high-dimensional data. To bridge this
gap, we propose to exploit low-dimensional geometric structures of real-world
data sets. We establish theoretical guarantees of convolutional residual
networks (ConvResNet) in terms of function approximation and statistical
estimation for binary classification. Specifically, given the data lying on a
$d$-dimensional manifold isometrically embedded in $\mathbb{R}^D$, we prove
that if the network architecture is properly chosen, ConvResNets can (1)
approximate Besov functions on manifolds with arbitrary accuracy, and (2) learn
a classifier by minimizing the empirical logistic risk, which gives an excess
risk in the order of $n^{-\frac{s}{2s+2(s\vee d)}}$, where $s$ is a smoothness
parameter. This implies that the sample complexity depends on the intrinsic
dimension $d$, instead of the data dimension $D$. Our results demonstrate that
ConvResNets are adaptive to low-dimensional structures of data sets.
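The paper's constructions prescribe exact depths, widths, and filter sizes tied to the smoothness $s$ and intrinsic dimension $d$; those values are not reproduced here. Below is only a minimal, hypothetical PyTorch sketch of the overall shape described in the abstract: residual convolutional blocks followed by a fully connected head, trained by minimizing the empirical logistic risk. All hyperparameters (channels=16, num_blocks=4, kernel_size=3, input_dim=64) are illustrative assumptions.

```python
# Minimal, hypothetical ConvResNet sketch for binary classification (PyTorch).
# Hyperparameters are illustrative assumptions, not the values from the theorems.
import torch
import torch.nn as nn

class ConvResBlock(nn.Module):
    """One residual block: two 1-D convolutions with an identity skip connection."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2  # "same" padding so the skip connection shapes match
        self.conv1 = nn.Conv1d(channels, channels, kernel_size, padding=pad)
        self.conv2 = nn.Conv1d(channels, channels, kernel_size, padding=pad)
        self.act = nn.ReLU()

    def forward(self, x):
        h = self.act(self.conv1(x))
        h = self.conv2(h)
        return self.act(x + h)  # residual connection

class ConvResNetClassifier(nn.Module):
    """Stack of ConvResBlocks followed by a fully connected output layer."""
    def __init__(self, input_dim: int, channels: int = 16, num_blocks: int = 4):
        super().__init__()
        self.lift = nn.Conv1d(1, channels, kernel_size=1)  # lift input to `channels` feature maps
        self.blocks = nn.Sequential(*[ConvResBlock(channels) for _ in range(num_blocks)])
        self.head = nn.Linear(channels * input_dim, 1)      # single logit for binary classification

    def forward(self, x):
        # x: (batch, D) points in the ambient space R^D
        h = x.unsqueeze(1)             # -> (batch, 1, D)
        h = self.blocks(self.lift(h))  # -> (batch, channels, D)
        return self.head(h.flatten(1)) # -> (batch, 1) logit

# Empirical logistic risk, as in the paper's estimation result:
# BCEWithLogitsLoss is the logistic (cross-entropy) loss on a single logit.
model = ConvResNetClassifier(input_dim=64)
loss_fn = nn.BCEWithLogitsLoss()
x, y = torch.randn(8, 64), torch.randint(0, 2, (8, 1)).float()
loss = loss_fn(model(x), y)

# Excess-risk exponent from the abstract, n^{-s/(2s + 2*max(s, d))}:
# it depends on the intrinsic dimension d, not the ambient dimension D.
rate_exponent = lambda s, d: s / (2 * s + 2 * max(s, d))
```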
Related papers
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
- Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds [75.51968172401394]
We study the sample complexity of the neural policy mirror descent (NPMD) algorithm with deep convolutional neural networks (CNNs).
In each iteration of NPMD, both the value function and the policy can be well approximated by CNNs.
We show that NPMD can leverage the low-dimensional structure of state space to escape from the curse of dimensionality.
arXiv Detail & Related papers (2023-09-25T07:31:22Z)
- Self-Supervised Scalable Deep Compressed Sensing [24.854496459622787]
Compressed sensing is a promising tool for reducing sampling costs.
Current deep neural network (NN)-based CS methods face the challenge of collecting labeled measurement-ground truth (GT) data.
This paper proposes a novel $\mathbf{S}$elf-supervised s$\mathbf{C}$alable deep CS method.
arXiv Detail & Related papers (2023-08-26T06:03:06Z)
- Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories [70.90012822736988]
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to intrinsic data structures.
This paper introduces a relaxed assumption that the input data are concentrated around a subset of $\mathbb{R}^d$ denoted by $\mathcal{S}$, and the intrinsic dimension of $\mathcal{S}$ can be characterized by a new complexity notion -- the effective Minkowski dimension.
arXiv Detail & Related papers (2023-06-26T17:13:31Z)
- Distributed Sparse Feature Selection in Communication-Restricted Networks [6.9257380648471765]
We propose and theoretically analyze a new distributed scheme for sparse linear regression and feature selection.
In order to infer the causal dimensions from the whole dataset, we propose a simple, yet effective method for information sharing in the network.
arXiv Detail & Related papers (2021-11-02T05:02:24Z)
- Theory of Deep Convolutional Neural Networks III: Approximating Radial Functions [7.943024117353317]
We consider a family of deep neural networks consisting of two groups of convolutional layers, a down operator, and a fully connected layer.
The network structure depends on two structural parameters which determine the numbers of convolutional layers and the width of the fully connected layer.
arXiv Detail & Related papers (2021-07-02T08:22:12Z)
- Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
- A Local Similarity-Preserving Framework for Nonlinear Dimensionality Reduction with Neural Networks [56.068488417457935]
We propose a novel local nonlinear approach named Vec2vec for general purpose dimensionality reduction.
To train the neural network, we build the neighborhood similarity graph of a matrix and define the context of data points.
Experiments on data classification and clustering with eight real-world datasets show that Vec2vec outperforms several classical dimensionality reduction methods under statistical hypothesis testing.
arXiv Detail & Related papers (2021-03-10T23:10:47Z)