Two-layer neural networks with values in a Banach space
- URL: http://arxiv.org/abs/2105.02095v1
- Date: Wed, 5 May 2021 14:54:24 GMT
- Title: Two-layer neural networks with values in a Banach space
- Authors: Yury Korolev
- Abstract summary: We study two-layer neural networks whose domain and range are Banach spaces with separable preduals.
As the nonlinearity we choose the lattice operation of taking the positive part; in case of $\mathbb{R}^d$-valued neural networks this corresponds to the ReLU activation function.
- Score: 1.90365714903665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study two-layer neural networks whose domain and range are Banach spaces
with separable preduals. In addition, we assume that the image space is
equipped with a partial order, i.e. it is a Riesz space. As the nonlinearity we
choose the lattice operation of taking the positive part; in case of $\mathbb
R^d$-valued neural networks this corresponds to the ReLU activation function.
We prove inverse and direct approximation theorems with Monte-Carlo rates,
extending existing results for the finite-dimensional case. In the second part
of the paper, we consider training such networks using a finite amount of noisy
observations from the regularisation theory viewpoint. We discuss regularity
conditions known as source conditions and obtain convergence rates in a Bregman
distance in the regime when both the noise level goes to zero and the number of
samples goes to infinity at appropriate rates.
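For orientation, a hedged sketch of the objects named in the abstract (our notation, not necessarily the paper's): a width-$n$ two-layer ReLU network in the familiar finite-dimensional setting is
$$ f_n(x) \;=\; \sum_{k=1}^{n} c_k\,\bigl(\langle w_k, x\rangle + b_k\bigr)_+ , $$
and the Monte-Carlo rate refers to an approximation error of order $n^{-1/2}$ in the width $n$. In the paper the domain and range are Banach spaces with separable preduals, and the ReLU is replaced by the positive part $y \mapsto y \vee 0$ taken in the lattice order of the Riesz-space-valued range, which reduces to componentwise ReLU when the range is $\mathbb{R}^d$. For a convex regularisation functional $J$ and a subgradient $p \in \partial J(v)$, the (generalised) Bregman distance in which the rates of the second part are stated is
$$ D_J^{p}(u, v) \;=\; J(u) - J(v) - \langle p,\, u - v\rangle . $$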
Related papers
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
- Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z)
- A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks [49.870593940818715]
We study the infinite-width limit of a type of three-layer NN model whose first layer is random and fixed.
Our theory accommodates different scaling choices of the model, resulting in two regimes of the MF limit that demonstrate distinctive behaviors.
arXiv Detail & Related papers (2022-10-28T17:26:27Z)
- Tangent Bundle Filters and Neural Networks: from Manifolds to Cellular Sheaves and Back [114.01902073621577]
We use the convolution to define tangent bundle filters and tangent bundle neural networks (TNNs).
We discretize TNNs both in space and time domains, showing that their discrete counterpart is a principled variant of the recently introduced Sheaf Neural Networks.
We numerically evaluate the effectiveness of the proposed architecture on a denoising task of a tangent vector field over the unit 2-sphere.
arXiv Detail & Related papers (2022-10-26T21:55:45Z)
- On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
- Sobolev-type embeddings for neural network approximation spaces [5.863264019032882]
We consider neural network approximation spaces that classify functions according to the rate at which they can be approximated.
We prove embedding theorems between these spaces for different values of $p$.
We find that, analogous to the case of classical function spaces, it is possible to trade "smoothness" (i.e., approximation rate) for increased integrability (the classical statement is recalled after this entry).
arXiv Detail & Related papers (2021-10-28T17:11:38Z)
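The classical trade-off that the entry above alludes to is the Sobolev embedding theorem; one standard instance, recalled here purely for comparison, is
$$ W^{k,p}(\Omega) \;\hookrightarrow\; L^{q}(\Omega), \qquad 1 \le q \le \frac{dp}{d - kp}, \quad kp < d, $$
for a bounded Lipschitz domain $\Omega \subset \mathbb{R}^d$: giving up derivatives ("smoothness") buys a larger integrability exponent. The cited paper proves embeddings of the same flavour between neural network approximation spaces.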
- Linear approximability of two-layer neural networks: A comprehensive analysis based on spectral decay [4.042159113348107]
We first consider the case of a single neuron and show that the linear approximability, quantified by the Kolmogorov width (a standard definition is recalled after this entry), is controlled by the eigenvalue decay of an associated kernel.
We show that similar results also hold for two-layer neural networks.
arXiv Detail & Related papers (2021-08-10T23:30:29Z)
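For reference, the Kolmogorov width appearing in the entry above (and in the "Kolmogorov Width Decay" entry below) measures how well a set $K$ in a normed space $X$ can be approximated by $n$-dimensional linear subspaces; the standard definition of the $n$-width is
$$ d_n(K, X) \;=\; \inf_{\dim V \le n}\ \sup_{f \in K}\ \inf_{g \in V}\ \|f - g\|_{X}, $$
where the outer infimum runs over linear subspaces $V \subset X$. Slow decay of $d_n$ as $n \to \infty$ indicates that the set is hard to capture by any fixed linear (for example kernel or random-feature) approximation scheme.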
- Large-width functional asymptotics for deep Gaussian neural networks [2.7561479348365734]
We consider fully connected feed-forward deep neural networks where weights and biases are independent and identically distributed according to Gaussian distributions.
Our results contribute to recent theoretical studies on the interplay between infinitely wide deep neural networks and Gaussian processes.
arXiv Detail & Related papers (2021-02-20T10:14:37Z)
- Kolmogorov Width Decay and Poor Approximators in Machine Learning: Shallow Neural Networks, Random Feature Models and Neural Tangent Kernels [8.160343645537106]
We establish a scale separation of Kolmogorov width type between subspaces of a given Banach space.
We show that reproducing kernel Hilbert spaces are poor $L^2$-approximators for the class of two-layer neural networks in high dimension.
arXiv Detail & Related papers (2020-05-21T17:40:38Z)
- A function space analysis of finite neural networks with insights from sampling theory [41.07083436560303]
We show that the function space generated by multi-layer networks with non-expansive activation functions is smooth.
Under the assumption that the input is band-limited, we provide novel error bounds.
We analyze both deterministic uniform and random sampling, showing the advantage of the former.
arXiv Detail & Related papers (2020-04-15T10:25:18Z)
- Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-layer Networks [70.15611146583068]
We develop exact representations of training two-layer neural networks with rectified linear units (ReLUs).
Our theory utilizes semi-infinite duality and minimum norm regularization (a schematic of the underlying training problem is sketched after this entry).
arXiv Detail & Related papers (2020-02-24T21:32:41Z)
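To make the last entry concrete, the kind of training problem that is shown to admit an exact convex reformulation can be sketched as the usual weight-decay-regularised two-layer ReLU objective (our schematic notation, scalar output and biases omitted; the precise formulation is in the cited paper):
$$ \min_{\{w_k, c_k\}}\ \frac{1}{2}\Bigl\| \sum_{k=1}^{m} c_k\,\bigl(X w_k\bigr)_+ - y \Bigr\|_2^2 \;+\; \frac{\beta}{2}\sum_{k=1}^{m}\bigl(\|w_k\|_2^2 + c_k^2\bigr), $$
where $X$ is the data matrix, $y$ the targets, $(\cdot)_+$ acts entrywise and $\beta > 0$ is the regularisation parameter; semi-infinite duality is then used to pass to an equivalent convex program.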
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.