Hypothesis Spaces for Deep Learning
- URL: http://arxiv.org/abs/2403.03353v2
- Date: Mon, 11 Mar 2024 14:37:42 GMT
- Title: Hypothesis Spaces for Deep Learning
- Authors: Rui Wang, Yuesheng Xu, Mingsong Yan
- Abstract summary: This paper introduces a hypothesis space for deep learning that employs deep neural networks (DNNs).
By treating a DNN as a function of two variables, we consider the primitive set of the DNNs for the parameter variable located in a set of the weight matrices and biases determined by a prescribed depth and widths of the DNNs.
We prove that the Banach space so constructed is a reproducing kernel Banach space (RKBS) and construct its reproducing kernel.
- Score: 7.695772976072261
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a hypothesis space for deep learning that employs deep
neural networks (DNNs). By treating a DNN as a function of two variables, the
physical variable and parameter variable, we consider the primitive set of the
DNNs for the parameter variable located in a set of the weight matrices and
biases determined by a prescribed depth and widths of the DNNs. We then
complete the linear span of the primitive DNN set in a weak* topology to
construct a Banach space of functions of the physical variable. We prove that
the Banach space so constructed is a reproducing kernel Banach space (RKBS) and
construct its reproducing kernel. We investigate two learning models,
regularized learning and the minimum interpolation problem, in the resulting RKBS,
by establishing representer theorems for solutions of the learning models. The
representer theorems reveal that solutions of these learning models can be
expressed as a linear combination of a finite number of kernel sessions
determined by the given data and the reproducing kernel.
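A compact way to read this construction, in illustrative notation of our own (the symbols below, including N, Theta, B, and K, are not taken from the paper):

```latex
% Illustrative notation only; the symbols are ours, not the paper's.
% A DNN viewed as a function N(x, theta) of the physical variable x and the
% parameter variable theta, with theta ranging over a set Theta of weight
% matrices and biases fixed by a prescribed depth and widths:
\[
  \mathcal{A} \;=\; \bigl\{\, \mathcal{N}(\cdot,\theta) : \theta \in \Theta \,\bigr\},
  \qquad
  \mathcal{B} \;=\; \overline{\operatorname{span}\,\mathcal{A}}^{\;w^{*}} .
\]
% The representer theorems then state that a solution of the regularized
% learning or minimum interpolation problem posed in the RKBS B takes the form
\[
  f^{\star} \;=\; \sum_{i=1}^{m} c_{i}\, \mathcal{K}(\cdot, x_{i}),
\]
% a linear combination of finitely many kernel sessions K(., x_i) determined by
% the data x_1, ..., x_m and the reproducing kernel K.
```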
Related papers
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
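As a hedged illustration of the objects discussed in the entry above, a width-n two-layer ReLU network with a norm constraint can be written as below; the path-norm-style constraint shown is one common example of a bounded-norm condition and is our choice, not necessarily the norm studied in that paper.

```latex
% Illustrative only: a width-n two-layer (one hidden layer) ReLU network
\[
  f_{\theta}(x) \;=\; \sum_{j=1}^{n} a_{j}\, \sigma\!\bigl( w_{j}^{\top} x + b_{j} \bigr),
  \qquad \sigma(t) = \max\{t, 0\},
\]
% with a norm constraint of path-norm type (one common choice; the paper may use another):
\[
  \sum_{j=1}^{n} |a_{j}| \,\bigl( \lVert w_{j} \rVert_{2} + |b_{j}| \bigr) \;\le\; R .
\]
```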
- Neural reproducing kernel Banach spaces and representer theorems for deep networks [16.279502878600184]
We show that deep neural networks define suitable reproducing kernel Banach spaces.
We derive representer theorems that justify the finite architectures commonly employed in applications.
arXiv Detail & Related papers (2024-03-13T17:51:02Z) - Sparse Representer Theorems for Learning in Reproducing Kernel Banach
- Sparse Representer Theorems for Learning in Reproducing Kernel Banach Spaces [7.695772976072261]
Sparsity of a learning solution is a desirable feature in machine learning.
Certain reproducing kernel Banach spaces (RKBSs) are appropriate hypothesis spaces for sparse learning methods.
arXiv Detail & Related papers (2023-05-21T22:36:32Z) - Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - Designing Universal Causal Deep Learning Models: The Case of
- Designing Universal Causal Deep Learning Models: The Case of Infinite-Dimensional Dynamical Systems from Stochastic Analysis [3.5450828190071655]
Causal operators (COs) play a central role in contemporary analysis.
There is still no canonical framework for designing Deep Learning (DL) models capable of approximating COs.
This paper proposes a "geometry-aware" solution to this open problem by introducing a DL model-design framework.
arXiv Detail & Related papers (2022-10-24T14:43:03Z) - NeuralEF: Deconstructing Kernels by Deep Neural Networks [47.54733625351363]
Traditional nonparametric solutions based on the Nyström formula suffer from scalability issues.
Recent work has resorted to a parametric approach, i.e., training neural networks to approximate the eigenfunctions.
We show that these problems can be fixed by using a new series of objective functions that generalize to both supervised and unsupervised learning problems.
arXiv Detail & Related papers (2022-04-30T05:31:07Z) - Analysis of Regularized Learning for Linear-functional Data in Banach
- Analysis of Regularized Learning for Linear-functional Data in Banach Spaces [3.160070867400839]
We study the whole theory of regularized learning for linear-functional data in Banach spaces.
We show the convergence of the approximate solutions to the exact solutions by the weak* topology of the Banach space.
The theorems of the regularized learning are applied to solve many problems of machine learning.
arXiv Detail & Related papers (2021-09-07T15:51:12Z)
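A generic form of the regularized learning problem for linear-functional data, written in our own placeholder notation (the loss L, regularizer phi, and functionals nu_i below are illustrative and not taken from that paper):

```latex
% Illustrative general form only.
% Data are values of continuous linear functionals nu_1, ..., nu_m on a Banach
% space B; a regularized solution balances data fidelity against a norm penalty:
\[
  \min_{f \in \mathcal{B}} \;
  \sum_{i=1}^{m} L\bigl( \nu_{i}(f),\, y_{i} \bigr)
  \;+\; \lambda\, \varphi\bigl( \lVert f \rVert_{\mathcal{B}} \bigr),
\]
% with convergence of approximate minimizers to exact ones studied in the
% weak* topology of B, as described in the entry above.
```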
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
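A small numerical sketch of the random architecture described in the entry above: ReLU features with standard Gaussian weights and uniformly distributed biases, followed by a plain perceptron on those features. The data set, width, bias range, and training loop are our own illustrative choices and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes on concentric circles: not linearly separable in the input space.
theta = rng.uniform(0, 2 * np.pi, size=200)
X = np.vstack([np.c_[np.cos(theta[:100]), np.sin(theta[:100])],
               3 * np.c_[np.cos(theta[100:]), np.sin(theta[100:])]])
y = np.array([1] * 100 + [-1] * 100)

# Random first layer: standard Gaussian weights, uniformly distributed biases.
width = 500
W = rng.standard_normal((width, 2))
b = rng.uniform(-3, 3, size=width)
H = np.maximum(W @ X.T + b[:, None], 0.0).T   # ReLU features, shape (200, width)

# Train only a linear separator (perceptron) on the random features.
w = np.zeros(width); bias = 0.0
for _ in range(200):
    for h, label in zip(H, y):
        if label * (w @ h + bias) <= 0:
            w += label * h
            bias += label

print("separated after random ReLU layer:", bool(np.all(y * (H @ w + bias) > 0)))
```

With enough random features, the circles that are not linearly separable in the input space typically become linearly separable in the feature space, which is the phenomenon the paper quantifies via its notion of mutual complexity.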
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm that our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.