Learning linear operators: Infinite-dimensional regression as a well-behaved non-compact inverse problem
- URL: http://arxiv.org/abs/2211.08875v3
- Date: Wed, 10 Jul 2024 12:08:27 GMT
- Title: Learning linear operators: Infinite-dimensional regression as a well-behaved non-compact inverse problem
- Authors: Mattes Mollenhauer, Nicole Mücke, T. J. Sullivan,
- Abstract summary: We consider the problem of learning a linear operator $theta$ between two Hilbert spaces from empirical observations.
We show that this goal can be reformulated as an inverse problem for $theta$ with the feature that its forward operator is generally non-compact.
We prove that this inverse problem is equivalent to the known compact inverse problem associated with scalar response regression derivation.
- Score: 4.503368323711748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of learning a linear operator $\theta$ between two Hilbert spaces from empirical observations, which we interpret as least squares regression in infinite dimensions. We show that this goal can be reformulated as an inverse problem for $\theta$ with the feature that its forward operator is generally non-compact (even if $\theta$ is assumed to be compact or of $p$-Schatten class). However, we prove that, in terms of spectral properties and regularisation theory, this inverse problem is equivalent to the known compact inverse problem associated with scalar response regression. Our framework allows for the elegant derivation of dimension-free rates for generic learning algorithms under H\"older-type source conditions. The proofs rely on the combination of techniques from kernel regression with recent results on concentration of measure for sub-exponential Hilbertian random variables. The obtained rates hold for a variety of practically-relevant scenarios in functional regression as well as nonlinear regression with operator-valued kernels and match those of classical kernel regression with scalar response.
Related papers
- Highly Adaptive Ridge [84.38107748875144]
We propose a regression method that achieves a $n-2/3$ dimension-free L2 convergence rate in the class of right-continuous functions with square-integrable sectional derivatives.
Har is exactly kernel ridge regression with a specific data-adaptive kernel based on a saturated zero-order tensor-product spline basis expansion.
We demonstrate empirical performance better than state-of-the-art algorithms for small datasets in particular.
arXiv Detail & Related papers (2024-10-03T17:06:06Z) - Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $sqrt n $-rate.
We stress that our estimators do not involve nonparametric function estimators and in particular do not rely on sample-size dependent parameters smoothing.
arXiv Detail & Related papers (2024-07-11T13:28:34Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over- parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Towards Optimal Sobolev Norm Rates for the Vector-Valued Regularized Least-Squares Algorithm [30.08981916090924]
We present the first optimal rates for infinite-dimensional vector-valued ridge regression on a continuous scale of norms that interpolate between $L$ and the hypothesis space.
We show that these rates are optimal in most cases and independent of the dimension of the output space.
arXiv Detail & Related papers (2023-12-12T11:48:56Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $mathcalF$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Dimension free ridge regression [10.434481202633458]
We revisit ridge regression on i.i.d. data in terms of the bias and variance of ridge regression in terms of the bias and variance of an equivalent' sequence model.
As a new application, we obtain a completely explicit and sharp characterization of ridge regression for Hilbert covariates with regularly varying spectrum.
arXiv Detail & Related papers (2022-10-16T16:01:05Z) - Experimental Design for Linear Functionals in Reproducing Kernel Hilbert
Spaces [102.08678737900541]
We provide algorithms for constructing bias-aware designs for linear functionals.
We derive non-asymptotic confidence sets for fixed and adaptive designs under sub-Gaussian noise.
arXiv Detail & Related papers (2022-05-26T20:56:25Z) - Robust Kernel-based Distribution Regression [13.426195476348955]
We study distribution regression (DR) which involves two stages of sampling, and aims at regressing from probability measures to real-valued responses over a kernel reproducing Hilbert space (RKHS)
By introducing a robust loss function $l_sigma$ for two-stage sampling problems, we present a novel robust distribution regression (RDR) scheme.
arXiv Detail & Related papers (2021-04-21T17:03:46Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_pbeta(mathcalX)$, $pgeq 2, beta>fracdp$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $beta> fracd2$ or $p=infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z) - Early stopping and polynomial smoothing in regression with reproducing kernels [2.0411082897313984]
We study the problem of early stopping for iterative learning algorithms in a reproducing kernel Hilbert space (RKHS)
We present a data-driven rule to perform early stopping without a validation set that is based on the so-called minimum discrepancy principle.
The proposed rule is proved to be minimax-optimal over different types of kernel spaces.
arXiv Detail & Related papers (2020-07-14T05:27:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.