A Bound on the Maximal Marginal Degrees of Freedom
- URL: http://arxiv.org/abs/2402.12885v2
- Date: Mon, 06 Jan 2025 12:33:24 GMT
- Title: A Bound on the Maximal Marginal Degrees of Freedom
- Authors: Paul Dommel,
- Abstract summary: This paper addresses low rank approximations and surrogates for kernel ridge regression.
We demonstrate that the computational cost of the most popular low rank approach, which is the Nystr\"om method, is almost linear in the sample size.
The result builds on a thorough theoretical analysis of the approximation of elementary kernel functions by elements in the range of the associated integral operator.
- Abstract: Kernel ridge regression, in general, is expensive in memory allocation and computation time. This paper addresses low rank approximations and surrogates for kernel ridge regression, which bridge these difficulties. The fundamental contribution of the paper is a lower bound on the minimal rank such that the prediction power of the approximation remains reliable. Based on this bound, we demonstrate that the computational cost of the most popular low rank approach, which is the Nystr\"om method, is almost linear in the sample size. This justifies the method from a theoretical point of view. Moreover, the paper provides a significant extension of the feasible choices of the regularization parameter. The result builds on a thorough theoretical analysis of the approximation of elementary kernel functions by elements in the range of the associated integral operator. We provide estimates of the approximation error and characterize the behavior of the norm of the underlying weight function.
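The following is a minimal NumPy sketch of the kind of Nystr\"om low rank surrogate for kernel ridge regression discussed in the abstract. It is not the paper's implementation; the Gaussian kernel, the uniform landmark sampling, and the choices of the subsample size m and regularization parameter lam are illustrative assumptions. Forming only the n x m and m x m kernel blocks reduces the fitting cost from the O(n^3) of exact kernel ridge regression to O(n m^2 + m^3), which is almost linear in n whenever m grows slowly with the sample size.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * bandwidth**2))

def nystroem_krr_fit(X, y, m, lam, seed=0):
    """Fit kernel ridge regression with a rank-m Nystroem surrogate.

    Only the n x m block K_nm and the m x m block K_mm are formed, so the
    cost is O(n m^2 + m^3) instead of the O(n^3) of exact kernel ridge regression.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)   # uniformly sampled landmark points
    K_nm = gaussian_kernel(X, X[idx])            # n x m
    K_mm = gaussian_kernel(X[idx], X[idx])       # m x m
    # Normal equations of the Nystroem (subset-of-regressors) surrogate:
    # (K_nm^T K_nm + lam * K_mm) alpha = K_nm^T y
    A = K_nm.T @ K_nm + lam * K_mm + 1e-10 * np.eye(m)
    alpha = np.linalg.solve(A, K_nm.T @ y)
    return X[idx], alpha

def nystroem_krr_predict(X_new, landmarks, alpha):
    """Evaluate the surrogate f(x) = sum_j alpha_j k(x, z_j)."""
    return gaussian_kernel(X_new, landmarks) @ alpha

# Toy usage: the rank m grows much more slowly than the sample size n.
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, size=(2000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(2000)
landmarks, alpha = nystroem_krr_fit(X, y, m=50, lam=1e-3)
print(nystroem_krr_predict(np.array([[0.5]]), landmarks, alpha))
```

The paper's bound concerns how large such a rank m must be for the surrogate to retain the prediction power of the exact estimator; the sketch above only illustrates the computational side.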
Related papers
- Gaussian Process Regression under Computational and Epistemic Misspecification [4.5656369638728656]
In large data applications, computational costs can be reduced using low-rank or sparse approximations of the kernel.
This paper investigates the effect of such kernel approximations on the element error.
arXiv Detail & Related papers (2023-12-14T18:53:32Z) - Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Statistical Optimality of Divide and Conquer Kernel-based Functional Linear Regression [1.7227952883644062]
This paper studies the convergence performance of divide-and-conquer estimators when the target function does not reside in the underlying kernel space.
As a decomposition-based scalable approach, the divide-and-conquer estimators of functional linear regression can substantially reduce the algorithmic complexities in time and memory.
arXiv Detail & Related papers (2022-11-20T12:29:06Z) - Nystr\"om Kernel Mean Embeddings [92.10208929236826]
We propose an efficient approximation procedure based on the Nystr\"om method.
It yields sufficient conditions on the subsample size to obtain the standard $n^{-1/2}$ rate.
We discuss applications of this result for the approximation of the maximum mean discrepancy and quadrature rules; a sketch of this Nystr\"om mean-embedding idea appears after the related papers list.
arXiv Detail & Related papers (2022-01-31T08:26:06Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process (MRP).
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z) - Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches for obtaining sparse principal direction loadings have been proposed; collectively they are termed sparse principal component analysis (SPCA).
We present thresholding as a provably accurate, polynomial-time approximation algorithm for the SPCA problem.
arXiv Detail & Related papers (2020-06-23T04:25:36Z) - SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic, exponentially fast decaying error bounds that apply to both the approximated kernel and the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z) - Minimax Semiparametric Learning With Approximate Sparsity [3.2116198597240846]
This paper is about the feasibility and means of root-n consistently estimating linear, mean-square continuous functionals of a high dimensional, approximately sparse regression.
We give lower bounds on the convergence rate of estimators of a regression slope and an average derivative.
We also give debiased machine learners that are root-n consistent under either a minimal approximate sparsity condition or rate double robustness.
arXiv Detail & Related papers (2019-12-27T16:13:21Z)
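As a companion to the Nystr\"om Kernel Mean Embeddings entry above, here is a minimal sketch, again under illustrative assumptions (Gaussian kernel, uniform subsampling, one shared landmark set for both samples), of approximating kernel mean embeddings from a small subsample and using them for a plug-in estimate of the maximum mean discrepancy. It is not the authors' implementation.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * bandwidth**2))

def nystroem_mean_embedding(X, landmarks):
    """Coefficients beta such that sum_j beta_j k(z_j, .) approximates the
    empirical kernel mean embedding (1/n) sum_i k(x_i, .)."""
    K_mm = gaussian_kernel(landmarks, landmarks)
    K_mn = gaussian_kernel(landmarks, X)
    # Projection of the empirical embedding onto span{k(z_j, .)}:
    # normal equations K_mm beta = (1/n) K_mn 1_n, solved in the least-squares sense.
    beta, *_ = np.linalg.lstsq(K_mm, K_mn.mean(axis=1), rcond=None)
    return beta

def nystroem_mmd(X, Y, m=100, seed=0):
    """Plug-in MMD estimate between samples X and Y using one shared landmark set."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([X, Y])
    Z = pooled[rng.choice(len(pooled), size=m, replace=False)]
    K_mm = gaussian_kernel(Z, Z)
    d = nystroem_mean_embedding(X, Z) - nystroem_mean_embedding(Y, Z)
    # Squared MMD is the squared RKHS norm of the difference of the two embeddings.
    return float(np.sqrt(max(d @ K_mm @ d, 0.0)))

# Toy usage: two Gaussian samples with shifted means.
rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(1000, 2))
Y = rng.normal(0.5, 1.0, size=(1000, 2))
print(nystroem_mmd(X, Y, m=100))
```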