Fast Statistical Leverage Score Approximation in Kernel Ridge Regression
- URL: http://arxiv.org/abs/2103.05238v1
- Date: Tue, 9 Mar 2021 05:57:08 GMT
- Title: Fast Statistical Leverage Score Approximation in Kernel Ridge Regression
- Authors: Yifan Chen, Yun Yang
- Abstract summary: Nyström approximation is a fast randomized method that solves kernel ridge regression (KRR) problems.
We propose a linear-time (modulo poly-log terms) algorithm that accurately approximates the statistical leverage scores in stationary-kernel-based KRR.
- Score: 12.258887270632869
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nyström approximation is a fast randomized method that rapidly solves
kernel ridge regression (KRR) problems through sub-sampling the n-by-n
empirical kernel matrix appearing in the objective function. However, the
performance of such a sub-sampling method heavily relies on correctly
estimating the statistical leverage scores for forming the sampling
distribution, which can be as costly as solving the original KRR. In this work,
we propose a linear time (modulo poly-log terms) algorithm to accurately
approximate the statistical leverage scores in the stationary-kernel-based KRR
with theoretical guarantees. Particularly, by analyzing the first-order
condition of the KRR objective, we derive an analytic formula, which depends on
both the input distribution and the spectral density of stationary kernels, for
capturing the non-uniformity of the statistical leverage scores. Numerical
experiments demonstrate that with the same prediction accuracy our method is
orders of magnitude more efficient than existing methods in selecting the
representative sub-samples in the Nyström approximation.
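To make the pipeline in the abstract concrete, here is a minimal numpy sketch of Nyström-approximated KRR with ridge-leverage-score sampling. The data, kernel bandwidth, and landmark count are hypothetical; the leverage scores are computed exactly in O(n^3) here, whereas the paper's contribution is a linear-time approximation of those same scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression problem (hypothetical example, not from the paper).
n, d, lam = 500, 2, 1e-2
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)

def rbf_kernel(A, B, gamma=1.0):
    """Stationary (RBF) kernel matrix between rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

K = rbf_kernel(X, X)

# Ridge leverage scores: diag(K (K + n*lam*I)^{-1}),
# computed exactly here; approximating them cheaply is the paper's goal.
scores = np.diag(K @ np.linalg.solve(K + n * lam * np.eye(n), np.eye(n)))
p = scores / scores.sum()

# Nystrom step: sample m landmark columns proportionally to the scores.
m = 50
idx = rng.choice(n, size=m, replace=False, p=p)
K_nm, K_mm = K[:, idx], K[np.ix_(idx, idx)]

# Solve the sub-sampled KRR system for the landmark coefficients.
alpha = np.linalg.solve(K_nm.T @ K_nm + n * lam * K_mm, K_nm.T @ y)
y_hat = K_nm @ alpha
print(float(np.mean((y_hat - y) ** 2)))
```

With well-estimated leverage scores, the sampling concentrates on the informative columns of K, which is why a cheap but accurate score approximation translates directly into speedups at the same prediction accuracy.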
Related papers
- Statistical Inference in Tensor Completion: Optimal Uncertainty Quantification and Statistical-to-Computational Gaps [7.174572371800217]
This paper presents a simple yet efficient method for statistical inference of tensor linear forms using incomplete and noisy observations.
It is suitable for various statistical inference tasks, including constructing confidence intervals, inference under heteroskedastic and sub-exponential noise, and simultaneous testing.
arXiv Detail & Related papers (2024-10-15T03:09:52Z)
- Inference in Randomized Least Squares and PCA via Normality of Quadratic Forms [19.616162116973637]
We develop a unified methodology for statistical inference via randomized sketching or projections.
The methodology applies to fixed datasets -- i.e., is data-conditional -- and the only randomness is due to the randomized algorithm.
arXiv Detail & Related papers (2024-04-01T04:35:44Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z)
- Coefficient-based Regularized Distribution Regression [4.21768682940933]
We consider coefficient-based regularized distribution regression, which aims to regress from probability measures to real-valued responses over a reproducing kernel Hilbert space (RKHS).
Asymptotic behaviors of the algorithm in different regularity ranges of the regression function are comprehensively studied.
We get the optimal rates under some mild conditions, which matches the one-stage sampled minimax optimal rate.
arXiv Detail & Related papers (2022-08-26T03:46:14Z)
- Probabilistic Estimation of Chirp Instantaneous Frequency Using Gaussian Processes [4.150253997298207]
We present a probabilistic approach for estimating a signal and its instantaneous frequency function when the true forms of the chirp and the instantaneous frequency are unknown.
Experiments show that the method outperforms a number of baseline methods on a synthetic model, and we also apply it to analyse gravitational wave data.
arXiv Detail & Related papers (2022-05-12T18:47:13Z)
- Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
- A Stochastic Newton Algorithm for Distributed Convex Optimization [62.20732134991661]
We analyze a Newton algorithm for homogeneous distributed convex optimization, where each machine can calculate gradients of the same population objective.
We show that our method can reduce the number, and frequency, of required communication rounds compared to existing methods without hurting performance.
arXiv Detail & Related papers (2021-10-07T17:51:10Z)
- SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
- Minimax Optimal Estimation of KL Divergence for Continuous Distributions [56.29748742084386]
Estimating the Kullback-Leibler divergence from independent and identically distributed samples is an important problem in various domains.
One simple and effective estimator is based on the k nearest neighbor between these samples.
arXiv Detail & Related papers (2020-02-26T16:37:37Z)
- Diversity sampling is an implicit regularization for kernel methods [13.136143245702915]
We show that Nyström kernel regression with diverse landmarks increases the accuracy of the regression in sparser regions of the dataset.
A greedy algorithm is also proposed to select diverse samples of significant size within large datasets when exact DPP sampling is not practically feasible.
arXiv Detail & Related papers (2020-02-20T08:24:42Z)
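The minimax KL-divergence entry above mentions the simple estimator based on k nearest neighbors between the two samples; here is a minimal numpy sketch of that classical baseline (a standard construction, not the paper's minimax-optimal variant), with sample sizes and dimension chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def knn_kl_divergence(x, y, k=5):
    """k-NN estimator of KL(P || Q) from samples x ~ P and y ~ Q:
    (d/n) * sum_i log(nu_k(i) / rho_k(i)) + log(m / (n - 1)),
    where rho_k is the k-th NN distance within x and nu_k within y."""
    n, m = len(x), len(y)
    d = x.shape[1]
    dxx = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(-1))
    dxy = np.sqrt(((x[:, None, :] - y[None, :, :]) ** 2).sum(-1))
    rho = np.sort(dxx, axis=1)[:, k]      # column 0 is the self-distance 0
    nu = np.sort(dxy, axis=1)[:, k - 1]   # no self-match against y
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1))

# Sanity check: identical distributions should give an estimate near zero.
x = rng.standard_normal((400, 1))
y = rng.standard_normal((400, 1))
est = knn_kl_divergence(x, y)
print(float(est))
```

The brute-force O(n^2) distance matrices keep the sketch self-contained; a KD-tree would be the idiomatic choice at larger sample sizes.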
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.