Robust, randomized preconditioning for kernel ridge regression
- URL: http://arxiv.org/abs/2304.12465v4
- Date: Wed, 10 Jul 2024 19:46:15 GMT
- Title: Robust, randomized preconditioning for kernel ridge regression
- Authors: Mateo Díaz, Ethan N. Epperly, Zachary Frangella, Joel A. Tropp, Robert J. Webber
- Abstract summary: This paper investigates two randomized preconditioning techniques for solving kernel ridge regression problems.
It introduces two new methods with state-of-the-art performance.
The proposed methods solve a broad range of KRR problems, making them ideal for practical applications.
- Score: 3.521877014965197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates two randomized preconditioning techniques for solving kernel ridge regression (KRR) problems with a medium to large number of data points ($10^4 \leq N \leq 10^7$), and it introduces two new methods with state-of-the-art performance. The first method, RPCholesky preconditioning, accurately solves the full-data KRR problem in $O(N^2)$ arithmetic operations, assuming sufficiently rapid polynomial decay of the kernel matrix eigenvalues. The second method, KRILL preconditioning, offers an accurate solution to a restricted version of the KRR problem involving $k \ll N$ selected data centers at a cost of $O((N + k^2) k \log k)$ operations. The proposed methods solve a broad range of KRR problems, making them ideal for practical applications.
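As a concrete illustration of the first method, here is a minimal sketch of the RPCholesky preconditioning idea (not the authors' implementation): a rank-$k$ randomly pivoted partial Cholesky factor of the kernel matrix is combined with the Woodbury identity to precondition conjugate gradient on $(K + \lambda I)\alpha = y$. The RBF kernel, the rank $k = 100$, and the hand-rolled PCG loop are illustrative choices, not taken from the paper:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    d2 = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def rpcholesky(K, k, rng):
    """Rank-k randomly pivoted partial Cholesky: returns F with K ~ F @ F.T."""
    n = K.shape[0]
    F = np.zeros((n, k))
    d = np.diag(K).copy()                     # diagonal of the residual matrix
    for i in range(k):
        p = rng.choice(n, p=d / d.sum())      # sample pivot ~ residual diagonal
        g = K[:, p] - F[:, :i] @ F[p, :i]     # residual column at the pivot
        F[:, i] = g / np.sqrt(max(g[p], 1e-12))
        d = np.maximum(d - F[:, i] ** 2, 0.0)
    return F

def pcg(matvec, b, precond, tol=1e-8, maxiter=500):
    """Preconditioned conjugate gradient for a symmetric positive definite system."""
    x = np.zeros_like(b)
    r = b.copy()
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxiter):
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = precond(r)
        rz, rz_old = r @ z, rz
        p = z + (rz / rz_old) * p
    return x

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(1000)
lam, k = 1e-3, 100

K = rbf_kernel(X, X)
F = rpcholesky(K, k, rng)

# Apply the preconditioner (F F^T + lam I)^{-1} via the Woodbury identity.
C = np.linalg.cholesky(lam * np.eye(k) + F.T @ F)
def precond(v):
    t = np.linalg.solve(C.T, np.linalg.solve(C, F.T @ v))
    return (v - F @ t) / lam

alpha = pcg(lambda v: K @ v + lam * v, y, precond)
print("relative residual:", np.linalg.norm(K @ alpha + lam * alpha - y) / np.linalg.norm(y))
```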
Related papers
- Have ASkotch: A Neat Solution for Large-scale Kernel Ridge Regression [16.836685923503868]
ASkotch is a scalable, accelerated, iterative method for full KRR that provably obtains linear convergence.
ASkotch outperforms state-of-the-art KRR solvers on a testbed of 23 large-scale KRR regression and classification tasks.
Our work opens up the possibility of as-yet-unimagined applications of full KRR across a number of disciplines.
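The summary does not spell out ASkotch's update rule, so the following is only a generic stand-in, not ASkotch itself: Nesterov-accelerated gradient descent on the full KRR system $(K + \lambda I)\alpha = y$, which exhibits the kind of linear convergence the paper proves. All problem sizes and constants are illustrative:

```python
# Generic illustration (not ASkotch): accelerated gradient descent on the
# strongly convex quadratic f(a) = 0.5 a^T (K + lam I) a - y^T a, whose
# minimizer solves the full KRR system (K + lam I) a = y.
import numpy as np

rng = np.random.default_rng(1)
n = 500
Q = rng.standard_normal((n, n))
K = Q @ Q.T / n                      # stand-in SPD "kernel" matrix
lam = 1e-1
y = rng.standard_normal(n)

A = K + lam * np.eye(n)
L = np.linalg.eigvalsh(A)[-1]        # largest eigenvalue (smoothness constant)
mu = lam                             # strong convexity is at least lam
kappa = L / mu
beta = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)

a = a_prev = np.zeros(n)
for t in range(300):
    v = a + beta * (a - a_prev)      # momentum extrapolation
    grad = A @ v - y
    a_prev, a = a, v - grad / L      # gradient step at the extrapolated point
print("residual norm:", np.linalg.norm(A @ a - y))
```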
arXiv Detail & Related papers (2024-07-14T04:11:10Z)
- Near-Optimal Algorithms for Gaussians with Huber Contamination: Mean Estimation and Linear Regression [44.13655983242414]
We design the first sample near-optimal and almost linear-time algorithms with optimal error guarantees.
For robust linear regression, we give the first algorithm with sample complexity $n = \tilde{O}(d/\epsilon^2)$ and almost linear runtime that approximates the target regressor within $\ell_2$-error $O(\epsilon)$.
This is the first sample near-optimal and almost linear-time algorithm achieving the optimal error guarantee, answering an open question in the literature.
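As a toy illustration of the Huber contamination model (emphatically not the paper's near-optimal algorithm), the sketch below replaces an $\epsilon$ fraction of Gaussian samples with gross outliers and compares the empirical mean against the coordinate-wise median:

```python
# Toy Huber-contamination experiment: a (1 - eps) fraction of points is drawn
# from N(mu, I) and an eps fraction is adversarial; the empirical mean breaks
# while even a simple coordinate-wise median degrades gracefully.
import numpy as np

rng = np.random.default_rng(2)
n, d, eps = 5000, 20, 0.1
mu = np.ones(d)

sample = rng.standard_normal((n, d)) + mu
n_bad = int(eps * n)
sample[:n_bad] = 100.0                     # gross outliers replace an eps fraction

print("mean   error:", np.linalg.norm(sample.mean(0) - mu))
print("median error:", np.linalg.norm(np.median(sample, 0) - mu))
```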
arXiv Detail & Related papers (2023-12-04T00:31:16Z)
- A Novel Normalized-Cut Solver with Nearest Neighbor Hierarchical Initialization [107.07093621337084]
Normalized-Cut (N-Cut) is a famous model of spectral clustering.
Traditional N-Cut solvers are two-stage: 1) calculating the continuous spectral embedding of the normalized Laplacian matrix; 2) discretization via $K$-means or spectral rotation.
We propose a novel N-Cut solver based on the famous coordinate descent method.
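For reference, here is a minimal sketch of the traditional two-stage pipeline described above, which the paper's coordinate-descent solver is designed to replace; the Gaussian affinity graph and toy data are illustrative assumptions:

```python
# Minimal two-stage N-Cut pipeline: 1) spectral embedding from the normalized
# Laplacian, 2) k-means discretization of the embedding.
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(3)
X = np.vstack([rng.standard_normal((50, 2)) + c for c in ([0, 0], [6, 0], [0, 6])])
k = 3

d2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
W = np.exp(-d2)                                  # dense Gaussian affinity graph
deg = W.sum(1)
D_inv_sqrt = 1.0 / np.sqrt(deg)
L_sym = np.eye(len(X)) - D_inv_sqrt[:, None] * W * D_inv_sqrt[None, :]

vals, vecs = np.linalg.eigh(L_sym)
U = vecs[:, :k]                                  # k smallest eigenvectors
U /= np.linalg.norm(U, axis=1, keepdims=True)    # row-normalize the embedding
_, labels = kmeans2(U, k, minit="++", seed=0)    # stage 2: k-means discretization
print("cluster sizes:", np.bincount(labels))
```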
arXiv Detail & Related papers (2023-11-26T07:11:58Z)
- A quantum central path algorithm for linear optimization [5.450016817940232]
We propose a novel quantum algorithm for solving linear optimization problems by quantum-mechanical simulation of the central path.
This approach yields an algorithm for solving linear optimization problems involving $m$ constraints and $n$ variables to $\varepsilon$-optimality.
In the standard gate model (i.e., without access to quantum RAM), our algorithm can obtain highly precise solutions to LO problems using at most $\mathcal{O}\left(\sqrt{m+n}\,\textsf{nnz}(A)\,\frac{R_1}{\varepsilon}\right)$ elementary gates.
arXiv Detail & Related papers (2023-11-07T13:26:20Z)
- Oracle Complexity Reduction for Model-free LQR: A Stochastic Variance-Reduced Policy Gradient Approach [4.422315636150272]
We investigate the problem of learning an $\epsilon$-approximate solution for the discrete-time Linear Quadratic Regulator (LQR) problem.
Our method combines both one-point and two-point estimations in a dual-loop variance-reduced algorithm.
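The one-point versus two-point distinction can be made concrete on a toy quadratic (a generic zeroth-order illustration, not the paper's LQR method); both estimators are nearly unbiased, but the two-point estimator has far lower variance:

```python
# Generic zeroth-order gradient estimators on f(x) = 0.5 ||x||^2:
# one-point smoothing needs one function evaluation per sample, two-point
# smoothing needs two but has much lower variance.
import numpy as np

rng = np.random.default_rng(4)
d, delta, trials = 10, 1e-2, 20000
x = np.ones(d)
f = lambda z: 0.5 * (z ** 2).sum()
true_grad = x

one_pt = np.zeros((trials, d))
two_pt = np.zeros((trials, d))
for t in range(trials):
    u = rng.standard_normal(d)
    one_pt[t] = (f(x + delta * u) / delta) * u                             # one eval
    two_pt[t] = ((f(x + delta * u) - f(x - delta * u)) / (2 * delta)) * u  # two evals

for name, g in [("one-point", one_pt), ("two-point", two_pt)]:
    print(name, "bias:", np.linalg.norm(g.mean(0) - true_grad),
          "avg variance:", g.var(0).mean())
```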
arXiv Detail & Related papers (2023-09-19T15:03:18Z)
- Convergence analysis of online algorithms for vector-valued kernel regression [0.42970700836450487]
We consider the problem of approximating the regression function from noisy vector-valued data by an online learning algorithm.
We show that the expected squared error in the RKHS norm can be bounded by $C^2 (m+1)^{-s/(2+s)}$, where $m$ is the current number of processed data points.
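A minimal sketch of an online RKHS regression update of the kind analyzed here, assuming an RBF kernel, $\mathbb{R}^2$-valued targets, and an illustrative decaying step size (the paper's scheme and error analysis are more delicate):

```python
# Online (stochastic-gradient) kernel regression with vector-valued targets:
# each new sample appends a kernel center with a residual-scaled coefficient,
# i.e. f <- f + eta_m * (y_m - f(x_m)) * k(x_m, .).
import numpy as np

rng = np.random.default_rng(5)
gamma, eta = 1.0, 0.5
k = lambda a, b: np.exp(-gamma * ((a - b) ** 2).sum(-1))

centers, coefs = [], []            # f(x) = sum_i k(x_i, x) * c_i with c_i in R^2
def predict(x):
    if not centers:
        return np.zeros(2)
    return k(np.array(centers), x) @ np.array(coefs)

err = []
for m in range(500):
    x = rng.uniform(-2, 2, size=1)
    y = np.array([np.sin(x[0]), np.cos(x[0])]) + 0.05 * rng.standard_normal(2)
    resid = y - predict(x)
    err.append((resid ** 2).sum())
    centers.append(x)                            # online update: new kernel center
    coefs.append(eta / np.sqrt(m + 1) * resid)   # decaying step size
print("mean squared error over last 100 steps:", np.mean(err[-100:]))
```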
arXiv Detail & Related papers (2023-09-14T15:10:47Z)
- Learning "best" kernels from data in Gaussian process regression. With application to aerodynamics [0.4588028371034406]
We introduce algorithms to select/design kernels in Gaussian process regression/kriging surrogate modeling techniques.
A first class of algorithms is kernel flow, which was introduced in the context of classification in machine learning.
A second class of algorithms is called spectral kernel ridge regression, and aims at selecting a "best" kernel such that the norm of the function to be approximated is minimal.
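The kernel-flow criterion can be sketched as a grid search: for each candidate kernel, measure the fraction of the RKHS norm lost when half the data is discarded, averaged over random halves, and keep the kernel that loses the least. The RBF parameterization and the grid below are illustrative assumptions; the papers' full algorithms (and the spectral variant) are richer:

```python
# Hedged sketch of the kernel-flow selection criterion over an RBF bandwidth:
# rho(gamma) = 1 - ||f_half||_H^2 / ||f_full||_H^2, averaged over random halves.
import numpy as np

rng = np.random.default_rng(6)
X = rng.uniform(-3, 3, size=(120, 1))
y = np.sin(2 * X[:, 0]) + 0.05 * rng.standard_normal(120)

def K(X1, X2, gamma):
    d2 = (X1[:, None, 0] - X2[None, :, 0]) ** 2
    return np.exp(-gamma * d2) + 1e-8 * (d2 == 0)   # small jitter on the diagonal

def rho(gamma, reps=20):
    full = y @ np.linalg.solve(K(X, X, gamma), y)   # squared RKHS norm, all data
    vals = []
    for _ in range(reps):
        s = rng.choice(len(X), size=len(X) // 2, replace=False)
        half = y[s] @ np.linalg.solve(K(X[s], X[s], gamma), y[s])
        vals.append(1.0 - half / full)
    return np.mean(vals)

gammas = np.logspace(-2, 2, 9)
print("selected RBF parameter gamma:", min(gammas, key=rho))
```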
arXiv Detail & Related papers (2022-06-03T07:50:54Z)
- Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming [53.63469275932989]
We consider online statistical inference of constrained nonlinear optimization problems.
We apply a Stochastic Sequential Quadratic Programming (StoSQP) method to solve these problems.
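The core of any SQP iteration is a linear solve against the KKT system; below is a minimal deterministic Newton-KKT step on a toy equality-constrained problem (the paper's StoSQP additionally randomizes the gradients and sketches the solves). The circle-constraint example is purely illustrative:

```python
# Minimal deterministic SQP/Newton-KKT iteration for min f(x) s.t. c(x) = 0:
# each step solves [[H, A^T], [A, 0]] [dx; dlam] = [-grad_L; -c].
import numpy as np

def f_grad(x):  return x - np.array([2.0, 0.0])                  # f = 0.5||x-(2,0)||^2
def c(x):       return np.array([x[0] ** 2 + x[1] ** 2 - 1.0])   # unit-circle constraint
def c_jac(x):   return np.array([[2 * x[0], 2 * x[1]]])

x, lam = np.array([0.5, 0.5]), np.zeros(1)
for _ in range(20):
    g, A = f_grad(x), c_jac(x)
    H = (1.0 + 2.0 * lam[0]) * np.eye(2)         # Hessian of the Lagrangian
    KKT = np.block([[H, A.T], [A, np.zeros((1, 1))]])
    rhs = np.concatenate([-g - A.T @ lam, -c(x)])
    step = np.linalg.solve(KKT, rhs)
    x, lam = x + step[:2], lam + step[2:]
print("x* ~", x, " (should approach (1, 0) on the unit circle)")
```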
arXiv Detail & Related papers (2022-05-27T00:34:03Z)
- Conditional Uncorrelation and Efficient Non-approximate Subset Selection in Sparse Regression [72.84177488527398]
We consider sparse regression from the view of correlation, and propose the formula of conditional uncorrelation.
By the proposed method, the computational complexity is reduced from $O(\frac{1}{6}k^3+mk^2+mkd)$ to $O(\frac{1}{6}k^3+\frac{1}{2}mk^2)$ for each candidate subset in sparse regression.
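For context, here is the brute-force baseline that such complexity reductions target: greedy forward selection that re-fits least squares for every candidate feature (a generic sketch, not the paper's conditional-uncorrelation method):

```python
# Generic greedy forward subset selection via repeated least-squares refits;
# this per-candidate refit cost is what faster selection formulas cut down.
import numpy as np

rng = np.random.default_rng(7)
m, d, k = 200, 30, 5
X = rng.standard_normal((m, d))
w_true = np.zeros(d); w_true[:3] = [3.0, -2.0, 1.5]
y = X @ w_true + 0.1 * rng.standard_normal(m)

S = []
for _ in range(k):
    best_j, best_rss = None, np.inf
    for j in range(d):
        if j in S:
            continue
        cols = S + [j]
        w, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        rss = ((y - X[:, cols] @ w) ** 2).sum()     # refit for each candidate
        if rss < best_rss:
            best_j, best_rss = j, rss
    S.append(best_j)
print("selected features:", sorted(S))
```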
arXiv Detail & Related papers (2020-09-08T20:32:26Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression, where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
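A toy version of this generative model with heavy-tailed noise shows why robustness matters; neither estimator below is the paper's, and scikit-learn's HuberRegressor is used purely for illustration:

```python
# Toy illustration of Y = <X, w*> + eps with heavy-tailed noise:
# ordinary least squares vs. a robust (Huber) fit.
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

rng = np.random.default_rng(8)
n, d = 1000, 10
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
eps = rng.standard_t(df=1.5, size=n)          # heavy-tailed noise
y = X @ w_star + eps

ols = LinearRegression().fit(X, y)
hub = HuberRegressor(max_iter=1000).fit(X, y)
print("OLS   error:", np.linalg.norm(ols.coef_ - w_star))
print("Huber error:", np.linalg.norm(hub.coef_ - w_star))
```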
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
- An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems.
Our studies demonstrate important applications in machine learning for improving the robustness of networks.
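A generic sketch of one DRO flavor (not necessarily the paper's formulation): with a KL-regularized uncertainty set, the worst-case distribution over sample losses has softmax weights, so each gradient step upweights the hardest examples. The temperature and step size below are illustrative:

```python
# Generic KL-regularized DRO gradient step: weight each example's gradient by
# softmax(loss_i / tau), the closed-form worst-case distribution.
import numpy as np

rng = np.random.default_rng(9)
n, d, tau, lr = 400, 5, 1.0, 0.1
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d))

theta = np.zeros(d)
for _ in range(200):
    margins = y * (X @ theta)
    losses = np.log1p(np.exp(-margins))          # per-example logistic loss
    w = np.exp((losses - losses.max()) / tau)    # numerically stable softmax
    w /= w.sum()                                 # worst-case example weights
    grad = X.T @ (w * y * (-1.0 / (1.0 + np.exp(margins))))
    theta -= lr * grad
print("DRO-weighted training loss:", losses @ w)
```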
arXiv Detail & Related papers (2020-06-17T20:19:25Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
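A minimal sketch-and-precondition least-squares solver using one of the two embeddings mentioned, a Gaussian projection: QR-factor the sketched matrix $SA$, then run LSQR on the re-conditioned operator $AR^{-1}$. The sketch size $s = 4n$ is a common heuristic, not the paper's prescription:

```python
# Sketch-and-precondition for min ||Ax - b||_2: a Gaussian embedding S makes
# SA a good proxy for A's column geometry, so R from SA = QR preconditions A.
import numpy as np
from scipy.linalg import qr, solve_triangular
from scipy.sparse.linalg import lsqr, LinearOperator

rng = np.random.default_rng(10)
m, n = 10000, 100
A = rng.standard_normal((m, n)) * np.logspace(0, 4, n)   # ill-conditioned columns
b = rng.standard_normal(m)

s = 4 * n                                      # sketch size (common heuristic)
S = rng.standard_normal((s, m)) / np.sqrt(s)   # Gaussian projection
_, R = qr(S @ A, mode="economic")              # SA = QR with R upper triangular

Aprec = LinearOperator(
    (m, n),
    matvec=lambda z: A @ solve_triangular(R, z),
    rmatvec=lambda w: solve_triangular(R, A.T @ w, trans="T"),
    dtype=float,
)
z = lsqr(Aprec, b, atol=1e-10, btol=1e-10)[0]  # LSQR on the preconditioned system
x = solve_triangular(R, z)                     # undo the change of variables
print("residual norm:", np.linalg.norm(A @ x - b))
```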
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.