Kernel Packet: An Exact and Scalable Algorithm for Gaussian Process Regression with Matérn Correlations
- URL: http://arxiv.org/abs/2203.03116v2
- Date: Wed, 9 Mar 2022 23:09:34 GMT
- Title: Kernel Packet: An Exact and Scalable Algorithm for Gaussian Process Regression with Matérn Correlations
- Authors: Haoyuan Chen, Liang Ding, Rui Tuo
- Abstract summary: We develop an exact and scalable algorithm for one-dimensional Gaussian process regression with Matérn correlations.
The proposed algorithm significantly outperforms existing alternatives in both computational time and predictive accuracy.
- Score: 23.560067934682294
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop an exact and scalable algorithm for one-dimensional Gaussian
process regression with Matérn correlations whose smoothness parameter $\nu$
is a half-integer. The proposed algorithm only requires $\mathcal{O}(\nu^3 n)$
operations and $\mathcal{O}(\nu n)$ storage. This leads to a linear-cost solver
since $\nu$ is fixed and typically very small in applications.
The proposed method can be applied to multi-dimensional problems if a full grid
or a sparse grid design is used. The proposed method is based on a novel theory
for Matérn correlation functions. We find that a suitable rearrangement of
these correlation functions can produce a compactly supported function, called
a "kernel packet". Using a set of kernel packets as basis functions leads to a
sparse representation of the covariance matrix that results in the proposed
algorithm. Simulation studies show that the proposed algorithm, when
applicable, is significantly superior to existing alternatives in both
computational time and predictive accuracy.
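To make the kernel-packet idea concrete, the sketch below checks the compact-support phenomenon numerically in the simplest case, $\nu = 1/2$ (the exponential kernel). Cancelling the left and right exponential tails imposes two linear conditions on the coefficients of three kernel translates, leaving a one-dimensional family of solutions whose combination vanishes identically outside the design points' span. This is an illustrative check under our own naming, not the paper's general construction for arbitrary half-integer $\nu$.

```python
# Minimal numerical check of the "kernel packet" compact-support property for
# the Matérn-1/2 (exponential) kernel. Names and point choices are illustrative.
import numpy as np

def matern12(x, y, ell=1.0):
    """Matérn-1/2 correlation: K(x, y) = exp(-|x - y| / ell)."""
    return np.exp(-np.abs(x - y) / ell)

ell = 1.0
xs = np.array([0.0, 0.7, 1.5])  # three consecutive design points

# Right tail (x > x3): K(x, x_j) = exp(-x/ell) * exp(x_j/ell), so it cancels
# iff sum_j a_j exp(x_j/ell) = 0; the left tail (x < x1) cancels iff
# sum_j a_j exp(-x_j/ell) = 0. Take a nonzero null-space vector of the 2x3 system.
A = np.vstack([np.exp(xs / ell), np.exp(-xs / ell)])
_, _, Vt = np.linalg.svd(A)
a = Vt[-1]  # null-space direction: A @ a ~= 0

# phi(x) = sum_j a_j K(x, x_j) should vanish outside [x1, x3].
grid = np.linspace(-3.0, 4.5, 2001)
phi = sum(a_j * matern12(grid, x_j, ell) for a_j, x_j in zip(a, xs))
outside = (grid < xs[0]) | (grid > xs[-1])
print("max |phi| outside [x1, x3]:", np.abs(phi[outside]).max())   # ~ 1e-16
print("max |phi| inside          :", np.abs(phi[~outside]).max())  # O(1)
```

Because each such compactly supported basis function overlaps only a few neighbors, the covariance matrix expressed in the kernel-packet basis is banded, and banded factorizations cost $\mathcal{O}(n)$; this is the source of the linear cost stated above.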
Related papers
- Inverting the Leverage Score Gradient: An Efficient Approximate Newton Method [10.742859956268655]
This paper aims to recover the intrinsic model parameters given the leverage score gradient (leverage scores are defined in a sketch after this list).
We specifically scrutinize the inversion of the leverage score gradient, denoted as $g(x)$.
arXiv Detail & Related papers (2024-08-21T01:39:42Z)
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to scale, and we demonstrate the efficiency and robustness of our methods on large-scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z)
- Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization [49.58290066287418]
We propose a novel estimator named Multi-block-Single-probe Variance Reduced (MSVR) to alleviate the complexity of coupled compositional problems.
Our results improve upon prior ones in several aspects, including the order of sample complexities and the dependence on the strong convexity parameter.
arXiv Detail & Related papers (2022-07-18T12:03:26Z)
- Optimal Gradient Sliding and its Application to Distributed Optimization Under Similarity [121.83085611327654]
We study structured convex optimization problems with additive objective $r := p + q$, where $r$ is $\mu$-strongly convex.
We propose a method that solves the problem via communication between the master and the agents together with local calls.
The resulting complexity is much sharper than the existing $\mathcal{O}(\sqrt{L_q/\mu})$ rate.
arXiv Detail & Related papers (2022-05-30T14:28:02Z)
- PROMPT: Parallel Iterative Algorithm for $\ell_{p}$ norm linear regression via Majorization Minimization with an application to semi-supervised graph learning [0.0]
We consider the problem of $\ell_p$ norm linear regression, which has several applications such as in sparse recovery, data clustering, and semi-supervised learning.
We propose an iterative algorithm: Parallel IteRative AlgOrithM for $\ell_p$ norm regression via MajorizaTion Minimization (PROMPT); a generic majorization-minimization step for $\ell_p$ regression is sketched after this list.
arXiv Detail & Related papers (2021-10-23T10:19:11Z)
- Hutch++: Optimal Stochastic Trace Estimation [75.45968495410048]
We introduce a new randomized algorithm, Hutch++, which computes a $(1 \pm \epsilon)$ approximation to $\operatorname{tr}(A)$ for any positive semidefinite (PSD) matrix $A$.
We show that it significantly outperforms Hutchinson's method in experiments; a minimal sketch of the estimator appears after this list.
arXiv Detail & Related papers (2020-10-19T16:45:37Z)
- Maximizing Determinants under Matroid Constraints [69.25768526213689]
Given vectors $v_i$ and a matroid $M$, we study the problem of finding a basis $S$ of $M$ such that $\det(\sum_{i \in S} v_i v_i^\top)$ is maximized.
This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning.
arXiv Detail & Related papers (2020-04-16T19:16:38Z)
- Efficient algorithms for multivariate shape-constrained convex regression problems [9.281671380673306]
We prove that the least squares estimator is computable via solving a constrained convex quadratic programming (QP) problem with $(n+1)d$ variables and at least $n(n-1)$ linear inequality constraints.
For solving the generally very large-scale convex QP, we design two efficient algorithms: one is the symmetric Gauss-Seidel based alternating direction method of multipliers (sGS-ADMM), and the other is the proximal augmented Lagrangian method (pALM) with the subproblems solved by the semismooth Newton method.
arXiv Detail & Related papers (2020-02-26T11:18:43Z)
- Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives [10.291482850329892]
Mixed integer programming (MIP) can be used to solve (to optimality) $\ell_0$-regularized regression problems.
We propose two classes of scalable algorithms: an exact algorithm that can handle $p \approx 50{,}000$ features in a few minutes, and approximate algorithms that can address instances with $p \approx 10^6$.
In addition, we present new estimation error bounds for $\ell_0$-regularized estimators.
arXiv Detail & Related papers (2020-01-17T18:47:02Z)
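For reference on the first related paper: the leverage scores of a matrix $A$ are the diagonal entries of the hat matrix $A(A^\top A)^{-1}A^\top$. A minimal sketch follows (the function name is ours; the paper itself concerns the gradient of these scores, which is not reproduced here).

```python
# Leverage scores via thin QR: the hat matrix equals Q Q^T, so the i-th score
# is the squared norm of the i-th row of Q. Function name is illustrative.
import numpy as np

def leverage_scores(A):
    Q, _ = np.linalg.qr(A)        # thin QR, A = Q R with Q of shape (m, n)
    return np.sum(Q * Q, axis=1)  # diag(Q Q^T), one score per row of A
```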
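The PROMPT entry rests on majorization-minimization for $\ell_p$ regression. The sketch below shows the generic MM (IRLS-type) step for $1 < p \le 2$, where $|r|^p$ is majorized at the current residual by a quadratic, so each iteration reduces to a weighted least-squares solve. This is a hedged illustration of the underlying principle, not the parallel PROMPT scheme itself; all names are ours.

```python
# Generic MM / IRLS iteration for min_x ||y - A x||_p^p with 1 < p <= 2.
# Each step solves the weighted normal equations induced by the quadratic
# majorizer (p/2)|r0|^(p-2) r^2 + const of |r|^p at the current residual r0.
import numpy as np

def lp_regression_mm(A, y, p=1.5, iters=50, eps=1e-8):
    x = np.linalg.lstsq(A, y, rcond=None)[0]     # p = 2 warm start
    for _ in range(iters):
        r = y - A @ x
        w = (np.abs(r) + eps) ** (p - 2)         # majorizer weights (eps guards r ~ 0)
        Aw = A * w[:, None]                      # W A with W = diag(w)
        x = np.linalg.solve(A.T @ Aw, Aw.T @ y)  # A^T W A x = A^T W y
    return x
```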
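The Hutch++ entry also admits a compact sketch: take the trace of a low-rank sketch of $A$ exactly, then apply Hutchinson's estimator only to the deflated residual, whose variance is much smaller. The version below assumes access to $A$ through matrix-vector products; names and the matvec budget split are illustrative.

```python
# Hutch++-style trace estimator for a PSD matrix accessed via matvecs.
import numpy as np

def hutchpp(matvec, n, m, seed=None):
    """Estimate tr(A) with roughly m matvecs: exact trace on a rank-k sketch
    plus Hutchinson's estimator on the deflated residual."""
    rng = np.random.default_rng(seed)
    k = m // 3
    S = rng.choice([-1.0, 1.0], size=(n, k))   # Rademacher sketch vectors
    Q, _ = np.linalg.qr(matvec(S))             # orthonormal basis for range(A S)
    trace_low = np.trace(Q.T @ matvec(Q))      # exact part: tr(Q^T A Q)
    G = rng.choice([-1.0, 1.0], size=(n, k))
    G = G - Q @ (Q.T @ G)                      # deflate: (I - Q Q^T) G
    trace_res = np.trace(G.T @ matvec(G)) / k  # Hutchinson on the residual
    return trace_low + trace_res

# Usage on an explicit PSD matrix:
rng = np.random.default_rng(0)
B = rng.standard_normal((300, 300))
A = B @ B.T
print(hutchpp(lambda X: A @ X, n=300, m=90, seed=1), np.trace(A))
```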
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.