Scalable Gaussian-process regression and variable selection using
Vecchia approximations
- Date: Fri, 25 Feb 2022 21:22:38 GMT
- Title: Scalable Gaussian-process regression and variable selection using
Vecchia approximations
- Authors: Jian Cao, Joseph Guinness, Marc G. Genton, Matthias Katzfuss
- Abstract summary: Vecchia-based mini-batch subsampling provides unbiased gradient estimators.
- Score: 3.4163060063961255
- Abstract: Gaussian process (GP) regression is a flexible, nonparametric approach to
regression that naturally quantifies uncertainty. In many applications, the
numbers of responses and covariates are both large, and a goal is to select
covariates that are related to the response. For this setting, we propose a
novel, scalable algorithm, coined VGPR, which optimizes a penalized GP
log-likelihood based on the Vecchia GP approximation, an ordered conditional
approximation from spatial statistics that implies a sparse Cholesky factor of
the precision matrix. We traverse the regularization path from strong to weak
penalization, sequentially adding candidate covariates based on the gradient of
the log-likelihood and deselecting irrelevant covariates via a new quadratic
constrained coordinate descent algorithm. We propose Vecchia-based mini-batch
subsampling, which provides unbiased gradient estimators. The resulting
procedure is scalable to millions of responses and thousands of covariates.
Theoretical analysis and numerical studies demonstrate the improved scalability
and accuracy relative to existing methods.
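The two ingredients named in the abstract, the ordered conditional (Vecchia) approximation and mini-batch subsampling, can be illustrated with a minimal NumPy sketch. It is not the authors' VGPR implementation. The kernel choice, the Euclidean nearest-neighbor conditioning sets, and every function name below are assumptions made for illustration. The Vecchia log-likelihood is a sum of one conditional Gaussian term per response, so a uniformly drawn batch of terms, rescaled by n / |batch|, has the full sum as its expectation. Differentiation is linear, so the same rescaling applies to the gradient.

```python
import numpy as np

def ard_kernel(A, B, relevance):
    """Squared-exponential kernel; relevance[j] == 0 switches covariate j off."""
    diff = (A[:, None, :] - B[None, :, :]) * relevance
    return np.exp(-0.5 * np.sum(diff ** 2, axis=-1))

def conditioning_sets(X, m):
    """For each i, indices of up to m nearest points among those ordered before i."""
    sets = []
    for i in range(X.shape[0]):
        dist = np.sum((X[:i] - X[i]) ** 2, axis=1)  # empty for i == 0
        sets.append(np.argsort(dist)[:m])
    return sets

def conditional_logpdf(i, X, y, sets, relevance, noise):
    """log p(y_i | y_c(i)) under a zero-mean GP with Gaussian noise."""
    c = sets[i]
    k_ii = ard_kernel(X[i:i + 1], X[i:i + 1], relevance)[0, 0] + noise
    if c.size == 0:
        mean, var = 0.0, k_ii
    else:
        K_cc = ard_kernel(X[c], X[c], relevance) + noise * np.eye(c.size)
        k_ci = ard_kernel(X[c], X[i:i + 1], relevance)[:, 0]
        w = np.linalg.solve(K_cc, k_ci)
        mean = w @ y[c]
        var = k_ii - w @ k_ci
    return -0.5 * (np.log(2.0 * np.pi * var) + (y[i] - mean) ** 2 / var)

def vecchia_loglik(X, y, sets, relevance, noise):
    """Vecchia approximation: sum of the n conditional log-densities."""
    return sum(conditional_logpdf(i, X, y, sets, relevance, noise)
               for i in range(len(y)))

def minibatch_loglik(batch, X, y, sets, relevance, noise):
    """Rescaled batch sum; unbiased for vecchia_loglik under uniform sampling."""
    scale = len(y) / len(batch)
    return scale * sum(conditional_logpdf(i, X, y, sets, relevance, noise)
                       for i in batch)

# Toy data (hypothetical): 30 responses, 3 covariates, the third one irrelevant.
rng = np.random.default_rng(0)
n, d, m = 30, 3, 5
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
relevance = np.array([1.0, 0.5, 0.0])
noise = 0.1

sets = conditioning_sets(X, m)
full = vecchia_loglik(X, y, sets, relevance, noise)
batch = rng.choice(n, size=8, replace=False)
estimate = minibatch_loglik(batch, X, y, sets, relevance, noise)
print(full, estimate)
```

Each term touches at most m conditioning points, so one batch evaluation costs on the order of |batch| · m³ operations regardless of n. When m = n - 1 every point conditions on all earlier points, and the sum reduces to the exact GP log-likelihood.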
Related papers
- Robust Stochastic Optimization via Gradient Quantile Clipping [6.2844649973308835]
We introduce a quantile clipping strategy for Stochastic Gradient Descent (SGD).
The strategy uses quantiles of the gradient norm as clipping thresholds.
We propose an implementation of the algorithm using rolling quantiles.
arXiv Detail & Related papers (2023-09-29T15:24:48Z) - Variational sparse inverse Cholesky approximation for latent Gaussian
processes via double Kullback-Leibler minimization [6.012173616364571]
We combine a variational approximation of the posterior with a similar and efficient SIC-restricted Kullback-Leibler-optimal approximation of the prior.
For this setting, our variational approximation can be computed via gradient descent in polylogarithmic time per iteration.
We provide numerical comparisons showing that the proposed double-Kullback-Leibler-optimal Gaussian-process approximation (DKLGP) can sometimes be vastly more accurate for stationary kernels than alternative approaches.
arXiv Detail & Related papers (2023-01-30T21:50:08Z) - Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector
Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Robust Regression Revisited: Acceleration and Improved Estimation Rates [25.54653340884806]
We study fast algorithms for statistical regression problems under the strong contamination model.
The goal is to approximately optimize a generalized linear model (GLM) given adversarially corrupted samples.
We present nearly-linear time algorithms for robust regression problems with improved runtime or estimation guarantees.
arXiv Detail & Related papers (2021-06-22T17:21:56Z) - Scalable Variational Gaussian Processes via Harmonic Kernel
Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order information.
We show that, with a careful design of coordinate importance sampling, the proposed ZO optimization method is efficient in terms of both iteration complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Robust regression with covariate filtering: Heavy tails and adversarial
contamination [6.939768185086755]
We show how to modify the Huber regression, least trimmed squares, and least absolute deviation estimators to obtain estimators that are simultaneously computationally and statistically efficient in the stronger contamination model.
We show that the Huber regression estimator achieves near-optimal error rates in this setting, whereas the least trimmed squares and least absolute deviation estimators can be made to achieve near-optimal error after applying a postprocessing step.
arXiv Detail & Related papers (2020-09-27T22:48:48Z) - Fast OSCAR and OWL Regression via Safe Screening Rules [97.28167655721766]
Ordered weighted $L_1$ (OWL) regularized regression is a new regression analysis for high-dimensional sparse learning.
Proximal gradient methods are used as standard approaches to solve OWL regression.
We propose the first safe screening rule for OWL regression by exploring the order of the primal solution with the unknown order structure.
arXiv Detail & Related papers (2020-06-29T23:35:53Z) - Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for stochastic non-convex optimization.
On two popular deep learning tasks, we demonstrate empirical advantages over standard gradient methods.
arXiv Detail & Related papers (2020-06-24T06:01:24Z) - Robust Gaussian Process Regression with a Bias Model [0.6850683267295248]
Most existing approaches replace an outlier-prone Gaussian likelihood with a non-Gaussian likelihood induced from a heavy tail distribution.
The proposed approach models an outlier as a noisy and biased observation of an unknown regression function.
Conditioned on the bias estimates, the robust GP regression can be reduced to a standard GP regression problem.
arXiv Detail & Related papers (2020-01-14T06:21:51Z)