Consistent Online Gaussian Process Regression Without the Sample
Complexity Bottleneck
- URL: http://arxiv.org/abs/2004.11094v2
- Date: Thu, 15 Jul 2021 15:49:51 GMT
- Title: Consistent Online Gaussian Process Regression Without the Sample
Complexity Bottleneck
- Authors: Alec Koppel, Hrusikesha Pradhan, Ketan Rajawat
- Abstract summary: We propose an online compression scheme that fixes an error neighborhood with respect to the Hellinger metric centered at the current posterior.
For constant error radius, POG converges to a neighborhood of the population posterior (Theorem 1(ii)), but with finite memory determined at worst by the metric entropy of the feature space.
- Score: 14.309243378538012
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gaussian processes provide a framework for nonlinear nonparametric Bayesian
inference widely applicable across science and engineering. Unfortunately,
their computational burden scales cubically with the training sample size,
which, in the case that samples arrive in perpetuity, approaches infinity. This
issue necessitates approximations for use with streaming data, which to date
mostly lack convergence guarantees. Thus, we develop the first online Gaussian
process approximation that preserves convergence to the population posterior,
i.e., asymptotic posterior consistency, while ameliorating its intractable
complexity growth with the sample size. We propose an online compression scheme
that, following each a posteriori update, fixes an error neighborhood with
respect to the Hellinger metric centered at the current posterior, and greedily
tosses out past kernel dictionary elements until its boundary is hit. We call
the resulting method Parsimonious Online Gaussian Processes (POG). For
diminishing error radius, exact asymptotic consistency is preserved (Theorem
1(i)) at the cost of unbounded memory in the limit. On the other hand, for
constant error radius, POG converges to a neighborhood of the population
posterior (Theorem 1(ii)), but with finite memory determined at worst by the
metric entropy of the feature space (Theorem 2). Experimental results are
presented on several nonlinear regression problems, which illuminate the merits
of this approach as compared with alternatives that fix the subspace dimension
defining the history of past points.
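
To make the compression rule concrete, the following is a minimal sketch of an online GP whose dictionary is greedily pruned inside a Hellinger ball around the current predictive posterior, in the spirit of POG. The RBF kernel, the choice to evaluate the Hellinger metric only at the newest input, and all names and parameters (rbf, gp_predict, compress, eps, noise, lengthscale) are illustrative assumptions, not the authors' exact construction.

```python
# Illustrative sketch only: a POG-style online GP with Hellinger-ball
# compression. The kernel, reference point, and parameters are assumptions.
import numpy as np

def rbf(X, Z, lengthscale=1.0):
    """Squared-exponential kernel matrix between rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict(X_dict, y_dict, x_star, noise=1e-2):
    """Predictive mean/variance at x_star given the current dictionary."""
    K = rbf(X_dict, X_dict) + noise * np.eye(len(X_dict))
    k = rbf(x_star[None, :], X_dict)[0]
    mu = k @ np.linalg.solve(K, y_dict)
    var = 1.0 + noise - k @ np.linalg.solve(K, k)
    return mu, max(var, 1e-12)

def hellinger_gauss(m1, v1, m2, v2):
    """Closed-form Hellinger distance between two univariate Gaussians."""
    h2 = 1.0 - np.sqrt(2.0 * np.sqrt(v1 * v2) / (v1 + v2)) \
         * np.exp(-0.25 * (m1 - m2) ** 2 / (v1 + v2))
    return np.sqrt(max(h2, 0.0))

def compress(X_dict, y_dict, x_ref, eps):
    """Greedily drop dictionary points while the predictive posterior at x_ref
    stays inside a Hellinger ball of radius eps around the uncompressed one."""
    mu0, v0 = gp_predict(X_dict, y_dict, x_ref)
    keep, changed = list(range(len(X_dict))), True
    while changed and len(keep) > 1:
        changed = False
        for i in list(keep):
            trial = [j for j in keep if j != i]
            mu, v = gp_predict(X_dict[trial], y_dict[trial], x_ref)
            if hellinger_gauss(mu, v, mu0, v0) <= eps:
                keep, changed = trial, True   # removal stays within the ball
                break
    return X_dict[keep], y_dict[keep]

# Streaming loop: posterior update, then compression after each new sample.
rng = np.random.default_rng(0)
X, y = np.empty((0, 1)), np.empty(0)
for t in range(200):
    x_t = rng.uniform(-3, 3, size=(1, 1))
    y_t = np.sin(2 * x_t[0, 0]) + 0.1 * rng.standard_normal()
    X, y = np.vstack([X, x_t]), np.append(y, y_t)
    X, y = compress(X, y, x_t[0], eps=0.05)  # constant radius -> bounded memory
print("dictionary size after 200 samples:", len(X))
```

With a constant radius eps the dictionary size plateaus, while letting eps shrink over time trades memory for exactness, mirroring the two regimes of Theorem 1 described above.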
Related papers
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Riemannian Laplace Approximation with the Fisher Metric [5.982697037000189]
Laplace's method approximates a target density with a Gaussian distribution at its mode.
For complex targets and finite-data posteriors it is often too crude an approximation.
We develop two alternative variants that are exact at the limit of infinite data.
arXiv Detail & Related papers (2023-11-05T20:51:03Z) - Implicit Manifold Gaussian Process Regression [49.0787777751317]
Gaussian process regression is widely used to provide well-calibrated uncertainty estimates.
It struggles with high-dimensional data because it does not exploit the implicit low-dimensional manifold on which the data actually lies.
In this paper we propose a technique capable of inferring implicit structure directly from data (labeled and unlabeled) in a fully differentiable way.
arXiv Detail & Related papers (2023-10-30T09:52:48Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Limited-Memory Greedy Quasi-Newton Method with Non-asymptotic
Superlinear Convergence Rate [37.49160762239869]
We present a Limited-memory Greedy BFGS (LG-BFGS) method that can achieve an explicit non-asymptotic superlinear rate.
Our established non-asymptotic superlinear convergence rate demonstrates an explicit trade-off between the convergence speed and memory requirement.
arXiv Detail & Related papers (2023-06-27T12:59:56Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Posterior and Computational Uncertainty in Gaussian Processes [52.26904059556759]
Gaussian processes scale prohibitively with the size of the dataset.
Many approximation methods have been developed, which inevitably introduce approximation error.
This additional source of uncertainty, due to limited computation, is entirely ignored when using the approximate posterior.
We develop a new class of methods that provides consistent estimation of the combined uncertainty arising from both the finite number of data observed and the finite amount of computation expended.
arXiv Detail & Related papers (2022-05-30T22:16:25Z) - Laplace Matching for fast Approximate Inference in Generalized Linear
Models [27.70274403550477]
We propose an approximate inference framework primarily designed to be computationally cheap while still achieving high approximation quality.
The concept, which we call Laplace Matching, involves closed-form, approximate, bi-directional transformations between the parameter spaces of exponential families.
This effectively turns inference in GLMs into conjugate inference (with small approximation errors).
arXiv Detail & Related papers (2021-05-07T08:25:17Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_p^\beta(\mathcal{X})$, $p \geq 2$, $\beta > \frac{d}{p}$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta > \frac{d}{2}$ or $p = \infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z) - Provably Convergent Working Set Algorithm for Non-Convex Regularized
Regression [0.0]
This paper proposes a working set algorithm for non-convex regularizers with convergence guarantees.
Our results demonstrate significant gains over the full-problem solver for both block-coordinate and gradient solvers.
arXiv Detail & Related papers (2020-06-24T07:40:31Z)