Deep Sigma Point Processes
- URL: http://arxiv.org/abs/2002.09112v2
- Date: Sat, 26 Dec 2020 17:27:19 GMT
- Title: Deep Sigma Point Processes
- Authors: Martin Jankowiak, Geoff Pleiss, Jacob R. Gardner
- Abstract summary: We introduce a class of parametric models inspired by the compositional structure of Deep Gaussian Processes (DGPs).
Deep Sigma Point Processes (DSPPs) retain many of the attractive features of (variational) DGPs, including mini-batch training and predictive uncertainty that is controlled by kernel basis functions.
- Score: 22.5396672566053
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce Deep Sigma Point Processes, a class of parametric models
inspired by the compositional structure of Deep Gaussian Processes (DGPs). Deep
Sigma Point Processes (DSPPs) retain many of the attractive features of
(variational) DGPs, including mini-batch training and predictive uncertainty
that is controlled by kernel basis functions. Importantly, since DSPPs admit a
simple maximum likelihood inference procedure, the resulting predictive
distributions are not degraded by any posterior approximations. In an extensive
empirical comparison on univariate and multivariate regression tasks we find
that the resulting predictive distributions are significantly better calibrated
than those obtained with other probabilistic methods for scalable regression,
including variational DGPs--often by as much as a nat per datapoint.
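
To make the construction concrete, below is a minimal NumPy sketch of the idea as described in the abstract: each layer produces a mean and variance built from kernel basis functions, intermediate uncertainty is propagated through a fixed set of quadrature (sigma) points, and the resulting predictive is a mixture of Gaussians whose log-density serves as the maximum-likelihood training objective. This is an illustrative toy, not the authors' implementation; all names, sizes, and the use of fixed Gauss-Hermite weights are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_features(x, centers, lengthscale=1.0):
    """Kernel basis functions k(x, z_m) against inducing-point-like centers z_m."""
    d2 = (x[:, None] - centers[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale ** 2)

class Layer:
    """Parametric 'GP-like' layer: mean and variance are linear in kernel features."""
    def __init__(self, n_centers):
        self.centers = rng.normal(size=n_centers)
        self.w_mean = rng.normal(size=n_centers) / np.sqrt(n_centers)
        self.w_var = rng.normal(size=n_centers) / np.sqrt(n_centers)

    def __call__(self, x):
        phi = rbf_features(x, self.centers)
        mean = phi @ self.w_mean
        var = np.log1p(np.exp(phi @ self.w_var))  # softplus keeps variances positive
        return mean, var

# Gauss-Hermite sigma points and weights for a standard normal; kept fixed here
# for simplicity.
xi, w = np.polynomial.hermite_e.hermegauss(8)
w = w / w.sum()

layer1, layer2 = Layer(16), Layer(16)

def log_likelihood(x, y, noise_var=0.1):
    """Per-datapoint log-density of a mixture of Gaussians, one per sigma point."""
    m1, v1 = layer1(x)                                      # first-layer predictive
    z = m1[:, None] + np.sqrt(v1)[:, None] * xi[None, :]    # sigma points per input
    m2, v2 = layer2(z.ravel())
    m2, v2 = m2.reshape(z.shape), v2.reshape(z.shape)
    comp = -0.5 * ((y[:, None] - m2) ** 2 / (v2 + noise_var)
                   + np.log(2 * np.pi * (v2 + noise_var)))
    c = comp.max(axis=1, keepdims=True)                     # stabilized log-mixture
    return np.mean(np.log(np.exp(comp - c) @ w) + c[:, 0])

x = rng.uniform(-2.0, 2.0, size=128)                        # toy mini-batch
y = np.sin(2 * x) + 0.1 * rng.normal(size=128)
print("mini-batch log-likelihood:", log_likelihood(x, y))
```

Because the objective is an ordinary log-likelihood evaluated per datapoint, it can be maximized with mini-batch gradient ascent without any posterior approximation entering the predictive distribution.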
Related papers
- Amortized Variational Inference for Deep Gaussian Processes [0.0]
Deep Gaussian processes (DGPs) are multilayer generalizations of Gaussian processes (GPs).
We introduce amortized variational inference for DGPs, which learns an inference function that maps each observation to variational parameters.
Our method performs similarly or better than previous approaches at less computational cost.
arXiv Detail & Related papers (2024-09-18T20:23:27Z)
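
A hedged sketch of the amortization idea in the entry above: rather than learning free per-datapoint variational parameters, a shared network maps each observation to its variational mean and scale. The architecture, dimensions, and names are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class InferenceNet(nn.Module):
    """Amortized inference: observations -> variational parameters."""
    def __init__(self, x_dim, latent_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 2 * latent_dim),  # mean and log-scale
        )

    def forward(self, x):
        mean, log_scale = self.net(x).chunk(2, dim=-1)
        return mean, log_scale.exp()

net = InferenceNet(x_dim=1, latent_dim=2)
x = torch.randn(32, 1)                           # a mini-batch of inputs
mean, scale = net(x)                             # per-datapoint variational parameters
sample = mean + scale * torch.randn_like(scale)  # reparameterized draw for an ELBO
```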
- Deep Horseshoe Gaussian Processes [1.0742675209112622]
We introduce the deep Horseshoe Gaussian process (Deep-HGP), a new simple prior based on deep Gaussian processes with a squared-exponential kernel.
We show that the associated tempered posterior distribution recovers the unknown true regression curve optimally in terms of quadratic loss, up to a logarithmic factor.
arXiv Detail & Related papers (2024-03-04T05:30:43Z)
- Neural Operator Variational Inference based on Regularized Stein Discrepancy for Deep Gaussian Processes [23.87733307119697]
We introduce Neural Operator Variational Inference (NOVI) for Deep Gaussian Processes.
NOVI uses a neural generator to obtain a sampler and minimizes the Regularized Stein Discrepancy in L2 space between the generated distribution and true posterior.
We demonstrate that the bias introduced by our method can be controlled by multiplying the divergence with a constant, which leads to robust error control and ensures the stability and precision of the algorithm.
arXiv Detail & Related papers (2023-09-22T06:56:35Z)
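
For the entry above, the central quantity is a Stein discrepancy between samples from a neural generator and the true posterior. Below is a hedged NumPy sketch of the standard kernelized Stein discrepancy with an RBF kernel, as a stand-in for the paper's regularized L2 variant; the function name and bandwidth choice are illustrative.

```python
import numpy as np

def ksd(samples, score_fn, bandwidth=1.0):
    """Simple (biased, V-statistic) estimate of the kernelized Stein discrepancy."""
    x = np.asarray(samples)                 # (n, d) samples from the generator
    s = score_fn(x)                         # (n, d) target scores grad_x log p(x)
    n, d = x.shape
    diff = x[:, None, :] - x[None, :, :]    # (n, n, d) pairwise differences
    sq = (diff ** 2).sum(-1)                # (n, n) squared distances
    k = np.exp(-0.5 * sq / bandwidth ** 2)  # RBF kernel matrix
    # The four terms of the Stein kernel u_p(x_i, x_j).
    t1 = (s @ s.T) * k
    t2 = np.einsum('id,ijd->ij', s, diff) / bandwidth ** 2 * k
    t3 = -np.einsum('jd,ijd->ij', s, diff) / bandwidth ** 2 * k
    t4 = (d / bandwidth ** 2 - sq / bandwidth ** 4) * k
    return (t1 + t2 + t3 + t4).mean()

# Example: generator samples vs. a standard normal target, whose score is -x.
rng = np.random.default_rng(0)
print(ksd(rng.normal(size=(256, 2)), score_fn=lambda x: -x))
```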
- Heterogeneous Multi-Task Gaussian Cox Processes [61.67344039414193]
We present a novel extension of multi-task Gaussian Cox processes for modeling heterogeneous correlated tasks jointly.
A multi-output Gaussian process (MOGP) prior over the parameters of the dedicated likelihoods for classification, regression and point process tasks can facilitate sharing of information between heterogeneous tasks.
We derive a mean-field approximation to realize closed-form iterative updates for estimating model parameters.
arXiv Detail & Related papers (2023-08-29T15:01:01Z)
- Deep Variational Implicit Processes [0.0]
Implicit processes (IPs) are a generalization of Gaussian processes (GPs).
We propose a multi-layer generalization of IPs called the Deep Variational Implicit process (DVIP)
arXiv Detail & Related papers (2022-06-14T10:04:41Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Gaussian Process Inference Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits [21.353189917487512]
Stochastic gradient descent (SGD) and its variants have established themselves as the go-to algorithms for machine learning problems.
We take a step forward by proving minibatch SGD converges to a critical point of the full log-likelihood loss function.
Our theoretical guarantees hold provided that the kernel functions exhibit exponential or eigendecay.
arXiv Detail & Related papers (2021-11-19T22:28:47Z)
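
A minimal PyTorch sketch of the setting studied in the entry above, under the assumption that mini-batch SGD on the GP log-likelihood means optimizing kernel hyperparameters with SGD on the exact marginal log-likelihood of random mini-batches; the data, sizes, and learning rate are illustrative.

```python
import torch

torch.manual_seed(0)
x = torch.linspace(-3, 3, 2000).unsqueeze(-1)
y = torch.sin(2 * x).squeeze(-1) + 0.1 * torch.randn(2000)

log_ls = torch.zeros((), requires_grad=True)         # log lengthscale
log_noise = torch.tensor(-2.0, requires_grad=True)   # log noise variance
opt = torch.optim.SGD([log_ls, log_noise], lr=0.01)

def minibatch_nll(xb, yb):
    """Negative marginal log-likelihood of a mini-batch under an RBF-kernel GP."""
    d2 = (xb - xb.T) ** 2
    K = torch.exp(-0.5 * d2 / log_ls.exp() ** 2)
    K = K + log_noise.exp() * torch.eye(len(xb))
    dist = torch.distributions.MultivariateNormal(torch.zeros(len(xb)), K)
    return -dist.log_prob(yb) / len(xb)

for step in range(200):
    idx = torch.randint(0, len(x), (64,))             # random mini-batch
    loss = minibatch_nll(x[idx], y[idx])
    opt.zero_grad()
    loss.backward()
    opt.step()

print("lengthscale:", log_ls.exp().item(), "noise:", log_noise.exp().item())
```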
- Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
- Incremental Ensemble Gaussian Processes [53.3291389385672]
We propose an incremental ensemble (IE-) GP framework, where an EGP meta-learner employs an ensemble of GP learners, each having a unique kernel belonging to a prescribed kernel dictionary.
With each GP expert leveraging the random feature-based approximation to perform online prediction and model update with scalability, the EGP meta-learner capitalizes on data-adaptive weights to synthesize the per-expert predictions.
The novel IE-GP is generalized to accommodate time-varying functions by modeling structured dynamics at the EGP meta-learner and within each GP learner.
arXiv Detail & Related papers (2021-10-13T15:11:25Z)
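
A hedged NumPy sketch of the ensemble mechanism described in the entry above: a small dictionary of RBF lengthscales defines the experts, each expert is a random-feature Bayesian linear model updated online, and ensemble weights track each expert's running predictive likelihood. All sizes, names, and the specific weighting rule are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_feat, noise = 100, 0.05

class RFFExpert:
    """GP expert approximated with random Fourier features (1-D inputs)."""
    def __init__(self, lengthscale):
        self.omega = rng.normal(scale=1.0 / lengthscale, size=n_feat)
        self.phase = rng.uniform(0, 2 * np.pi, size=n_feat)
        self.P = np.eye(n_feat)        # posterior covariance of the feature weights
        self.m = np.zeros(n_feat)      # posterior mean of the feature weights

    def features(self, x):
        return np.sqrt(2.0 / n_feat) * np.cos(self.omega * x + self.phase)

    def predict(self, x):
        phi = self.features(x)
        return phi @ self.m, phi @ self.P @ phi + noise

    def update(self, x, y):            # rank-one recursive Bayesian update
        phi = self.features(x)
        Pphi = self.P @ phi
        gain = Pphi / (phi @ Pphi + noise)
        self.m = self.m + gain * (y - phi @ self.m)
        self.P = self.P - np.outer(gain, Pphi)

experts = [RFFExpert(ls) for ls in (0.2, 1.0, 5.0)]   # kernel dictionary
log_w = np.zeros(len(experts))                        # ensemble log-weights

for t in range(500):                                  # streaming data
    x = rng.uniform(-3, 3)
    y = np.sin(2 * x) + np.sqrt(noise) * rng.normal()
    preds = [e.predict(x) for e in experts]
    # Reward each expert by its predictive log-likelihood of the new point.
    log_w += np.array([-0.5 * ((y - m) ** 2 / v + np.log(2 * np.pi * v))
                       for m, v in preds])
    log_w -= log_w.max()
    for e in experts:
        e.update(x, y)

w = np.exp(log_w); w /= w.sum()
print("ensemble weights per lengthscale:", dict(zip((0.2, 1.0, 5.0), w.round(3))))
```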
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- On Signal-to-Noise Ratio Issues in Variational Inference for Deep Gaussian Processes [55.62520135103578]
We show that the gradient estimates used in training Deep Gaussian Processes (DGPs) with importance-weighted variational inference are susceptible to signal-to-noise ratio (SNR) issues.
We show that our fix can lead to consistent improvements in the predictive performance of DGP models.
arXiv Detail & Related papers (2020-11-01T14:38:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.