Conditional Deep Gaussian Processes: empirical Bayes hyperdata learning
- URL: http://arxiv.org/abs/2110.00568v1
- Date: Fri, 1 Oct 2021 17:50:48 GMT
- Title: Conditional Deep Gaussian Processes: empirical Bayes hyperdata learning
- Authors: Chi-Ken Lu and Patrick Shafto
- Abstract summary: We propose a conditional Deep Gaussian Process (DGP) in which the intermediate GPs in hierarchical composition are supported by the hyperdata.
We show the equivalence with deep kernel learning in the limit of dense hyperdata in latent space.
Preliminary extrapolation results demonstrate the expressive power of the proposed model compared with GP kernel composition, DGP variational inference, and deep kernel learning.
- Score: 6.599344783327054
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is desirable to combine the expressive power of deep learning with
Gaussian Processes (GPs) in a single expressive Bayesian learning model. Deep kernel
learning, proposed in [1], showed success in adopting a deep network for feature
extraction followed by a GP as the function model. Recently, [2] suggested that the
deterministic nature of the feature extractor may lead to overfitting, while replacing
it with a Bayesian network appeared to cure the problem. Here, we propose the
conditional Deep Gaussian Process (DGP), in which the intermediate GPs in the
hierarchical composition are supported by hyperdata and the exposed GP remains zero
mean. Motivated by the inducing points in sparse GPs, the hyperdata also play the role
of function supports, but they are hyperparameters rather than random variables. We use
the moment matching method [3] to approximate the marginal prior of the conditional DGP
with a GP carrying an effective kernel. Thus, as in empirical Bayes, the hyperdata are
learned by optimizing the approximate marginal likelihood, which depends on the
hyperdata implicitly through the kernel. We show the equivalence with deep kernel
learning in the limit of dense hyperdata in latent space; however, the conditional DGP
and the corresponding approximate inference enjoy the benefit of being more Bayesian
than deep kernel learning. Preliminary extrapolation results demonstrate the expressive
power of the proposed model compared with GP kernel composition, DGP variational
inference, and deep kernel learning. We also address the non-Gaussian aspects of our
model as well as a way of upgrading to full Bayesian inference.
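To make the construction concrete, below is a minimal numpy sketch (not the authors' implementation) of a two-layer conditional DGP: the inner GP is conditioned on hyperdata (Z, U) treated as plain hyperparameters, the zero-mean squared-exponential outer GP is marginalized by moment matching into an effective kernel, and the hyperdata outputs are then learned by maximizing the approximate marginal likelihood, as in empirical Bayes. All function names, kernel choices, and the toy data are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of a two-layer conditional DGP with a
# moment-matched effective kernel and empirical-Bayes learning of the hyperdata.
# Assumptions: 1-D inputs, 1-D latent layer, squared-exponential kernels,
# hyperdata (Z, U) treated as hyperparameters of the inner GP.
import numpy as np
from scipy.optimize import minimize

def rbf(a, b, ls=1.0, var=1.0):
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / ls) ** 2)

def inner_moments(x, Z, U, ls=1.0, var=1.0, jitter=1e-6):
    """Mean and covariance of the inner GP conditioned on hyperdata (Z, U)."""
    Kzz = rbf(Z, Z, ls, var) + jitter * np.eye(len(Z))
    Kxz = rbf(x, Z, ls, var)
    A = np.linalg.solve(Kzz, Kxz.T).T            # K_xZ K_ZZ^{-1}
    mu = A @ U                                   # conditional mean
    cov = rbf(x, x, ls, var) - A @ Kxz.T         # conditional covariance
    return mu, cov

def effective_kernel(x, Z, U, ls_out=1.0, var_out=1.0):
    """Moment-matched kernel: E[k_out(g(x), g(x'))] under the inner conditional GP."""
    mu, cov = inner_moments(x, Z, U)
    v = np.diag(cov)
    s2 = v[:, None] + v[None, :] - 2.0 * cov     # Var[g(x) - g(x')]
    m2 = (mu[:, None] - mu[None, :]) ** 2        # squared mean difference
    denom = ls_out ** 2 + s2
    return var_out * ls_out / np.sqrt(denom) * np.exp(-0.5 * m2 / denom)

def neg_log_marglik(U, x, y, Z, noise=1e-2):
    """Negative approximate log marginal likelihood (up to a constant)."""
    K = effective_kernel(x, Z, U) + noise * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum()

# Toy empirical-Bayes step: learn the hyperdata outputs U for fixed locations Z.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 40)
y = np.sin(2 * x) * np.exp(-0.1 * x ** 2) + 0.05 * rng.standard_normal(40)
Z = np.linspace(-3, 3, 8)                        # hyperdata locations
U0 = rng.standard_normal(8)                      # initial hyperdata outputs
res = minimize(neg_log_marglik, U0, args=(x, y, Z), method="L-BFGS-B")
print("optimized hyperdata outputs:", np.round(res.x, 2))
```

The closed-form expectation used for the squared-exponential outer kernel is the standard Gaussian integral. Note that as the hyperdata become dense in the latent space the inner conditional variance collapses, and the effective kernel reduces to a kernel on a deterministic warping of the input, which is the stated equivalence with deep kernel learning.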
Related papers
- Thin and Deep Gaussian Processes [43.22976185646409]
This work proposes a novel synthesis of both previous approaches: Thin and Deep GP (TDGP)
We show with theoretical and experimental results that i) TDGP is tailored to specifically discover lower-dimensional manifolds in the input data, ii) TDGP behaves well when increasing the number of layers, and iii) TDGP performs well in standard benchmark datasets.
arXiv Detail & Related papers (2023-10-17T18:50:24Z) - Linear Time GPs for Inferring Latent Trajectories from Neural Spike
Trains [7.936841911281107]
We propose cvHM, a general inference framework for latent GP models leveraging Hida-Matérn kernels and conjugate variational inference (CVI)
We are able to perform variational inference of latent neural trajectories with linear time complexity for arbitrary likelihoods.
arXiv Detail & Related papers (2023-06-01T16:31:36Z) - Interactive Segmentation as Gaussian Process Classification [58.44673380545409]
Click-based interactive segmentation (IS) aims to extract the target objects under user interaction.
Most of the current deep learning (DL)-based methods mainly follow the general pipelines of semantic segmentation.
We propose to formulate the IS task as a Gaussian process (GP)-based pixel-wise binary classification model on each image.
arXiv Detail & Related papers (2023-02-28T14:01:01Z) - Shallow and Deep Nonparametric Convolutions for Gaussian Processes [0.0]
We introduce a nonparametric process convolution formulation for GPs that alleviates weaknesses by using a functional sampling approach.
We propose a composition of these nonparametric convolutions that serves as an alternative to classic deep GP models.
arXiv Detail & Related papers (2022-06-17T19:03:04Z) - Surrogate modeling for Bayesian optimization beyond a single Gaussian
process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space.
To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model.
To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret.
arXiv Detail & Related papers (2022-05-27T16:43:10Z) - Robust and Adaptive Temporal-Difference Learning Using An Ensemble of
Gaussian Processes [70.80716221080118]
The paper takes a generative perspective on policy evaluation via temporal-difference (TD) learning.
The OS-GPTD approach is developed to estimate the value function for a given policy by observing a sequence of state-reward pairs.
To alleviate the limited expressiveness associated with a single fixed kernel, a weighted ensemble (E) of GP priors is employed to yield an alternative scheme.
arXiv Detail & Related papers (2021-12-01T23:15:09Z) - Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z) - Incremental Ensemble Gaussian Processes [53.3291389385672]
We propose an incremental ensemble (IE-) GP framework, where an EGP meta-learner employs an ensemble of GP learners, each having a unique kernel belonging to a prescribed kernel dictionary.
With each GP expert leveraging the random feature-based approximation to perform online prediction and model update with scalability, the EGP meta-learner capitalizes on data-adaptive weights to synthesize the per-expert predictions.
The novel IE-GP is generalized to accommodate time-varying functions by modeling structured dynamics at the EGP meta-learner and within each GP learner.
arXiv Detail & Related papers (2021-10-13T15:11:25Z) - GP-Tree: A Gaussian Process Classifier for Few-Shot Incremental Learning [23.83961717568121]
GP-Tree is a novel method for multi-class classification with Gaussian processes and deep kernel learning.
We develop a tree-based hierarchical model in which each internal node fits a GP to the data.
Our method scales well with both the number of classes and data size.
arXiv Detail & Related papers (2021-02-15T22:16:27Z) - SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for
Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z) - Conditional Deep Gaussian Processes: multi-fidelity kernel learning [6.599344783327053]
We propose the conditional DGP model in which the latent GPs are directly supported by the fixed lower fidelity data.
Experiments with synthetic and high dimensional data show comparable performance against other multi-fidelity regression methods.
We conclude that, with the low-fidelity data and the hierarchical DGP structure, the effective kernel encodes the inductive bias for the true function (a short usage note follows this list).
arXiv Detail & Related papers (2020-02-07T14:56:11Z)
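Connecting back to the sketch after the main abstract: in this companion multi-fidelity setting the hyperdata are not free hyperparameters but are fixed to the low-fidelity observations. A hedged illustration, assuming the earlier sketch (numpy, `x`, and `effective_kernel`) is in scope and using toy low-fidelity data:

```python
# Hypothetical multi-fidelity usage of the sketch above: the hyperdata are
# pinned to fixed low-fidelity observations rather than optimized.
x_lo = np.linspace(-3, 3, 8)                  # toy low-fidelity inputs
y_lo = np.sin(2 * x_lo)                       # toy low-fidelity targets
K_mf = effective_kernel(x, Z=x_lo, U=y_lo)    # effective kernel for the exposed GP
```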