Conditional Deep Gaussian Processes: multi-fidelity kernel learning
- URL: http://arxiv.org/abs/2002.02826v3
- Date: Fri, 1 Oct 2021 18:03:07 GMT
- Title: Conditional Deep Gaussian Processes: multi-fidelity kernel learning
- Authors: Chi-Ken Lu, Patrick Shafto
- Abstract summary: We propose the conditional DGP model in which the latent GPs are directly supported by the fixed lower fidelity data.
Experiments with synthetic and high dimensional data show comparable performance against other multi-fidelity regression methods.
We conclude that, with the low fidelity data and the hierarchical DGP structure, the effective kernel encodes the inductive bias for the true function.
- Score: 6.599344783327053
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Gaussian Processes (DGPs) were proposed as an expressive Bayesian model
capable of a mathematically grounded estimation of uncertainty. The
expressivity of DGPs results not only from their compositional character but also
from the distribution propagation within the hierarchy. Recently, [1] pointed out
that the hierarchical structure of DGPs is well suited to modeling multi-fidelity
regression, in which one is provided sparse observations with high precision
and plenty of low fidelity observations. We propose the conditional DGP model
in which the latent GPs are directly supported by the fixed lower fidelity
data. Then the moment matching method in [2] is applied to approximate the
marginal prior of conditional DGP with a GP. The obtained effective kernels are
implicit functions of the lower-fidelity data, manifesting the expressivity
contributed by distribution propagation within the hierarchy. The
hyperparameters are learned via optimizing the approximate marginal likelihood.
Experiments with synthetic and high dimensional data show comparable
performance against other multi-fidelity regression methods, variational
inference, and multi-output GP. We conclude that, with the low fidelity data
and the hierarchical DGP structure, the effective kernel encodes the inductive
bias for the true function, allowing the compositional freedom discussed in [3,4].
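The paper collapses the DGP hierarchy into an effective GP kernel via moment matching; as a much simpler illustration of the underlying idea (a latent GP directly supported by fixed, plentiful low-fidelity data, feeding a second GP fit on sparse high-fidelity data), here is a hypothetical numpy sketch. The toy functions, lengthscales, and data are invented for illustration, and the composition below (using the layer-one posterior mean as an extra input feature) is a stand-in, not the paper's moment-matching method.

```python
import numpy as np

def rbf(X1, X2, ls=0.2):
    # Squared-exponential kernel k(x, x') = exp(-||x - x'||^2 / (2 ls^2)).
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2.0 * X1 @ X2.T
    return np.exp(-0.5 * np.maximum(d2, 0.0) / ls**2)

def gp_mean(X, y, X_new, ls=0.2, noise=1e-4):
    # Posterior mean of a zero-mean GP conditioned on (X, y).
    K = rbf(X, X, ls) + noise * np.eye(len(X))
    return rbf(X_new, X, ls) @ np.linalg.solve(K, y)

# Plentiful low-fidelity data (a cheap approximation of the target) ...
X_lo = np.linspace(0.0, 1.0, 40)[:, None]
y_lo = np.sin(6.0 * X_lo).ravel()
# ... and sparse, precise high-fidelity data from the true function.
X_hi = np.array([[0.1], [0.4], [0.7], [0.95]])
true_f = lambda x: np.sin(6.0 * x) + 0.2 * x
y_hi = true_f(X_hi).ravel()

# Layer 1: latent GP supported by the fixed low-fidelity data.
X_test = np.linspace(0.0, 1.0, 100)[:, None]
f_lo_hi = gp_mean(X_lo, y_lo, X_hi)
f_lo_test = gp_mean(X_lo, y_lo, X_test)

# Layer 2: GP over the augmented input [x, f_lo(x)] -- the hierarchical
# composition through which the low-fidelity data shapes the prediction.
Z_hi = np.hstack([X_hi, f_lo_hi[:, None]])
Z_test = np.hstack([X_test, f_lo_test[:, None]])
y_pred = gp_mean(Z_hi, y_hi, Z_test)
```

The low-fidelity posterior enters the second layer as an input, so the effective covariance between two test points depends implicitly on the low-fidelity data, mirroring the paper's observation that the effective kernel is an implicit function of the lower-fidelity observations.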
Related papers
- Hyperboloid GPLVM for Discovering Continuous Hierarchies via Nonparametric Estimation [41.13597666007784]
Dimensionality reduction (DR) offers a useful representation of complex high-dimensional data.
Recent DR methods focus on hyperbolic geometry to derive a faithful low-dimensional representation of hierarchical data.
This paper presents hGP-LVMs to embed high-dimensional hierarchical data with implicit continuity via nonparametric estimation.
arXiv Detail & Related papers (2024-10-22T05:07:30Z)
- Thin and Deep Gaussian Processes [43.22976185646409]
This work proposes a novel synthesis of both previous approaches: Thin and Deep GP (TDGP)
We show with theoretical and experimental results that i) TDGP is tailored to specifically discover lower-dimensional manifolds in the input data, ii) TDGP behaves well when increasing the number of layers, and iii) TDGP performs well on standard benchmark datasets.
arXiv Detail & Related papers (2023-10-17T18:50:24Z)
- Heterogeneous Multi-Task Gaussian Cox Processes [61.67344039414193]
We present a novel extension of multi-task Gaussian Cox processes for modeling heterogeneous correlated tasks jointly.
A MOGP prior over the parameters of the dedicated likelihoods for classification, regression and point process tasks can facilitate sharing of information between heterogeneous tasks.
We derive a mean-field approximation to realize closed-form iterative updates for estimating model parameters.
arXiv Detail & Related papers (2023-08-29T15:01:01Z)
- Robust and Adaptive Temporal-Difference Learning Using An Ensemble of Gaussian Processes [70.80716221080118]
The paper takes a generative perspective on policy evaluation via temporal-difference (TD) learning.
The OS-GPTD approach is developed to estimate the value function for a given policy by observing a sequence of state-reward pairs.
To alleviate the limited expressiveness associated with a single fixed kernel, a weighted ensemble (E) of GP priors is employed to yield an alternative scheme.
arXiv Detail & Related papers (2021-12-01T23:15:09Z)
- Gaussian Process Inference Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits [21.353189917487512]
Stochastic gradient descent (SGD) and its variants have established themselves as the go-to algorithms for machine learning problems.
We take a step forward by proving that minibatch SGD converges to a critical point of the full log-likelihood loss function.
Our theoretical guarantees hold provided that the kernel functions exhibit exponential or eigendecay.
arXiv Detail & Related papers (2021-11-19T22:28:47Z)
- Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
- On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD setting.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z)
- Incremental Ensemble Gaussian Processes [53.3291389385672]
We propose an incremental ensemble (IE-) GP framework, where an EGP meta-learner employs an ensemble of GP learners, each having a unique kernel belonging to a prescribed kernel dictionary.
With each GP expert leveraging the random feature-based approximation to perform scalable online prediction and model updates, the EGP meta-learner capitalizes on data-adaptive weights to synthesize the per-expert predictions.
The novel IE-GP is generalized to accommodate time-varying functions by modeling structured dynamics at the EGP meta-learner and within each GP learner.
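The summary above describes the ensemble weighting only at a high level. As a hypothetical numpy sketch of one common data-adaptive scheme, the following credits each GP expert (one per dictionary kernel) by its online predictive log-likelihood of each incoming observation; this is a generic Bayesian-model-averaging-style update, not the paper's exact IE-GP algorithm, and the lengthscale dictionary and toy data stream are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(a, b, ls):
    # 1-D squared-exponential kernel matrix between point sets a and b.
    return np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ls**2)

def gp_predict(X, y, x, ls, noise=0.05):
    # GP posterior mean and variance at a scalar input x.
    K = rbf(X, X, ls) + noise * np.eye(len(X))
    k = rbf(np.array([x]), X, ls).ravel()
    mean = k @ np.linalg.solve(K, y)
    var = 1.0 + noise - k @ np.linalg.solve(K, k)
    return mean, max(var, 1e-9)

# Prescribed kernel dictionary: one lengthscale per GP expert.
lengthscales = [0.05, 0.3, 1.5]
log_w = np.zeros(len(lengthscales))      # uniform weights initially

true_f = lambda x: np.sin(4.0 * x)
X_seen, y_seen = [], []
for x in rng.uniform(0.0, 1.0, 30):      # streaming observations
    y = true_f(x) + 0.05 * rng.normal()
    if X_seen:
        Xa, ya = np.array(X_seen), np.array(y_seen)
        for i, ls in enumerate(lengthscales):
            m, v = gp_predict(Xa, ya, x, ls)
            # Data-adaptive weighting: accumulate each expert's
            # predictive log-likelihood of the incoming observation.
            log_w[i] += -0.5 * np.log(2.0 * np.pi * v) - 0.5 * (y - m)**2 / v
    X_seen.append(x)
    y_seen.append(y)

weights = np.exp(log_w - log_w.max())
weights /= weights.sum()                 # normalized ensemble weights
```

An ensemble prediction would then be the weight-averaged per-expert posterior, so experts whose kernels match the data increasingly dominate as observations stream in.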
arXiv Detail & Related papers (2021-10-13T15:11:25Z)
- Conditional Deep Gaussian Processes: empirical Bayes hyperdata learning [6.599344783327054]
We propose a conditional Deep Gaussian Process (DGP) in which the intermediate GPs in hierarchical composition are supported by the hyperdata.
We show the equivalence with the deep kernel learning in the limit of dense hyperdata in latent space.
Preliminary extrapolation results demonstrate the expressive power of the proposed model compared with GP kernel composition, DGP variational inference, and deep kernel learning.
arXiv Detail & Related papers (2021-10-01T17:50:48Z)
- Convolutional Normalizing Flows for Deep Gaussian Processes [40.10797051603641]
This paper introduces a new approach for specifying flexible, arbitrarily complex, and scalable approximate posterior distributions.
A novel convolutional normalizing flow (CNF) is developed to improve the time efficiency and capture dependency between layers.
Empirical evaluation demonstrates that CNF DGP outperforms the state-of-the-art approximation methods for DGPs.
arXiv Detail & Related papers (2021-04-17T07:25:25Z)
- SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
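SLEIPNIR builds on deterministic quadrature Fourier features with provable error bounds; as a simpler illustration of the Fourier-feature idea it refines, here is a sketch of the classical *random* Fourier feature approximation of the RBF kernel, where the kernel matrix is approximated by an inner product of finite feature maps. The dimensions and feature count are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def rff(X, n_feat=2000, ls=1.0):
    # Random Fourier features: k(x, x') = exp(-||x - x'||^2 / (2 ls^2))
    # is approximated by phi(x) @ phi(x') for phi below (Bochner's theorem).
    d = X.shape[1]
    W = rng.normal(0.0, 1.0 / ls, size=(d, n_feat))   # spectral samples
    b = rng.uniform(0.0, 2.0 * np.pi, n_feat)         # random phases
    return np.sqrt(2.0 / n_feat) * np.cos(X @ W + b)

X = rng.normal(size=(50, 3))
Phi = rff(X)
K_approx = Phi @ Phi.T

# Exact RBF kernel matrix for comparison.
d2 = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2.0 * X @ X.T
K_exact = np.exp(-0.5 * d2)
max_err = np.max(np.abs(K_approx - K_exact))
```

Random features give only a Monte Carlo error rate; replacing the random spectral samples with deterministic quadrature nodes is what yields the exponentially fast decaying bounds claimed above.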