Correlated Product of Experts for Sparse Gaussian Process Regression
- URL: http://arxiv.org/abs/2112.09519v1
- Date: Fri, 17 Dec 2021 14:14:08 GMT
- Title: Correlated Product of Experts for Sparse Gaussian Process Regression
- Authors: Manuel Schürch, Dario Azzimonti, Alessio Benavoli, Marco Zaffalon
- Abstract summary: We propose a new approach based on aggregating predictions from several local and correlated experts.
Our method recovers independent Product of Experts, sparse GP and full GP in the limiting cases.
We demonstrate superior performance, in a time vs. accuracy sense, of our proposed method against state-of-the-art GP approximation methods.
- Score: 2.466065249430993
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Gaussian processes (GPs) are an important tool in machine learning and
statistics, with applications ranging from the social and natural sciences to
engineering. They constitute a powerful kernelized non-parametric method with
well-calibrated uncertainty estimates; however, off-the-shelf GP inference
procedures are limited to datasets with several thousand data points because of
their cubic computational complexity. For this reason, many sparse GP
techniques have been developed over the past years. In this paper, we focus on
GP regression tasks and propose a new approach based on aggregating predictions
from several local and correlated experts. Thereby, the degree of correlation
between the experts can vary from fully independent to fully correlated
experts. The individual predictions of the experts are aggregated, taking into
account their correlation, resulting in consistent uncertainty estimates. Our
method recovers independent Product of Experts, sparse GP and full GP in the
limiting cases. The presented framework can deal with a general kernel function
and multiple variables, and has a time and space complexity which is linear in
the number of experts and data samples, which makes our approach highly
scalable. We demonstrate superior performance, in a time vs. accuracy sense, of
our proposed method against state-of-the-art GP approximation methods for
synthetic as well as several real-world datasets with deterministic and
stochastic optimization.
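The abstract states that the proposed method recovers the independent Product of Experts (PoE) as a limiting case. As a minimal illustration of that limiting case only (not the paper's correlated aggregation), the following numpy sketch fits an exact local GP on each disjoint data partition and aggregates the experts by adding predictive precisions and precision-weighting the means; the RBF kernel, noise level, and all function names are illustrative assumptions.

```python
# Sketch: independent Product-of-Experts aggregation for GP regression,
# the uncorrelated limiting case of the framework described above.
# Kernel choice, noise level, and names are illustrative assumptions.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between row vectors of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict(X, y, Xs, noise=0.1):
    """Exact GP posterior mean and variance at test points Xs (one expert)."""
    K = rbf_kernel(X, X) + noise**2 * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Ks)
    mean = Ks.T @ alpha
    var = np.diag(rbf_kernel(Xs, Xs)) - np.sum(v**2, axis=0)
    return mean, var

def poe_aggregate(experts, Xs):
    """Independent PoE: precisions add, means are precision-weighted."""
    precision = np.zeros(len(Xs))
    weighted_mean = np.zeros(len(Xs))
    for X, y in experts:
        mu, var = gp_predict(X, y, Xs)
        precision += 1.0 / var
        weighted_mean += mu / var
    return weighted_mean / precision, 1.0 / precision

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(60)
# Partition the data among three local experts; cost is linear in experts.
experts = [(X[i::3], y[i::3]) for i in range(3)]
Xs = np.linspace(-3, 3, 50)[:, None]
mean, var = poe_aggregate(experts, Xs)
```

Because each expert only inverts its own small kernel matrix, the per-expert cost is cubic in the partition size rather than in the full dataset size, which is the source of the linear scaling in the number of experts claimed above.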
Related papers
- Compactly-supported nonstationary kernels for computing exact Gaussian processes on big data [2.8377382540923004]
We derive an alternative kernel that can discover and encode both sparsity and nonstationarity.
We demonstrate the favorable performance of our novel kernel relative to existing exact and approximate GP methods.
We also conduct space-time prediction based on more than one million measurements of daily maximum temperature.
arXiv Detail & Related papers (2024-11-07T20:07:21Z)
- Aggregation Models with Optimal Weights for Distributed Gaussian Processes [6.408773096179187]
We propose a novel approach for aggregated prediction in distributed GPs.
The proposed method incorporates correlations among experts, leading to better prediction accuracy with manageable computational requirements.
As demonstrated by empirical studies, the proposed approach results in more stable predictions in less time than state-of-the-art consistent aggregation models.
arXiv Detail & Related papers (2024-08-01T23:32:14Z)
- Generalization Error Analysis for Sparse Mixture-of-Experts: A Preliminary Study [65.11303133775857]
Mixture-of-Experts (MoE) computation amalgamates predictions from several specialized sub-models (referred to as experts).
Sparse MoE selectively engages only a limited number of experts, or even just one, significantly reducing overhead while empirically preserving, and sometimes even enhancing, performance.
arXiv Detail & Related papers (2024-03-26T05:48:02Z)
- Entry Dependent Expert Selection in Distributed Gaussian Processes Using Multilabel Classification [12.622412402489951]
An ensemble technique combines local predictions from Gaussian experts trained on different partitions of the data.
This paper proposes a flexible expert selection approach based on the characteristics of entry data points.
arXiv Detail & Related papers (2022-11-17T23:23:26Z)
- FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels [82.53569355337586]
This work offers an efficient solution to temporal point processes inference using general parametric kernels with finite support.
The method's effectiveness is evaluated by modeling the occurrence of stimuli-induced patterns from brain signals recorded with magnetoencephalography (MEG).
Results show that the proposed approach yields better estimates of pattern latency than the state-of-the-art.
arXiv Detail & Related papers (2022-10-10T12:35:02Z)
- Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
- Incremental Ensemble Gaussian Processes [53.3291389385672]
We propose an incremental ensemble (IE-) GP framework, where an EGP meta-learner employs an ensemble of GP learners, each having a unique kernel belonging to a prescribed kernel dictionary.
With each GP expert leveraging the random feature-based approximation to perform scalable online prediction and model updates, the EGP meta-learner capitalizes on data-adaptive weights to synthesize the per-expert predictions.
The novel IE-GP is generalized to accommodate time-varying functions by modeling structured dynamics at the EGP meta-learner and within each GP learner.
arXiv Detail & Related papers (2021-10-13T15:11:25Z)
- A general sample complexity analysis of vanilla policy gradient [101.16957584135767]
Policy gradient (PG) is one of the most popular methods for solving reinforcement learning (RL) problems.
This work revisits the theoretical understanding of the "vanilla" PG method through a general sample complexity analysis.
arXiv Detail & Related papers (2021-07-23T19:38:17Z)
- Gaussian Experts Selection using Graphical Models [7.530615321587948]
Local approximations reduce time complexity by dividing the original dataset into subsets and training a local expert on each subset.
We leverage techniques from the literature on undirected graphical models, using sparse precision matrices that encode conditional dependencies between experts to select the most important experts.
arXiv Detail & Related papers (2021-02-02T14:12:11Z)
- Revisiting the Sample Complexity of Sparse Spectrum Approximation of Gaussian Processes [60.479499225746295]
We introduce a new scalable approximation for Gaussian processes with provable guarantees which hold simultaneously over its entire parameter space.
Our approximation is obtained from an improved sample complexity analysis for sparse spectrum Gaussian processes (SSGPs).
arXiv Detail & Related papers (2020-11-17T05:41:50Z)
- Aggregating Dependent Gaussian Experts in Local Approximation [8.4159776055506]
We propose a novel approach for aggregating the Gaussian experts by detecting strong violations of conditional independence.
The dependency between experts is determined by using a Gaussian graphical model, which yields the precision matrix.
Our new method outperforms other state-of-the-art (SOTA) DGP approaches while being substantially more time-efficient.
arXiv Detail & Related papers (2020-10-17T21:49:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.