Aggregating Dependent Gaussian Experts in Local Approximation
- URL: http://arxiv.org/abs/2010.08873v1
- Date: Sat, 17 Oct 2020 21:49:43 GMT
- Title: Aggregating Dependent Gaussian Experts in Local Approximation
- Authors: Hamed Jalali, Gjergji Kasneci
- Abstract summary: We propose a novel approach for aggregating the Gaussian experts by detecting strong violations of conditional independence.
The dependency between experts is determined by using a Gaussian graphical model, which yields the precision matrix.
Our new method outperforms other state-of-the-art (SOTA) DGP approaches while being substantially more time-efficient.
- Score: 8.4159776055506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distributed Gaussian processes (DGPs) are prominent local approximation
methods to scale Gaussian processes (GPs) to large datasets. Instead of a
global estimation, they train local experts by dividing the training set into
subsets, thus reducing the time complexity. This strategy is based on the
conditional independence assumption, which effectively assumes perfect
diversity among the local experts. In practice, however, this
assumption is often violated, and the aggregation of experts leads to
sub-optimal and inconsistent solutions. In this paper, we propose a novel
approach for aggregating the Gaussian experts by detecting strong violations of
conditional independence. The dependency between experts is determined by using
a Gaussian graphical model, which yields the precision matrix. The precision
matrix encodes conditional dependencies between experts and is used to detect
strongly dependent experts and construct an improved aggregation. Using both
synthetic and real datasets, our experimental evaluations illustrate that our
new method outperforms other state-of-the-art (SOTA) DGP approaches while being
substantially more time-efficient than these approaches, which build on
independent experts.
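
To make the mechanics concrete, here is a minimal, hedged sketch (not the authors' code) of the two ingredients the abstract describes: product-of-experts aggregation under the conditional independence assumption, where precisions add and the mean is precision-weighted, and a Gaussian graphical model over the experts whose estimated precision matrix flags strong conditional dependencies. It uses scikit-learn's GaussianProcessRegressor and GraphicalLasso; the partition scheme, the penalty alpha=0.2, and the cut-off THRESH are illustrative assumptions, not the paper's settings.

```python
# Sketch: local GP experts, independence-based aggregation, and
# precision-matrix-based dependency detection. Illustrative only.
import numpy as np
from sklearn.covariance import GraphicalLasso
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(600, 1))
y = np.sin(2.0 * X[:, 0]) + 0.1 * rng.standard_normal(600)
Xtr, ytr, Xva, yva = X[:500], y[:500], X[500:], y[500:]

# 1) Local experts: partition the training set and fit one GP per subset.
M = 6
parts = np.array_split(rng.permutation(500), M)
experts = [
    GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2)
    .fit(Xtr[idx], ytr[idx])
    for idx in parts
]

# 2) Product-of-experts aggregation, valid under conditional independence:
#    precisions add, and the mean is the precision-weighted average.
preds = [e.predict(Xva, return_std=True) for e in experts]
mus = np.stack([m for m, _ in preds])           # (M, n_val) expert means
prec = 1.0 / np.stack([s for _, s in preds]) ** 2
agg_mean = (prec * mus).sum(axis=0) / prec.sum(axis=0)

# 3) Dependency detection: model the experts' standardized residuals on the
#    validation set as an M-variate Gaussian and estimate a sparse
#    precision matrix with the graphical lasso.
resid = mus - yva                               # shared noise induces dependence
Z = (resid - resid.mean(axis=1, keepdims=True)) / resid.std(axis=1, keepdims=True)
P = GraphicalLasso(alpha=0.2).fit(Z.T).precision_

d = np.sqrt(np.diag(P))
partial_corr = -P / np.outer(d, d)              # off-diagonal partial correlations
np.fill_diagonal(partial_corr, 1.0)
THRESH = 0.3                                    # assumed cut-off for "strong" dependence
strong = [(i, j) for i in range(M) for j in range(i + 1, M)
          if abs(partial_corr[i, j]) > THRESH]
print("strongly dependent expert pairs:", strong)
```

Off-diagonal precision entries are rescaled to partial correlations rho_ij = -P_ij / sqrt(P_ii * P_jj); pairs exceeding the cut-off are the kind of strongly dependent experts that the paper's improved aggregation would treat jointly rather than as independent.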
Related papers
- Mixture of Efficient Diffusion Experts Through Automatic Interval and Sub-Network Selection [63.96018203905272]
We propose to reduce the sampling cost by pruning a pretrained diffusion model into a mixture of efficient experts.
We demonstrate the effectiveness of our method, DiffPruning, across several datasets.
arXiv Detail & Related papers (2024-09-23T21:27:26Z) - Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z) - On Least Square Estimation in Softmax Gating Mixture of Experts [78.3687645289918]
We investigate the performance of the least squares estimators (LSE) under a deterministic MoE model.
We establish a condition called strong identifiability to characterize the convergence behavior of various types of expert functions.
Our findings have important practical implications for expert selection.
arXiv Detail & Related papers (2024-02-05T12:31:18Z) - Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z) - Entry Dependent Expert Selection in Distributed Gaussian Processes Using Multilabel Classification [12.622412402489951]
An ensemble technique combines local predictions from Gaussian experts trained on different partitions of the data.
This paper proposes a flexible expert selection approach based on the characteristics of entry data points.
arXiv Detail & Related papers (2022-11-17T23:23:26Z) - Gaussian Graphical Models as an Ensemble Method for Distributed Gaussian Processes [8.4159776055506]
We propose a novel approach for aggregating the Gaussian experts' predictions by a Gaussian graphical model (GGM).
We first estimate the joint distribution of latent and observed variables using the Expectation-Maximization (EM) algorithm.
Our new method outperforms other state-of-the-art DGP approaches.
arXiv Detail & Related papers (2022-02-07T15:22:56Z) - Correlated Product of Experts for Sparse Gaussian Process Regression [2.466065249430993]
We propose a new approach based on aggregating predictions from several local and correlated experts.
Our method recovers independent Product of Experts, sparse GP and full GP in the limiting cases.
We demonstrate superior performance, in a time vs. accuracy sense, of our proposed method against state-of-the-art GP approximation methods.
arXiv Detail & Related papers (2021-12-17T14:14:08Z) - A general sample complexity analysis of vanilla policy gradient [101.16957584135767]
Policy gradient (PG) is one of the most popular methods for solving reinforcement learning (RL) problems.
This paper provides a general sample complexity analysis of "vanilla" PG, improving its theoretical understanding.
arXiv Detail & Related papers (2021-07-23T19:38:17Z) - Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z) - Gaussian Experts Selection using Graphical Models [7.530615321587948]
Local approximations reduce time complexity by dividing the original dataset into subsets and training a local expert on each subset.
We leverage techniques from the literature on undirected graphical models, using sparse precision matrices that encode conditional dependencies between experts to select the most important experts; a toy sketch of this selection idea appears after this list.
arXiv Detail & Related papers (2021-02-02T14:12:11Z) - Fast Deep Mixtures of Gaussian Process Experts [0.6554326244334868]
Mixtures of experts have become an indispensable tool for flexible modelling in a supervised learning context.
In this article, we propose to design the gating network for selecting the experts from sparse GPs using a deep neural network (DNN).
A fast one-pass algorithm called Cluster-Classify-Regress (CCR) is leveraged to approximate the maximum a posteriori (MAP) estimator extremely quickly.
arXiv Detail & Related papers (2020-06-11T18:52:34Z)
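
As a companion to the entry on "Gaussian Experts Selection using Graphical Models" above, the following toy sketch shows how a sparse precision matrix over experts could drive expert selection: keep a subset whose members are only weakly conditionally dependent on each other. The greedy rule, the max_partial_corr threshold, and the example matrix are assumptions for illustration, not the paper's exact algorithm.

```python
# Toy sketch: greedy expert selection from a precision matrix. Illustrative only.
import numpy as np

def select_experts(precision: np.ndarray, max_partial_corr: float = 0.2) -> list[int]:
    """Greedily pick experts whose pairwise |partial correlation| stays below a cut-off."""
    d = np.sqrt(np.diag(precision))
    pc = -precision / np.outer(d, d)        # partial correlations (off-diagonal)
    np.fill_diagonal(pc, 0.0)
    # Consider experts in order of increasing total dependence on the others.
    order = np.argsort(np.abs(pc).sum(axis=1))
    chosen: list[int] = []
    for k in order:
        if all(abs(pc[k, j]) <= max_partial_corr for j in chosen):
            chosen.append(int(k))
    return chosen

# Example with 4 experts where experts 0 and 1 are strongly conditionally dependent:
P = np.array([[2.0, 1.2, 0.0, 0.0],
              [1.2, 2.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.1],
              [0.0, 0.0, 0.1, 1.0]])
print(select_experts(P))                    # keeps at most one of experts {0, 1}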
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.