Entry Dependent Expert Selection in Distributed Gaussian Processes Using
Multilabel Classification
- URL: http://arxiv.org/abs/2211.09940v2
- Date: Mon, 8 Jan 2024 15:33:20 GMT
- Title: Entry Dependent Expert Selection in Distributed Gaussian Processes Using
Multilabel Classification
- Authors: Hamed Jalali and Gjergji Kasneci
- Abstract summary: An ensemble technique combines local predictions from Gaussian experts trained on different partitions of the data.
This paper proposes a flexible expert selection approach based on the characteristics of entry data points.
- Score: 12.622412402489951
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: By distributing the training process, local approximation reduces
the cost of the standard Gaussian process. An ensemble technique then combines
the local predictions from Gaussian experts trained on different partitions of
the data. Most ensemble methods aggregate the experts' predictions by assuming
perfect diversity, i.e., independence, of the local predictors. Although this
assumption keeps the aggregation tractable, it is often violated in practice.
Ensemble methods that instead account for dependencies between experts provide
consistent results, but at a high computational cost, which is cubic in the
number of experts involved. An expert selection strategy makes the final
aggregation step more efficient, since fewer experts enter it. However, a
selection approach that assigns a fixed set of experts to every new data point
cannot encode the specific properties of each unique point. This paper
proposes a flexible expert selection approach based on the characteristics of
entry data points. To this end, we cast the selection task as a multi-label
classification problem in which the experts define the labels and each entry
point is assigned to a subset of the experts. The proposed solution's
prediction quality, efficiency, and asymptotic properties are discussed in
detail. We demonstrate the efficacy of our method through extensive numerical
experiments on synthetic and real-world data sets.
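
To make the pipeline concrete, the following is a minimal sketch of
entry-dependent expert selection. It assumes scikit-learn GP experts trained
on disjoint spatial partitions, multi-label targets that mark the k nearest
partitions for each training point, a k-NN multi-label classifier, and a
standard precision-weighted product-of-experts rule for the final
aggregation; all of these are illustrative stand-ins, not the paper's exact
construction.

```python
# Minimal sketch: entry-dependent expert selection via multi-label
# classification. Partitioning, labeling, classifier, and aggregation
# rule are all illustrative choices, not the paper's exact method.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.multioutput import MultiOutputClassifier
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(600, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(600)

# 1. Split the data into disjoint spatial partitions and train one
#    local GP expert per partition.
n_experts, k_select = 6, 2
parts = np.array_split(np.argsort(X[:, 0]), n_experts)
experts, centers = [], []
for idx in parts:
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2)
    gp.fit(X[idx], y[idx])
    experts.append(gp)
    centers.append(X[idx].mean(axis=0))
centers = np.stack(centers)

# 2. Multi-label targets: each training point is labeled with the
#    k_select experts whose partition centers lie nearest to it.
dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
labels = np.zeros((len(X), n_experts), dtype=int)
rows = np.arange(len(X))[:, None]
labels[rows, np.argsort(dists, axis=1)[:, :k_select]] = 1

# 3. Train a multi-label classifier that maps an entry point to the
#    subset of experts responsible for it (experts = labels).
clf = MultiOutputClassifier(KNeighborsClassifier(n_neighbors=5))
clf.fit(X, labels)

# 4. At test time, query only the selected experts and aggregate their
#    Gaussian predictions with a precision-weighted product-of-experts rule.
def predict(x_star):
    sel = np.flatnonzero(clf.predict(x_star.reshape(1, -1))[0])
    if sel.size == 0:  # fall back to the single nearest expert
        sel = [int(np.argmin(np.linalg.norm(centers - x_star, axis=1)))]
    mus, stds = zip(*(experts[i].predict(x_star.reshape(1, -1),
                                         return_std=True) for i in sel))
    prec = 1.0 / np.square(np.array(stds))
    mean = (prec * np.array(mus)).sum() / prec.sum()
    return mean, np.sqrt(1.0 / prec.sum())

print(predict(np.array([0.5])))  # aggregated mean and std at x* = 0.5
```

The precision-weighted rule in step 4 is the standard independent
product-of-experts aggregation; the benefit of the selection step is that
only the few experts chosen for a given entry point enter it, rather than
all n_experts.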
Related papers
- Mixture of Efficient Diffusion Experts Through Automatic Interval and Sub-Network Selection [63.96018203905272]
We propose to reduce the sampling cost by pruning a pretrained diffusion model into a mixture of efficient experts.
We demonstrate the effectiveness of our method, DiffPruning, across several datasets.
arXiv Detail & Related papers (2024-09-23T21:27:26Z)
- A naive aggregation algorithm for improving generalization in a class of learning problems [0.0]
We present a naive aggregation algorithm for a typical learning problem in the expert-advice setting.
In particular, we consider a class of learning problems involving point estimation for modeling high-dimensional nonlinear functions.
arXiv Detail & Related papers (2024-09-06T15:34:17Z)
- Revisiting Long-tailed Image Classification: Survey and Benchmarks with New Evaluation Metrics [88.39382177059747]
A corpus of metrics is designed for measuring the accuracy, robustness, and bounds of algorithms for learning with long-tailed distributions.
Based on our benchmarks, we re-evaluate the performance of existing methods on CIFAR10 and CIFAR100 datasets.
arXiv Detail & Related papers (2023-02-03T02:40:54Z)
- Multi-View Knowledge Distillation from Crowd Annotations for Out-of-Domain Generalization [53.24606510691877]
We propose new methods for acquiring soft labels from crowd annotations by aggregating the distributions produced by existing methods.
We demonstrate that these aggregation methods lead to the most consistent performance across four NLP tasks on out-of-domain test sets.
arXiv Detail & Related papers (2022-12-19T12:40:18Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both approaches and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Correlated Product of Experts for Sparse Gaussian Process Regression [2.466065249430993]
We propose a new approach based on aggregating predictions from several local and correlated experts.
Our method recovers the independent Product of Experts, sparse GP, and full GP in the limiting cases.
We demonstrate the superior performance, in a time-versus-accuracy sense, of our proposed method against state-of-the-art GP approximation methods.
arXiv Detail & Related papers (2021-12-17T14:14:08Z)
- Gaussian Experts Selection using Graphical Models [7.530615321587948]
Local approximations reduce time complexity by dividing the original dataset into subsets and training a local expert on each subset.
We leverage techniques from the literature on undirected graphical models, using sparse precision matrices that encode conditional dependencies between experts to select the most important experts (a generic version of this selection step is sketched after this list).
arXiv Detail & Related papers (2021-02-02T14:12:11Z)
- Aggregating Dependent Gaussian Experts in Local Approximation [8.4159776055506]
We propose a novel approach for aggregating the Gaussian experts by detecting strong violations of conditional independence.
The dependency between experts is determined using a Gaussian graphical model, which yields the precision matrix (see the sketch after this list).
Our new method outperforms other state-of-the-art (SOTA) DGP approaches while being substantially more time-efficient.
arXiv Detail & Related papers (2020-10-17T21:49:43Z)
- A General Method for Robust Learning from Batches [56.59844655107251]
We consider a general framework of robust learning from batches, and determine the limits of both classification and distribution estimation over arbitrary, including continuous, domains.
We derive the first robust, computationally efficient learning algorithms for piecewise-interval classification, and for piecewise-polynomial, monotone, log-concave, and Gaussian-mixture distribution estimation.
arXiv Detail & Related papers (2020-02-25T18:53:25Z)
- Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
- Outlier Detection Ensemble with Embedded Feature Selection [42.8338013000469]
We propose an outlier detection ensemble framework with embedded feature selection (ODEFS).
For each random sub-sampling based learning component, ODEFS unifies feature selection and outlier detection into a pairwise ranking formulation.
We adopt thresholded self-paced learning to simultaneously optimize feature selection and example selection.
arXiv Detail & Related papers (2020-01-15T13:14:10Z)
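
Two of the entries above (Gaussian Experts Selection using Graphical Models
and Aggregating Dependent Gaussian Experts in Local Approximation) detect
dependencies between experts through a sparse precision matrix. The sketch
below illustrates that selection step under simplifying assumptions: the
experts' predictions on a shared validation set are available as the columns
of a matrix P, and the graphical-lasso estimator plus the "keep the most
strongly connected experts" rule stand in for the papers' exact procedures.

```python
# Hedged sketch: precision-matrix-based expert selection. P, the
# graphical-lasso penalty, and the degree-style scoring rule are
# illustrative assumptions, not the papers' exact procedure.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
# P[i, j]: prediction of expert j at shared validation point i
# (synthetic here: a common latent signal plus expert-specific noise).
latent = rng.standard_normal((200, 1))
P = latent + 0.3 * rng.standard_normal((200, 8))

# Sparse precision matrix: a zero off-diagonal entry means the two
# experts are conditionally independent given all remaining experts.
prec = GraphicalLasso(alpha=0.2).fit(P).precision_

# Score each expert by the total strength of its conditional
# dependencies and keep the most strongly connected ones, i.e. the
# experts for which the independence assumption is most strongly violated.
abs_prec = np.abs(prec)
off_diag = abs_prec - np.diag(np.diag(abs_prec))
scores = off_diag.sum(axis=1)
selected = np.argsort(scores)[::-1][:4]
print("selected experts:", selected)
```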
This list is automatically generated from the titles and abstracts of the papers in this site.