Bayesian Active Learning with Fully Bayesian Gaussian Processes
- URL: http://arxiv.org/abs/2205.10186v1
- Date: Fri, 20 May 2022 13:52:04 GMT
- Title: Bayesian Active Learning with Fully Bayesian Gaussian Processes
- Authors: Christoffer Riis, Francisco N. Antunes, Frederik Boe Hüttel, Carlos Lima Azevedo, Francisco Camara Pereira
- Abstract summary: In active learning, where labeled data is scarce or difficult to obtain, neglecting this trade-off can cause inefficient querying.
We show that incorporating the bias-variance trade-off in the acquisition functions mitigates unnecessary and expensive data labeling.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The bias-variance trade-off is a well-known problem in machine learning that
becomes more pronounced the less data is available. In active learning,
where labeled data is scarce or difficult to obtain, neglecting this trade-off
can cause inefficient, suboptimal querying and thus unnecessary data
labeling. In this paper, we focus on active learning with Gaussian Processes
(GPs). For the GP, the bias-variance trade-off is governed by the optimization of
two hyperparameters: the length scale and the noise term. Since the
mode of the joint posterior over these hyperparameters corresponds to the
optimal bias-variance trade-off, we approximate this joint posterior and
utilize it to design two new acquisition functions. The first one is a Bayesian
variant of Query-by-Committee (B-QBC), and the second is an extension that
explicitly minimizes the predictive variance through a Query by Mixture of
Gaussian Processes (QB-MGP) formulation. Across six common simulators, we
empirically show that B-QBC, on average, achieves the best marginal likelihood,
whereas QB-MGP achieves the best predictive performance. We show that
incorporating the bias-variance trade-off in the acquisition functions
mitigates unnecessary and expensive data labeling.
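As a concrete illustration, here is a minimal sketch of the B-QBC idea, not the authors' implementation: the joint posterior over the length scale and noise term is approximated on a crude grid, a committee of GPs is drawn from it, and pool points are scored by the committee's disagreement. The RBF kernel, flat hyperprior, and grid bounds are all illustrative assumptions.

```python
# Minimal sketch of a Bayesian Query-by-Committee acquisition (illustrative,
# not the paper's code): sample hyperparameters from an approximate joint
# posterior, fit one GP per sample, query where the posterior means disagree.
import numpy as np

def rbf(X1, X2, ls):
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def log_marginal_likelihood(X, y, ls, noise):
    K = rbf(X, X, ls) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha - np.log(np.diag(L)).sum()
            - 0.5 * len(X) * np.log(2 * np.pi))

def sample_hyperposterior(X, y, n_samples=20, grid=40, rng=None):
    # Crude grid approximation of p(lengthscale, noise | X, y) under a flat
    # prior on the grid; the paper uses proper fully Bayesian inference.
    rng = rng or np.random.default_rng(0)
    ls_grid = np.logspace(-1, 1, grid)
    noise_grid = np.logspace(-4, 0, grid)
    logp = np.array([[log_marginal_likelihood(X, y, l, n)
                      for n in noise_grid] for l in ls_grid])
    p = np.exp(logp - logp.max()).ravel()
    p /= p.sum()
    idx = rng.choice(p.size, size=n_samples, p=p)
    return [(ls_grid[i // grid], noise_grid[i % grid]) for i in idx]

def b_qbc(X, y, X_pool, hypers):
    # Disagreement = variance of the committee members' posterior means.
    means = []
    for ls, noise in hypers:
        K = rbf(X, X, ls) + noise * np.eye(len(X))
        means.append(rbf(X_pool, X, ls) @ np.linalg.solve(K, y))
    return np.var(means, axis=0)
```

One would label X_pool[np.argmax(b_qbc(X, y, X_pool, hypers))] and refit; a QB-MGP-style score would additionally account for each member's predictive variance, targeting the predictive variance of the resulting mixture of GPs.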
Related papers
- BI-EqNO: Generalized Approximate Bayesian Inference with an Equivariant Neural Operator Framework [9.408644291433752]
We introduce BI-EqNO, an equivariant neural operator framework for generalized approximate Bayesian inference.
BI-EqNO transforms priors into posteriors conditioned on observation data through data-driven training.
We demonstrate BI-EqNO's utility through two examples: (1) as a generalized Gaussian process (gGP) for regression, and (2) as an ensemble neural filter (EnNF) for sequential data assimilation.
arXiv Detail & Related papers (2024-10-21T18:39:16Z)
- Towards Practical Preferential Bayesian Optimization with Skew Gaussian Processes [8.198195852439946]
We study preferential Bayesian optimization (BO), where reliable feedback is limited to pairwise comparisons called duels.
An important challenge in preferential BO, which uses the preferential Gaussian process (GP) model to represent flexible preference structure, is that the posterior distribution is a computationally intractable skew GP.
We develop a new method that achieves both high computational efficiency and low sample complexity, and then demonstrate its effectiveness through extensive numerical experiments.
arXiv Detail & Related papers (2023-02-03T03:02:38Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- Fantasizing with Dual GPs in Bayesian Optimization and Active Learning [14.050425158209826]
We focus on 'fantasizing' batch acquisition functions that need the ability to condition on new fantasized data.
By using a sparse Dual GP parameterization, we gain linear scaling with batch size as well as one-step updates for non-Gaussian likelihoods.
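A generic illustration of what fantasizing means for an exact GP (the paper's contribution is the sparse dual parameterization and its one-step updates; this sketch only shows the plain conditioning step, reusing rbf from the B-QBC sketch above):

```python
# Condition a GP on a hypothesized label at a candidate point before it is
# actually queried, so a batch acquisition can look one step ahead.
import numpy as np

def gp_posterior(X, y, X_test, ls=1.0, noise=1e-2):
    # Standard exact GP regression posterior mean and covariance.
    K = rbf(X, X, ls) + noise * np.eye(len(X))
    Ks = rbf(X_test, X, ls)
    mean = Ks @ np.linalg.solve(K, y)
    cov = rbf(X_test, X_test, ls) - Ks @ np.linalg.solve(K, Ks.T)
    return mean, cov

def fantasize(X, y, x_new, X_test, ls=1.0, noise=1e-2):
    # The usual fantasy label under a Gaussian likelihood is the current
    # posterior mean; note the updated covariance does not depend on it.
    y_hat, _ = gp_posterior(X, y, x_new[None, :], ls, noise)
    X_aug = np.vstack([X, x_new[None, :]])
    y_aug = np.append(y, y_hat)
    return gp_posterior(X_aug, y_aug, X_test, ls, noise)
```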
arXiv Detail & Related papers (2022-11-02T11:37:06Z)
- Scalable Bayesian Transformed Gaussian Processes [10.33253403416662]
The Bayesian transformed Gaussian process (BTG) model is a fully Bayesian counterpart to the warped Gaussian process (WGP).
We propose principled and fast techniques for computing with BTG.
Our framework uses doubly sparse quadrature rules, tight quantile bounds, and rank-one matrix algebra to enable both fast model prediction and model selection.
arXiv Detail & Related papers (2022-10-20T02:45:10Z)
- Sample-Efficient Optimisation with Probabilistic Transformer Surrogates [66.98962321504085]
This paper investigates the feasibility of employing state-of-the-art probabilistic transformers in Bayesian optimisation.
We observe two drawbacks stemming from their training procedure and loss definition, hindering their direct deployment as proxies in black-box optimisation.
We introduce two components: 1) a BO-tailored training prior supporting non-uniformly distributed points, and 2) a novel approximate posterior regulariser trading off accuracy and input sensitivity to filter favourable stationary points for improved predictive performance.
arXiv Detail & Related papers (2022-05-27T11:13:17Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to non-linear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) is applications where multiple estimates of the same unknown are averaged for improved performance.
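A quick numerical check of the quoted statement (a simulation with made-up model and noise profile, not the paper's experiments): under heteroscedastic Gaussian noise with known variances, weighted least squares attains lower estimator variance than ordinary least squares while both stay unbiased.

```python
# Monte Carlo comparison of OLS vs. WLS estimator variance in a linear model
# with known heteroscedastic noise, illustrating the Gauss-Markov statement.
import numpy as np

rng = np.random.default_rng(0)
n, trials = 200, 2000
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, n)])
beta_true = np.array([1.0, 2.0])
sigma2 = rng.uniform(0.1, 2.0, n)      # known per-sample noise variances
W = np.diag(1.0 / sigma2)              # optimal weights = inverse variances

ols, wls = [], []
for _ in range(trials):
    y = X @ beta_true + rng.normal(0.0, np.sqrt(sigma2))
    ols.append(np.linalg.solve(X.T @ X, X.T @ y))
    wls.append(np.linalg.solve(X.T @ W @ X, X.T @ W @ y))

print("OLS variance per coefficient:", np.var(ols, axis=0))
print("WLS variance per coefficient:", np.var(wls, axis=0))  # smaller
```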
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood-based model selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAEs) are a powerful and widely used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
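The analytic tractability for GMMs rests on the Gaussian product integral ∫ N(x; m1, S1) N(x; m2, S2) dx = N(m1; m2, S1 + S2); a minimal sketch (function names are ours, not the paper's code):

```python
# Closed-form Cauchy-Schwarz divergence between two Gaussian mixture models:
# every cross term reduces to a single Gaussian density evaluation.
import numpy as np
from scipy.stats import multivariate_normal as mvn

def gmm_overlap(w1, mu1, S1, w2, mu2, S2):
    # Overlap integral of two GMMs, each given as (weights, means, covs).
    return sum(a * b * mvn.pdf(m, mean=v, cov=Sa + Sb)
               for a, m, Sa in zip(w1, mu1, S1)
               for b, v, Sb in zip(w2, mu2, S2))

def cs_divergence(p, q):
    # D_CS(p, q) = -log ∫pq + 0.5 log ∫p^2 + 0.5 log ∫q^2; >= 0, = 0 iff p = q.
    return (-np.log(gmm_overlap(*p, *q))
            + 0.5 * np.log(gmm_overlap(*p, *p))
            + 0.5 * np.log(gmm_overlap(*q, *q)))

# Sanity check: the divergence of a mixture with itself is zero.
p = ([0.6, 0.4], [np.zeros(2), np.ones(2)], [np.eye(2), 0.5 * np.eye(2)])
print(cs_divergence(p, p))  # ~0.0 up to floating point
```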
arXiv Detail & Related papers (2021-01-06T17:36:26Z)
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We propose a method we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
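A minimal PyTorch sketch of the idea, assuming an ordinary model with BatchNorm layers (the helper is our shorthand, not the paper's released code):

```python
# Prediction-time batch normalization: keep BN layers in training mode at test
# time so they normalize with the statistics of the current (possibly shifted)
# test batch rather than the running averages learned on the source data.
import torch
import torch.nn as nn

def enable_prediction_time_bn(model: nn.Module) -> nn.Module:
    model.eval()  # everything else behaves as usual at test time
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.train()  # BN now uses current-batch statistics
            # Caveat: train-mode BN also updates the running statistics as a
            # side effect; snapshot/restore them if the model must stay intact.
    return model

# with torch.no_grad():
#     probs = enable_prediction_time_bn(model)(test_batch).softmax(dim=-1)
```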
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
- Approximate Inference for Fully Bayesian Gaussian Process Regression [11.47317712333228]
Learning in Gaussian Process models occurs through the adaptation of hyperparameters of the mean and the covariance function.
An alternative learning procedure is to infer the posterior over hyperparameters in a hierarchical specification of GPs, which we call Fully Bayesian Gaussian Process Regression (GPR).
We analyze the predictive performance for fully Bayesian GPR on a range of benchmark data sets.
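In the same spirit as the sketches above (reusing log_marginal_likelihood from the B-QBC sketch), a toy fully Bayesian treatment via random-walk Metropolis over the log hyperparameters, with standard-normal hyperpriors as an arbitrary assumption:

```python
# Sample p(lengthscale, noise | X, y) with random-walk Metropolis; predictions
# then average the GP posteriors over these samples (a mixture of GPs).
import numpy as np

def log_post(theta, X, y):
    # theta = (log lengthscale, log noise); independent N(0, 1) hyperpriors.
    ls, noise = np.exp(theta)
    return log_marginal_likelihood(X, y, ls, noise) - 0.5 * theta @ theta

def sample_hypers_mh(X, y, n=500, step=0.2, rng=None):
    rng = rng or np.random.default_rng(0)
    theta = np.zeros(2)
    lp = log_post(theta, X, y)
    samples = []
    for _ in range(n):
        prop = theta + step * rng.normal(size=2)
        lp_prop = log_post(prop, X, y)
        if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis accept step
            theta, lp = prop, lp_prop
        samples.append(np.exp(theta))  # back to (lengthscale, noise)
    return np.array(samples)
```

Averaging the GP predictive distributions over these samples yields the kind of mixture of GPs that the QB-MGP formulation in the main abstract refers to.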
arXiv Detail & Related papers (2019-12-31T17:18:48Z)