Greedy Bayesian Posterior Approximation with Deep Ensembles
- URL: http://arxiv.org/abs/2105.14275v2
- Date: Tue, 1 Jun 2021 07:29:42 GMT
- Title: Greedy Bayesian Posterior Approximation with Deep Ensembles
- Authors: Aleksei Tiulpin and Matthew B. Blaschko
- Abstract summary: Ensembles of independently trained neural networks are a state-of-the-art approach to estimate predictive uncertainty in Deep Learning.
We show that our objective is submodular with respect to the mixture components for any $f$-divergence.
- Score: 22.466176036646814
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ensembles of independently trained neural networks are a state-of-the-art
approach to estimate predictive uncertainty in Deep Learning, and can be
interpreted as an approximation of the posterior distribution via a mixture of
delta functions. The training of ensembles relies on non-convexity of the loss
landscape and random initialization of their individual members, making the
resulting posterior approximation uncontrolled. This paper proposes a novel and
principled method to tackle this limitation, minimizing an $f$-divergence
between the true posterior and a kernel density estimator in a function space.
We analyze this objective from a combinatorial point of view, and show that it
is submodular with respect to mixture components for any $f$. Subsequently, we
consider the problem of greedy ensemble construction, and from the marginal
gain of the total objective, we derive a novel diversity term for ensemble
methods. The performance of our approach is demonstrated on computer vision
out-of-distribution benchmarks in a range of architectures trained on multiple
datasets. The source code of our method is publicly available at
https://github.com/MIPT-Oulu/greedy_ensembles_training.
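To make the greedy construction concrete, below is a minimal sketch of selecting ensemble members by marginal gain on a held-out set. The function-space representation, the squared-distance diversity surrogate, and the trade-off parameter gamma are illustrative assumptions only; the paper derives its diversity term from the marginal gain of the $f$-divergence objective, which is not reproduced here.

```python
import numpy as np

def function_space_repr(probs):
    # probs: (n_heldout, n_classes) predictive probabilities of one candidate,
    # flattened into that candidate's point in function space
    return probs.reshape(-1)

def marginal_gain(candidate, selected, labels, probs, gamma=1.0):
    """Illustrative marginal gain: held-out log-likelihood of the candidate
    plus a diversity bonus (mean squared distance, in function space, to the
    members already selected). gamma trades the two terms off."""
    ll = np.mean(np.log(probs[candidate][np.arange(len(labels)), labels] + 1e-12))
    if not selected:
        return ll
    f_c = function_space_repr(probs[candidate])
    div = np.mean([np.mean((f_c - function_space_repr(probs[s])) ** 2)
                   for s in selected])
    return ll + gamma * div

def greedy_ensemble(probs, labels, k):
    """Greedily grow an ensemble of size k from a pool of candidates.
    probs: list of (n_heldout, n_classes) arrays, one per candidate network."""
    selected, remaining = [], set(range(len(probs)))
    for _ in range(k):
        best = max(remaining, key=lambda c: marginal_gain(c, selected, labels, probs))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Given the held-out predictive probabilities of a pool of independently trained networks, `greedy_ensemble(probs, labels, k=5)` would return the indices of the selected members.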
Related papers
- Learning general Gaussian mixtures with efficient score matching [16.06356123715737]
We study the problem of learning mixtures of $k$ Gaussians in $d$ dimensions.
We make no separation assumptions on the underlying mixture components.
We give an algorithm that draws $d^{\mathrm{poly}(k/\varepsilon)}$ samples from the target mixture, runs in sample-polynomial time, and constructs a sampler.
arXiv Detail & Related papers (2024-04-29T17:30:36Z) - Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
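For readers unfamiliar with the term, a standard single-dataset inverse propensity score (IPW) estimate of the average treatment effect looks as follows; the paper's collaborative, multi-source estimator is not reproduced here, and the function below is only an illustration.

```python
import numpy as np

def ipw_ate(y, t, e):
    """Standard inverse-propensity-weighted estimate of the average treatment
    effect: y = outcomes, t = binary treatment indicator, e = estimated
    propensity scores P(T=1 | X)."""
    e = np.clip(e, 1e-3, 1 - 1e-3)  # clip to avoid extreme weights
    return np.mean(t * y / e - (1 - t) * y / (1 - e))
```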
arXiv Detail & Related papers (2024-04-24T09:04:36Z) - Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computationally heterogeneous data.
Proposed aggregation algorithms are extensively analyzed from both a theoretical and an experimental perspective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z) - Diverse Projection Ensembles for Distributional Reinforcement Learning [6.754994171490016]
This work studies the combination of several different projections and representations in a distributional ensemble.
We derive an algorithm that uses ensemble disagreement, measured by the average $1$-Wasserstein distance, as a bonus for deep exploration.
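A minimal sketch of this disagreement bonus, assuming each ensemble member outputs an equal-weight sample (or quantile) representation of the return distribution, so the 1-Wasserstein distance in one dimension reduces to the mean absolute difference of sorted atoms; names and shapes are illustrative.

```python
import numpy as np

def w1_1d(a, b):
    """1-Wasserstein distance between two 1-D empirical distributions
    with the same number of equally weighted atoms."""
    return np.mean(np.abs(np.sort(a) - np.sort(b)))

def disagreement_bonus(return_samples):
    """Average pairwise 1-Wasserstein distance across ensemble members'
    predicted return distributions for one state-action pair.
    return_samples: (n_members, n_atoms) array."""
    n = len(return_samples)
    if n < 2:
        return 0.0
    dists = [w1_1d(return_samples[i], return_samples[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dists))
```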
arXiv Detail & Related papers (2023-06-12T13:59:48Z) - Federated Learning as Variational Inference: A Scalable Expectation Propagation Approach [66.9033666087719]
This paper extends the inference view and describes a variational inference formulation of federated learning.
We apply FedEP on standard federated learning benchmarks and find that it outperforms strong baselines in terms of both convergence speed and accuracy.
arXiv Detail & Related papers (2023-02-08T17:58:11Z) - Mixtures of Laplace Approximations for Improved Post-Hoc Uncertainty in Deep Learning [24.3370326359959]
We propose to predict with a Gaussian mixture model posterior that consists of a weighted sum of Laplace approximations of independently trained deep neural networks.
We theoretically validate that our approach mitigates overconfidence "far away" from the training data and empirically compare against state-of-the-art baselines on standard uncertainty quantification benchmarks.
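A sketch of the predictive rule implied by this summary: draw Monte Carlo samples from each network's Laplace posterior, average them into a per-network predictive, then mix the per-network predictives with the given weights. The member interface (`sample_params`, `predict_proba`) and the number of samples are hypothetical, not the authors' API.

```python
import numpy as np

def mola_predict(x, members, weights, n_samples=20, rng=None):
    """Predict with a weighted mixture of per-network Laplace posteriors.
    Each member is assumed to provide sample_params() (a draw from its
    Gaussian Laplace approximation) and predict_proba(x, params);
    weights are assumed to sum to one."""
    rng = rng if rng is not None else np.random.default_rng(0)
    mixture = 0.0
    for member, w in zip(members, weights):
        # Monte Carlo average of this member's posterior predictive
        probs = np.mean([member.predict_proba(x, member.sample_params(rng))
                         for _ in range(n_samples)], axis=0)
        mixture = mixture + w * probs
    return mixture
```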
arXiv Detail & Related papers (2021-11-05T15:52:48Z) - Deep Shells: Unsupervised Shape Correspondence with Optimal Transport [52.646396621449]
We propose a novel unsupervised learning approach to 3D shape correspondence.
We show that the proposed method significantly improves over the state-of-the-art on multiple datasets.
arXiv Detail & Related papers (2020-10-28T22:24:07Z) - Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets.
Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
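Under the mean-field Gaussian assumption mentioned above, the normalized product of factorized Gaussian posteriors is again a diagonal Gaussian with summed precisions; the snippet below shows only that elementary step and omits the rest of the paper's fusion algorithm.

```python
import numpy as np

def fuse_mean_field_gaussians(means, variances):
    """Fuse factorized Gaussian posteriors q_i(w) = N(mu_i, diag(var_i)) by
    taking their normalized product: the fused precision is the sum of the
    individual precisions, and the fused mean is precision-weighted."""
    precisions = [1.0 / v for v in variances]
    fused_prec = np.sum(precisions, axis=0)
    fused_mean = np.sum([p * m for p, m in zip(precisions, means)], axis=0) / fused_prec
    return fused_mean, 1.0 / fused_prec
```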
arXiv Detail & Related papers (2020-07-13T03:27:45Z) - Mean-Field Approximation to Gaussian-Softmax Integral with Application to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks.
We use a mean-field approximation formula to compute an analytically intractable integral.
Empirically, the proposed approach performs competitively when compared to state-of-the-art methods.
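As an illustration of the kind of formula involved, a widely used probit-style approximation rescales the mean logits by their variances before applying the softmax; the paper derives its own mean-field approximation to the Gaussian-softmax integral, which may differ from this sketch.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def approx_gaussian_softmax(mu, var, lam=np.pi / 8.0):
    """Approximate E[softmax(z)] for z ~ N(mu, diag(var)) by rescaling the
    mean logits, in the spirit of MacKay's probit approximation for the
    logistic sigmoid, sigma(mu / sqrt(1 + lam * var))."""
    return softmax(mu / np.sqrt(1.0 + lam * var))
```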
arXiv Detail & Related papers (2020-06-13T07:32:38Z) - A General Method for Robust Learning from Batches [56.59844655107251]
We consider a general framework of robust learning from batches, and determine the limits of both classification and distribution estimation over arbitrary, including continuous, domains.
We derive the first robust computationally-efficient learning algorithms for piecewise-interval classification, and for piecewise-polynomial, monotone, log-concave, and Gaussian-mixture distribution estimation.
arXiv Detail & Related papers (2020-02-25T18:53:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.