VIB is Half Bayes
- URL: http://arxiv.org/abs/2011.08711v1
- Date: Tue, 17 Nov 2020 15:36:35 GMT
- Title: VIB is Half Bayes
- Authors: Alexander A Alemi and Warren R Morningstar and Ben Poole and Ian
Fischer and Joshua V Dillon
- Abstract summary: We show that the Variational Information Bottleneck can be viewed as a compromise between fully empirical and fully Bayesian objectives.
We argue that this approach provides some of the benefits of Bayes while requiring only some of the work.
- Score: 80.3767111908235
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In discriminative settings such as regression and classification there are
two random variables at play, the inputs X and the targets Y. Here, we
demonstrate that the Variational Information Bottleneck can be viewed as a
compromise between fully empirical and fully Bayesian objectives, attempting to
minimize the risks due to finite sampling of Y only. We argue that this
approach provides some of the benefits of Bayes while requiring only some of
the work.
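For orientation, the standard variational IB bound from the earlier VIB paper (Alemi et al., 2017) is sketched below in its usual notation. This is background, not a restatement of this paper's derivation; the symbols (encoder $e(z|x)$, variational classifier $q(y|z)$, variational marginal $m(z)$) are carried over from that prior work.

```latex
% Standard variational IB bound (Alemi et al., 2017 notation);
% background sketch, not this paper's derivation.
\mathcal{L}_{\mathrm{VIB}}
  = \mathbb{E}_{p(x,y)}\,\mathbb{E}_{e(z \mid x)}\bigl[-\log q(y \mid z)\bigr]
  + \beta\,\mathbb{E}_{p(x)}\bigl[\mathrm{KL}\bigl(e(z \mid x)\,\|\,m(z)\bigr)\bigr]
```

At $\beta = 0$ this reduces to plain empirical risk minimization, which makes the KL regularizer the natural place to locate the partial, Y-only Bayesian protection the abstract describes.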
Related papers
- Minimax Optimal Fair Classification with Bounded Demographic Disparity [28.936244976415484]
This paper explores the statistical foundations of fair binary classification with two protected groups.
We show that using a finite sample incurs additional costs due to the need to estimate group-specific acceptance thresholds.
We propose FairBayes-DDP+, a group-wise thresholding method with an offset that we show attains the minimax lower bound.
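As a loose illustration of the group-wise thresholding idea, here is a minimal sketch; the function name, offsets, and toy data are hypothetical stand-ins, not the paper's FairBayes-DDP+ estimator, whose offsets are estimated to attain the minimax bound.

```python
import numpy as np

def group_threshold_classify(scores, groups, base_threshold, offsets):
    """Accept example i iff its score clears the threshold for its group.

    scores:  model scores p(Y=1 | X) in [0, 1]
    groups:  group label per example (e.g., 0 or 1)
    base_threshold: common acceptance threshold
    offsets: per-group additive offsets (hypothetical; in the paper these
             are estimated to bound demographic disparity)
    """
    thresholds = base_threshold + np.asarray([offsets[g] for g in groups])
    return (np.asarray(scores) >= thresholds).astype(int)

# Toy usage: group 1 gets a slightly lower threshold via a negative offset.
scores = np.array([0.55, 0.62, 0.48, 0.71])
groups = np.array([0, 1, 1, 0])
preds = group_threshold_classify(scores, groups,
                                 base_threshold=0.6,
                                 offsets={0: 0.0, 1: -0.05})
print(preds)  # [0 1 0 1]
```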
arXiv Detail & Related papers (2024-03-27T02:59:04Z)
- Adaptive importance sampling for heavy-tailed distributions via $\alpha$-divergence minimization [2.879807093604632]
We propose an AIS algorithm that approximates the target by Student-t proposal distributions.
We adapt location and scale parameters by matching the escort moments of the target and the proposal.
These updates minimize the $\alpha$-divergence between the target and the proposal, thereby connecting with variational inference.
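A schematic of the adaptation loop under stated simplifications: the updates below match plain self-normalized weighted moments rather than the escort moments the paper uses, so treat this as a stand-in for the method rather than an implementation of it.

```python
import numpy as np
from scipy import stats

def ais_student_t(log_target, dim=1, df=3.0, iters=20, n=2000, seed=None):
    """Schematic AIS: adapt a Student-t proposal toward a heavy-tailed target.

    Uses self-normalized importance weights and weighted moments to update
    the proposal's location and scale. (The paper matches *escort* moments
    to minimize an alpha-divergence; plain moments are a simplification.)
    """
    rng = np.random.default_rng(seed)
    loc, scale = np.zeros(dim), np.ones(dim)
    for _ in range(iters):
        x = loc + scale * rng.standard_t(df, size=(n, dim))
        log_q = stats.t.logpdf(x, df, loc=loc, scale=scale).sum(axis=1)
        log_w = log_target(x) - log_q           # unnormalized log weights
        w = np.exp(log_w - log_w.max())
        w /= w.sum()                            # self-normalize
        loc = (w[:, None] * x).sum(axis=0)      # weighted location update
        var = (w[:, None] * (x - loc) ** 2).sum(axis=0)
        scale = np.sqrt(var)                    # weighted scale update
    return loc, scale

# Toy usage: a shifted heavy-tailed target (Student-t with 2 d.o.f.).
log_target = lambda x: stats.t.logpdf(x - 2.0, df=2.0).sum(axis=1)
print(ais_student_t(log_target))
```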
arXiv Detail & Related papers (2023-10-25T14:07:08Z)
- Primal Dual Continual Learning: Balancing Stability and Plasticity through Adaptive Memory Allocation [86.8475564814154]
We show that it is both possible and beneficial to tackle the constrained optimization problem directly.
We focus on memory-based methods, where a small subset of samples from previous tasks can be stored in a replay buffer.
We show that dual variables indicate the sensitivity of the optimal value of the continual learning problem with respect to constraint perturbations.
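A minimal sketch of one primal-dual step for a replay-buffer constraint, assuming a Lagrangian relaxation with gradient descent on the model and ascent on the multiplier; the function and its hyperparameters are hypothetical simplifications, not the paper's algorithm.

```python
import torch

def primal_dual_step(model, opt, task_batch, buffer_batch, lam, eps, dual_lr=0.05):
    """One schematic primal-dual update for replay-based continual learning.

    Minimizes current-task loss subject to (replay loss <= eps) via the
    Lagrangian: descent on model weights, ascent on the dual variable lam.
    A large converged lam signals a tight, sensitive constraint.
    """
    loss_fn = torch.nn.functional.cross_entropy
    x, y = task_batch
    xb, yb = buffer_batch
    task_loss = loss_fn(model(x), y)
    buffer_loss = loss_fn(model(xb), yb)
    lagrangian = task_loss + lam * (buffer_loss - eps)
    opt.zero_grad()
    lagrangian.backward()
    opt.step()
    # Dual ascent: raise lam when the constraint is violated, clip at 0.
    lam = max(0.0, lam + dual_lr * (buffer_loss.item() - eps))
    return lam
```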
arXiv Detail & Related papers (2023-09-29T21:23:27Z)
- Optimal Representations for Covariate Shift [18.136705088756138]
We introduce a simple variational objective whose optima are exactly the set of all representations on which risk minimizers are guaranteed to be robust.
Our objectives achieve state-of-the-art results on DomainBed, and give insights into the robustness of recent methods, such as CLIP.
arXiv Detail & Related papers (2021-12-31T21:02:24Z)
- Diverse, Global and Amortised Counterfactual Explanations for Uncertainty Estimates [31.241489953967694]
We study the diversity of sets of counterfactual latent uncertainty explanations (CLUEs) and find that many CLUEs are redundant.
We then propose GLobal AMortised CLUE (GLAM-CLUE), a distinct and novel method which learns amortised mappings on specific groups of uncertain inputs.
Our experiments show that $\delta$-CLUE, $\nabla$-CLUE, and GLAM-CLUE all address shortcomings of CLUE and provide beneficial explanations of uncertainty estimates to practitioners.
arXiv Detail & Related papers (2021-12-05T18:27:21Z)
- A Bayesian Framework for Information-Theoretic Probing [51.98576673620385]
We argue that probing should be seen as approximating a mutual information.
This view leads to the rather unintuitive conclusion that representations encode exactly the same information about a target task as the original sentences.
This paper proposes a new framework to measure what we term Bayesian mutual information.
arXiv Detail & Related papers (2021-09-08T18:08:36Z)
- KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem that often arises in real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input whose distribution is the same under the source and target domains.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples.
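A minimal sketch of how such a minibatch KL estimate can look, assuming each domain's representation marginal is approximated by an equal-weight mixture of per-example Gaussians produced by the probabilistic encoder; this is a simplified stand-in, not the paper's exact estimator.

```python
import math
import torch

def minibatch_kl(mu_s, logvar_s, mu_t, logvar_t):
    """Schematic minibatch estimate of KL(p_t(z) || p_s(z)).

    Each domain's marginal over representations z is approximated by an
    equal-weight mixture of the per-example Gaussians N(mu_i, diag(var_i))
    in the minibatch; z is drawn from the target mixture with the
    reparameterization trick.
    """
    std_t = torch.exp(0.5 * logvar_t)
    z = mu_t + std_t * torch.randn_like(std_t)   # one sample per target item

    def log_mixture(z, mu, logvar):
        # Log-density of an equal-weight Gaussian mixture, evaluated at each z.
        var = torch.exp(logvar)                          # (m, d)
        diff = z.unsqueeze(1) - mu.unsqueeze(0)          # (n, m, d)
        log_comp = -0.5 * ((diff ** 2) / var + logvar
                           + math.log(2 * math.pi)).sum(-1)
        return (torch.logsumexp(log_comp, dim=1)
                - math.log(float(mu.shape[0])))

    return (log_mixture(z, mu_t, logvar_t) - log_mixture(z, mu_s, logvar_s)).mean()
```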
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
- A Variational Inequality Approach to Bayesian Regression Games [90.79402153164587]
We prove the existence and uniqueness of the equilibrium for a class of convex cost functions and generalize the result to smooth cost functions.
We provide two simple algorithms for solving these games with guaranteed strong convergence.
arXiv Detail & Related papers (2021-03-24T22:33:11Z)
- A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution.
We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
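One plausible shape for such a joint objective is sketched below: per-sample importance weights are learned alongside the model, with a kernel-mean-matching penalty standing in for the paper's theoretically derived bound. The function names and the choice of penalty are assumptions for illustration, not the paper's formulation.

```python
import torch

def one_step_covshift(model, w_logits, x_src, y_src, x_tgt, lam=1.0):
    """Hypothetical joint objective: weighted source risk plus a penalty
    tying the learned weights to the source/target mismatch (kernel mean
    matching here; the paper optimizes a different, derived bound).

    w_logits: learnable per-source-sample logits; softmax gives normalized
              importance weights, optimized jointly with the model.
    """
    w = torch.softmax(w_logits, dim=0)  # weights sum to 1
    risk = (w * torch.nn.functional.cross_entropy(
        model(x_src), y_src, reduction="none")).sum()

    def rbf(a, b, gamma=1.0):
        return torch.exp(-gamma * torch.cdist(a, b) ** 2)

    # Kernel-mean-matching penalty: weighted source embedding vs target's.
    k_ss = w @ rbf(x_src, x_src) @ w
    k_st = w @ rbf(x_src, x_tgt).mean(dim=1)
    penalty = k_ss - 2 * k_st
    return risk + lam * penalty
```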
arXiv Detail & Related papers (2020-07-08T11:35:47Z)