How Tempering Fixes Data Augmentation in Bayesian Neural Networks
- URL: http://arxiv.org/abs/2205.13900v1
- Date: Fri, 27 May 2022 11:06:56 GMT
- Title: How Tempering Fixes Data Augmentation in Bayesian Neural Networks
- Authors: Gregor Bachmann, Lorenzo Noci, Thomas Hofmann
- Abstract summary: We show that tempering implicitly reduces the misspecification arising from modeling augmentations as i.i.d. data.
The temperature mimics the role of the effective sample size, reflecting the gain in information provided by the augmentations.
- Score: 22.188535244056016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While Bayesian neural networks (BNNs) provide a sound and principled
alternative to standard neural networks, an artificial sharpening of the
posterior usually needs to be applied to reach comparable performance. This is
in stark contrast to theory, dictating that given an adequate prior and a
well-specified model, the untempered Bayesian posterior should achieve optimal
performance. Despite the community's extensive efforts, the observed gains in
performance remain disputed, with several plausible causes proposed for their
origin. While data augmentation has been empirically recognized as one of the
main drivers of this effect, a theoretical account of its role, on the other
hand, is largely missing. In this work we identify two interlaced factors
concurrently influencing the strength of the cold posterior effect, namely the
correlated nature of augmentations and the degree of invariance of the employed
model to such transformations. By theoretically analyzing simplified settings,
we prove that tempering implicitly reduces the misspecification arising from
modeling augmentations as i.i.d. data. The temperature mimics the role of the
effective sample size, reflecting the gain in information provided by the
augmentations. We corroborate our theoretical findings with extensive empirical
evaluations, scaling to realistic BNNs. By relying on the framework of group
convolutions, we experiment with models of varying inherent degree of
invariance, confirming its hypothesized relationship with the optimal
temperature.
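The claim that the temperature plays the role of an effective sample size can be illustrated with a toy conjugate model. The sketch below is not the paper's analysis: it assumes a Gaussian likelihood with known noise variance, a Gaussian prior on the mean, and tempering applied to the likelihood only (raised to the power 1/T). The function name `tempered_gaussian_posterior` and all numeric values are hypothetical. In this setting, a temperature T < 1 gives exactly the posterior that an untempered model would produce if it had seen 1/T times as many observations with the same sample mean.

```python
import math


def tempered_gaussian_posterior(ys, noise_var, prior_mean, prior_var, temperature=1.0):
    """Posterior over the mean of a Gaussian with known noise variance.

    Assumption: only the likelihood is tempered (raised to 1 / temperature);
    the prior is left untouched. Returns (posterior_mean, posterior_variance).
    """
    n = len(ys)
    sample_mean = sum(ys) / n
    prior_precision = 1.0 / prior_var
    # Tempering divides the log-likelihood by T, which scales its precision
    # contribution by 1 / T, as if n / T observations had been seen.
    likelihood_precision = n / (noise_var * temperature)
    posterior_precision = prior_precision + likelihood_precision
    posterior_mean = (
        prior_precision * prior_mean + likelihood_precision * sample_mean
    ) / posterior_precision
    return posterior_mean, 1.0 / posterior_precision


# Hypothetical observations, used only for illustration.
observations = [0.8, 1.3, 0.9, 1.1]

# Untempered posterior on the original data.
plain = tempered_gaussian_posterior(observations, 1.0, 0.0, 10.0, temperature=1.0)

# Cold posterior (T = 0.5) on the original data.
cold = tempered_gaussian_posterior(observations, 1.0, 0.0, 10.0, temperature=0.5)

# Untempered posterior on the data repeated twice, treated as i.i.d.
doubled = tempered_gaussian_posterior(observations * 2, 1.0, 0.0, 10.0, temperature=1.0)

print("T=1.0, n=4:", plain)
print("T=0.5, n=4:", cold)
print("T=1.0, n=8:", doubled)
```

The cold posterior on four points coincides with the untempered posterior on eight, and both are narrower than the untempered posterior on four. Read this way, choosing T between 1/K and 1 for K augmentations per sample interpolates between counting every augmentation as a fresh i.i.d. observation and counting each original sample only once.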
Related papers
- Counterfactual Generative Modeling with Variational Causal Inference [1.9287470458589586]
We present a novel variational Bayesian causal inference framework to handle counterfactual generative modeling tasks.
In experiments, we demonstrate the advantage of our framework compared to state-of-the-art models in counterfactual generative modeling.
arXiv Detail & Related papers (2024-10-16T16:44:12Z)
- C-XGBoost: A tree boosting model for causal effect estimation [8.246161706153805]
Causal effect estimation aims at estimating the Average Treatment Effect as well as the Conditional Average Treatment Effect of a treatment on an outcome from the available data.
We propose a new causal inference model, named C-XGBoost, for the prediction of potential outcomes.
arXiv Detail & Related papers (2024-03-31T17:43:37Z)
- Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps instead of instantaneous input-output relationships in previous contexts.
We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that samples' loss gradient norms are highly dependent on the timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
arXiv Detail & Related papers (2024-01-17T07:58:18Z)
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by e.g. the combination of model, parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures [93.17009514112702]
Pruning, setting a significant subset of the parameters of a neural network to zero, is one of the most popular methods of model compression.
Despite existing evidence for this phenomenon, the relationship between neural network pruning and induced bias is not well-understood.
arXiv Detail & Related papers (2023-04-25T07:42:06Z)
- How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
- Towards Principled Causal Effect Estimation by Deep Identifiable Models [21.33872753593482]
We discuss the estimation of treatment effects (TEs) under unobserved confounding.
We propose Intact-VAE, a new variant of variational autoencoder (VAE), motivated by the prognostic score that is sufficient for identifying TEs.
arXiv Detail & Related papers (2021-09-30T12:19:45Z)
- Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
- Sparse Bayesian Causal Forests for Heterogeneous Treatment Effects Estimation [0.0]
This paper develops a sparsity-inducing version of Bayesian Causal Forests.
It is designed to estimate heterogeneous treatment effects using observational data.
arXiv Detail & Related papers (2021-02-12T15:24:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.