Fat-Tailed Variational Inference with Anisotropic Tail Adaptive Flows
- URL: http://arxiv.org/abs/2205.07918v1
- Date: Mon, 16 May 2022 18:03:41 GMT
- Title: Fat-Tailed Variational Inference with Anisotropic Tail Adaptive Flows
- Authors: Feynman Liang, Liam Hodgkinson, Michael W. Mahoney
- Abstract summary: Fat-tailed densities commonly arise as posterior and marginal distributions in robust models and scale mixtures.
We first improve previous theory on tails of Lipschitz flows by quantifying how the tails of the base distribution affect the rate of tail decay.
We then develop an alternative theory for tail parameters which is sensitive to tail-anisotropy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While fat-tailed densities commonly arise as posterior and marginal
distributions in robust models and scale mixtures, they present challenges when
Gaussian-based variational inference fails to capture tail decay accurately. We
first improve previous theory on tails of Lipschitz flows by quantifying how
the tails affect the rate of tail decay and by expanding the theory to
non-Lipschitz polynomial flows. Then, we develop an alternative theory for
multivariate tail parameters which is sensitive to tail-anisotropy. In doing
so, we unveil a fundamental problem which plagues many existing flow-based
methods: they can only model tail-isotropic distributions (i.e., distributions
having the same tail parameter in every direction). To mitigate this and enable
modeling of tail-anisotropic targets, we propose anisotropic tail-adaptive
flows (ATAF). Experimental results on both synthetic and real-world targets
confirm that ATAF is competitive with prior work while also exhibiting
appropriate tail-anisotropy.
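The construction behind ATAF can be made concrete with a small sketch. Below is a minimal illustration, assuming PyTorch: a base distribution built from a product of Student-t marginals with one learnable degrees-of-freedom (tail) parameter per dimension, so the modeled tail index can differ across directions. The paper composes a base of this kind with standard flow layers; the class and parameter names here are ours, not the paper's.

```python
# Minimal sketch, assuming PyTorch. Each dimension gets its own learnable
# Student-t degrees-of-freedom parameter, so tails can differ by direction
# (tail-anisotropy). Names are illustrative, not from the paper's code.
import torch
from torch.distributions import Independent, StudentT

class AnisotropicStudentTBase(torch.nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # df_i = exp(log_df_i) keeps every tail parameter positive.
        self.log_df = torch.nn.Parameter(torch.zeros(dim))

    def distribution(self) -> Independent:
        df = self.log_df.exp()
        return Independent(
            StudentT(df, loc=torch.zeros_like(df), scale=torch.ones_like(df)), 1
        )

    def rsample(self, n: int) -> torch.Tensor:
        # Reparameterized sampling, so gradients reach the tail
        # parameters when maximizing an ELBO.
        return self.distribution().rsample((n,))

    def log_prob(self, x: torch.Tensor) -> torch.Tensor:
        return self.distribution().log_prob(x)

base = AnisotropicStudentTBase(dim=2)
z = base.rsample(1024)         # these samples would feed the flow layers
print(base.log_prob(z).shape)  # torch.Size([1024])
```

Since Lipschitz flow layers cannot change the tail index of what they transform (per the theory above), placing the anisotropy in the base is what lets the pushforward match tail-anisotropic targets.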
Related papers
- Flexible Tails for Normalizing Flows
A popular solution to the problem of capturing heavy tails in normalizing flows is to use a heavy-tailed base distribution.
We argue this can lead to poor performance due to the difficulty of optimising neural networks, such as normalizing flows, under heavy-tailed input.
We propose an alternative: use a Gaussian base distribution and a final transformation layer which can produce heavy tails (a hedged sketch of one such layer follows).
arXiv Detail & Related papers (2024-06-22T13:44:01Z)
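To make the "final heavy-tailing layer" idea concrete, here is one generic construction, not necessarily the paper's exact transformation: a monotone remap of Gaussian tails onto Student-t tails via the probability integral transform (NumPy/SciPy assumed; in practice df would be a learned parameter).

```python
# Hedged sketch: a generic tail-producing output layer, not necessarily
# the transformation proposed in the paper. It remaps standard-normal
# tails onto Student-t tails via the probability integral transform.
import numpy as np
from scipy.stats import norm, t

def heavy_tail_layer(z: np.ndarray, df: float) -> np.ndarray:
    # Monotone and differentiable, so its log-Jacobian is tractable for
    # use as the last layer of a flow. Caveat: norm.cdf saturates to 1.0
    # for |z| greater than about 8, which needs care in a real model.
    return t.ppf(norm.cdf(z), df)

rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)   # light-tailed Gaussian inputs
x = heavy_tail_layer(z, df=2.0)    # df=2 gives infinite-variance tails
print(f"max |z| = {np.abs(z).max():.1f}, max |x| = {np.abs(x).max():.1f}")
```

Because the heavy-tailing happens only at the output, the network itself is optimized on light-tailed intermediate values, which is the point the authors make.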
- Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
- Geometric Prior Guided Feature Representation Learning for Long-Tailed Classification
We propose to leverage the geometric information of the feature distribution of the well-represented head class to guide the model to learn the underlying distribution of the tail class.
The aim is for the perturbed features to cover the underlying distribution of the tail class as much as possible, thus improving the model's generalization performance in the test domain.
arXiv Detail & Related papers (2024-01-21T09:16:29Z)
- A Heavy-Tailed Algebra for Probabilistic Programming
We propose a systematic approach for analyzing the tails of random variables.
We show how this approach can be used during the static analysis (before drawing samples) pass of a probabilistic programming language compiler.
Our empirical results confirm that inference algorithms that leverage our heavy-tailed algebra attain superior performance across a number of density modeling and variational inference tasks (a toy illustration of such tail bookkeeping follows).
arXiv Detail & Related papers (2023-06-15T16:37:36Z)
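As a toy flavor of what static tail analysis can propagate, here is a small bookkeeping class, our illustration and much coarser than the paper's algebra, using the standard closure rules for regularly varying (power-law) tails; the paper's actual rules are finer-grained and treat light tails more carefully.

```python
# Toy sketch (ours, not the paper's algebra): propagate a power-law tail
# index alpha, where P(|X| > x) ~ x^(-alpha) and alpha = inf marks
# lighter-than-any-power tails (e.g. Gaussian). The rules below are the
# standard ones for independent regularly varying variables and ignore
# slowly varying corrections and moment-condition fine print.
from dataclasses import dataclass
import math

@dataclass(frozen=True)
class Tail:
    alpha: float

    def __add__(self, other: "Tail") -> "Tail":
        # Sum of independent variables: the heavier tail dominates.
        return Tail(min(self.alpha, other.alpha))

    def __mul__(self, other: "Tail") -> "Tail":
        # Product of independent variables: again the heavier tail wins
        # (Breiman-style, under moment conditions on the lighter factor).
        return Tail(min(self.alpha, other.alpha))

    def __pow__(self, p: float) -> "Tail":
        # P(|X|^p > x) = P(|X| > x^(1/p)) ~ x^(-alpha/p).
        return Tail(self.alpha / p)

gaussian, cauchy = Tail(math.inf), Tail(1.0)
print((gaussian + cauchy).alpha)  # 1.0 -- the sum inherits Cauchy tails
print((cauchy ** 2).alpha)        # 0.5 -- squaring fattens the tail
```

A compiler pass can run rules like these over a probabilistic program's graph before any sampling, then choose, for example, a base distribution or step size matched to the inferred tail index.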
- Reliable amortized variational inference with physics-based latent distribution correction
A neural network is trained on existing pairs of model and data to approximate the posterior distribution.
The accuracy of this approach relies on the availability of high-fidelity training data.
We show that our correction step improves the robustness of amortized variational inference with respect to changes in the number of source experiments, noise variance, and shifts in the prior distribution.
arXiv Detail & Related papers (2022-07-24T02:38:54Z)
- Marginal Tail-Adaptive Normalizing Flows
This paper focuses on improving the ability of normalizing flows to correctly capture the tail behavior.
We prove that the marginal tailedness of an autoregressive flow can be controlled via the tailedness of the marginals of its base distribution.
An empirical analysis shows that the proposed method improves accuracy -- especially on the tails of the distribution -- and is able to generate heavy-tailed data (a quick numerical illustration follows).
arXiv Detail & Related papers (2022-06-21T12:34:36Z)
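A quick numerical illustration of this control result (our own toy example, not the paper's code): push a base with one Gaussian marginal and one Student-t marginal through a fixed triangular, autoregressive-style (hence Lipschitz) linear map, and compare the tail heaviness of the two output dimensions.

```python
# Toy check: marginal tails of an autoregressive (triangular, Lipschitz)
# map track the tails of the base marginals. Our example, not the paper's.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
base = np.column_stack([
    rng.standard_normal(n),           # light-tailed marginal
    rng.standard_t(df=1.5, size=n),   # heavy-tailed marginal
])
L = np.array([[1.0, 0.0],
              [0.7, 1.0]])            # x2 depends on x1: autoregressive
x = base @ L.T

# Crude tail proxy: how far the extreme quantile sits above the bulk.
for j in range(2):
    q90, q9999 = np.quantile(np.abs(x[:, j]), [0.90, 0.9999])
    print(f"dim {j}: q99.99 / q90 = {q9999 / q90:.1f}")
# dim 0 stays Gaussian-like; dim 1, fed by the t marginal, is far heavier.
```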
- Algorithmic Stability of Heavy-Tailed Stochastic Gradient Descent on Least Squares
Recent studies have shown that heavy tails can emerge in optimization and that the heaviness of the tails has links to the generalization error.
We establish novel links between the tail behavior and generalization properties of stochastic gradient descent (SGD) through the lens of algorithmic stability.
We support our theory with synthetic and real neural network experiments.
arXiv Detail & Related papers (2022-06-02T19:59:48Z)
- Cross-Domain Empirical Risk Minimization for Unbiased Long-tailed Classification
We address the issue of unbiasedness that existing long-tailed classification methods overlook.
We propose Cross-Domain Empirical Risk Minimization (xERM) for training an unbiased model.
arXiv Detail & Related papers (2021-12-29T03:18:47Z)
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization
We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization.
We also propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction.
arXiv Detail & Related papers (2020-02-20T15:13:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.