The Power Spherical distribution
- URL: http://arxiv.org/abs/2006.04437v2
- Date: Mon, 15 Jun 2020 10:42:27 GMT
- Title: The Power Spherical distribution
- Authors: Nicola De Cao, Wilker Aziz
- Abstract summary: The Power Spherical distribution retains important aspects of the von Mises-Fisher (vMF) distribution while addressing its main drawbacks.
We demonstrate the stability of Power Spherical distributions with a numerical experiment and further apply it to a variational auto-encoder trained on MNIST.
- Score: 27.20633592977323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is a growing interest in probabilistic models defined in
hyper-spherical spaces, be it to accommodate observed data or latent structure.
The von Mises-Fisher (vMF) distribution, often regarded as the Normal
distribution on the hyper-sphere, is a standard modeling choice: it is an
exponential family and thus enjoys important statistical results, for example,
known Kullback-Leibler (KL) divergence from other vMF distributions. Sampling
from a vMF distribution, however, requires a rejection sampling procedure which
besides being slow poses difficulties in the context of stochastic
backpropagation via the reparameterization trick. Moreover, this procedure is
numerically unstable for certain vMFs, e.g., those with high concentration
and/or in high dimensions. We propose a novel distribution, the Power Spherical
distribution, which retains some of the important aspects of the vMF (e.g.,
support on the hyper-sphere, symmetry about its mean direction parameter, known
KL from other vMF distributions) while addressing its main drawbacks (i.e.,
scalability and numerical stability). We demonstrate the stability of Power
Spherical distributions with a numerical experiment and further apply it to a
variational auto-encoder trained on MNIST. Code at:
https://github.com/nicola-decao/power_spherical
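The key practical advantage claimed in the abstract is that the Power Spherical distribution can be sampled without rejection, via a Beta-distributed marginal along the mean direction plus a uniform tangential component, mapped onto the sphere with a Householder reflection. Below is a minimal NumPy sketch of that scheme; it is an illustrative re-implementation, not the authors' official code (their PyTorch package is at the repository linked above), and the function name is ours.

```python
import numpy as np

def sample_power_spherical(mu, kappa, n, rng=None):
    """Draw n samples from PowerSpherical(mu, kappa) on the unit
    hyper-sphere S^{d-1} without rejection sampling:
      1. Z ~ Beta((d-1)/2 + kappa, (d-1)/2), t = 2Z - 1
      2. v uniform on S^{d-2} (tangential direction)
      3. y = [t, sqrt(1 - t^2) * v] has mean direction e1
      4. a Householder reflection maps e1 onto mu
    """
    rng = np.random.default_rng(rng)
    mu = np.asarray(mu, dtype=float)
    mu = mu / np.linalg.norm(mu)
    d = mu.shape[0]

    # Marginal of the component along the mean direction.
    z = rng.beta((d - 1) / 2 + kappa, (d - 1) / 2, size=n)
    t = 2.0 * z - 1.0

    # Uniform direction on the (d-2)-sphere via normalized Gaussians.
    v = rng.standard_normal((n, d - 1))
    v /= np.linalg.norm(v, axis=1, keepdims=True)

    # Samples whose mean direction is e1 = (1, 0, ..., 0).
    y = np.concatenate(
        [t[:, None], np.sqrt(1.0 - t**2)[:, None] * v], axis=1
    )

    # Householder reflection H = I - 2 u u^T with u ∝ (e1 - mu)
    # satisfies H e1 = mu, rotating the samples onto mu.
    e1 = np.zeros(d)
    e1[0] = 1.0
    u = e1 - mu
    norm_u = np.linalg.norm(u)
    if norm_u < 1e-12:  # mu is already e1; no reflection needed
        return y
    u /= norm_u
    return y - 2.0 * (y @ u)[:, None] * u
```

Because every step (Beta draw aside) is a smooth transformation of the parameters, this construction is compatible with the reparameterization trick, which is what makes the distribution attractive for variational auto-encoders like the MNIST experiment mentioned above.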
Related papers
- Kibble-Zurek Mechanism and Beyond: Lessons from a Holographic Superfluid Disk [0.0]
Superfluid phase transition dynamics is studied in the framework of the Einstein-Abelian-Higgs model in an $AdS_4$ black hole.
For a slow quench, the vortex density admits a universal scaling law with the cooling rate, as predicted by the Kibble-Zurek mechanism (KZM).
For fast quenches, the density shows a universal scaling behavior as a function of the final temperature, that lies beyond the KZM prediction.
arXiv Detail & Related papers (2024-06-07T09:45:37Z) - Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instilling task-specific information into the score function to steer sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z) - Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate [49.97755400231656]
We establish convergence guarantees for substantially larger classes of distributions under DT diffusion processes.
We then specialize our results to a number of interesting classes of distributions with explicit parameter dependencies.
We propose a novel accelerated sampler and show that it improves the convergence rates of the corresponding regular sampler by orders of magnitude with respect to all system parameters.
arXiv Detail & Related papers (2024-02-21T16:11:47Z) - Quantifying quantum chaos through microcanonical distributions of entanglement [0.0]
A characteristic feature of "quantum chaotic" systems is that their eigenspectra and eigenstates display universal statistical properties described by random matrix theory (RMT).
We introduce a quantitative metric for quantum chaos which uses the Kullback-Leibler divergence to compare the microcanonical distribution of entanglement entropy (EE) of midspectrum eigenstates with a reference RMT distribution generated by pure random states (with appropriate constraints).
We study this metric in local minimally structured Floquet random circuits, as well as a canonical family of many-body Hamiltonians, the mixed-field Ising model (MFIM).
arXiv Detail & Related papers (2023-05-19T18:00:05Z) - Reliable amortized variational inference with physics-based latent distribution correction [0.4588028371034407]
A neural network is trained to approximate the posterior distribution over existing pairs of model and data.
The accuracy of this approach relies on the availability of high-fidelity training data.
We show that our correction step improves the robustness of amortized variational inference with respect to changes in number of source experiments, noise variance, and shifts in the prior distribution.
arXiv Detail & Related papers (2022-07-24T02:38:54Z) - Flexible Amortized Variational Inference in qBOLD MRI [56.4324135502282]
Oxygen extraction fraction (OEF) and deoxygenated blood volume (DBV) are more ambiguously determined from the data.
Existing inference methods tend to yield very noisy and underestimated OEF maps, while overestimating DBV.
This work describes a novel probabilistic machine learning approach that can infer plausible distributions of OEF and DBV.
arXiv Detail & Related papers (2022-03-11T10:47:16Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - AI Giving Back to Statistics? Discovery of the Coordinate System of Univariate Distributions by Beta Variational Autoencoder [0.0]
The article discusses experiences of training neural networks to classify univariate empirical distributions and to represent them on a two-dimensional latent space, forcing disentanglement based on inputs of cumulative distribution functions (CDF).
The representation on the latent two-dimensional coordinate system can be seen as an additional metadata of the real-world data that disentangles important distribution characteristics, such as shape of the CDF, classification probabilities of underlying theoretical distributions and their parameters, information entropy, and skewness.
arXiv Detail & Related papers (2020-04-06T14:11:13Z) - Training Deep Energy-Based Models with f-Divergence Minimization [113.97274898282343]
Deep energy-based models (EBMs) are very flexible in distribution parametrization but computationally challenging.
We propose a general variational framework termed f-EBM to train EBMs using any desired f-divergence.
Experimental results demonstrate the superiority of f-EBM over contrastive divergence, as well as the benefits of training EBMs using f-divergences other than KL.
arXiv Detail & Related papers (2020-03-06T23:11:13Z) - Distributionally Robust Bayesian Quadrature Optimization [60.383252534861136]
We study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples.
A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set.
We propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO) for this purpose.
arXiv Detail & Related papers (2020-01-19T12:00:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.