Hyperspherical Variational Autoencoders Using Efficient Spherical Cauchy Distribution
- URL: http://arxiv.org/abs/2506.21278v2
- Date: Mon, 14 Jul 2025 10:06:29 GMT
- Title: Hyperspherical Variational Autoencoders Using Efficient Spherical Cauchy Distribution
- Authors: Lukas Sablica, Kurt Hornik
- Abstract summary: We propose a novel variational autoencoder (VAE) architecture that employs a spherical Cauchy (spCauchy) latent distribution.
Unlike traditional Gaussian latent spaces or the widely used von Mises-Fisher (vMF) distribution, spCauchy provides a more natural hyperspherical representation of latent variables.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel variational autoencoder (VAE) architecture that employs a spherical Cauchy (spCauchy) latent distribution. Unlike traditional Gaussian latent spaces or the widely used von Mises-Fisher (vMF) distribution, spCauchy provides a more natural hyperspherical representation of latent variables, better capturing directional data while maintaining flexibility. Its heavy-tailed nature prevents over-regularization, ensuring efficient latent space utilization while offering a more expressive representation. Additionally, spCauchy circumvents the numerical instabilities inherent to vMF, which arise from computing normalization constants involving Bessel functions. Instead, it enables a fully differentiable and efficient reparameterization trick via Möbius transformations, allowing for stable and scalable training. The KL divergence can be computed through a rapidly converging power series, eliminating concerns about underflow or overflow associated with the evaluation of ratios of hypergeometric functions. These properties make spCauchy a compelling alternative for VAEs, offering both theoretical advantages and practical efficiency in high-dimensional generative modeling.
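To make the reparameterization concrete, below is a minimal PyTorch sketch of a spherical Cauchy draw as the Möbius transform of a uniform sample on the sphere. The function name `sample_spcauchy`, the `(mu, rho)` parameterization with `phi = rho * mu`, and the exact form of the Möbius map follow the standard Kato-McCullagh construction of the spherical Cauchy; they are assumptions here, not the authors' released code.

```python
import torch

def sample_spcauchy(mu: torch.Tensor, rho: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Reparameterized sample from a spherical Cauchy distribution.

    mu:  (batch, d) unit-norm location vectors (e.g. a normalized encoder head).
    rho: (batch, 1) concentration in [0, 1) (e.g. a sigmoid of an encoder head).

    The draw is the Moebius transform of a parameter-free uniform sample on
    S^{d-1}; the transform is differentiable in (mu, rho), which gives the
    reparameterization trick referred to in the abstract.
    """
    # Uniform base sample on the sphere: normalize a standard Gaussian draw.
    u = torch.randn_like(mu)
    u = u / u.norm(dim=-1, keepdim=True).clamp_min(eps)

    phi = rho * mu  # location pulled into the open unit ball, |phi| = rho < 1
    sq = (u + phi).pow(2).sum(dim=-1, keepdim=True)         # |u + phi|^2
    one_minus = 1.0 - phi.pow(2).sum(dim=-1, keepdim=True)  # 1 - |phi|^2
    # Moebius transformation of the unit ball restricted to the sphere:
    # x = (1 - |phi|^2)(u + phi) / |u + phi|^2 + phi, with |x| = 1.
    return one_minus / sq.clamp_min(eps) * (u + phi) + phi
```

Because the base draw does not depend on the encoder outputs and the Möbius map is smooth in `(mu, rho)`, gradients flow through the sample exactly as in the Gaussian reparameterization trick, with no Bessel-function evaluations involved.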
Related papers
- Beyond Diagonal Covariance: Flexible Posterior VAEs via Free-Form Injective Flows [7.655073661577189]
Variational Autoencoders (VAEs) are powerful generative models widely used for learning interpretable latent spaces, quantifying uncertainty, and compressing data for downstream generative tasks.
We show that a regularized variant of the recently introduced Free-form Injective Flow (FIF) can be interpreted as a VAE featuring a highly flexible, implicitly defined posterior.
arXiv Detail & Related papers (2025-06-02T10:36:27Z) - Variance-Reducing Couplings for Random Features [57.73648780299374]
Random features (RFs) are a popular technique to scale up kernel methods in machine learning.
We find couplings to improve RFs defined on both Euclidean and discrete input spaces.
We reach surprising conclusions about the benefits and limitations of variance reduction as a paradigm.
arXiv Detail & Related papers (2024-05-26T12:25:09Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - Pseudo-Spherical Contrastive Divergence [119.28384561517292]
We propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum likelihood learning of energy-based models.
PS-CD avoids the intractable partition function and provides a generalized family of learning objectives.
arXiv Detail & Related papers (2021-11-01T09:17:15Z) - Disentangling Generative Factors of Physical Fields Using Variational Autoencoders [0.0]
This work explores the use of variational autoencoders (VAEs) for non-linear dimension reduction.
A disentangled decomposition is interpretable and can be transferred to a variety of tasks including generative modeling.
arXiv Detail & Related papers (2021-09-15T16:02:43Z) - The Power Spherical distribution [27.20633592977323]
The Power Spherical distribution retains important aspects of the von Mises-Fisher (vMF) distribution while addressing its main drawbacks.
We demonstrate the stability of Power Spherical distributions with a numerical experiment and further apply it to a variational auto-encoder trained on MNIST.
arXiv Detail & Related papers (2020-06-08T09:51:43Z) - Training Deep Energy-Based Models with f-Divergence Minimization [113.97274898282343]
Deep energy-based models (EBMs) are very flexible in distribution parametrization but computationally challenging.
We propose a general variational framework termed f-EBM to train EBMs using any desired f-divergence.
Experimental results demonstrate the superiority of f-EBM over contrastive divergence, as well as the benefits of training EBMs using f-divergences other than KL.
arXiv Detail & Related papers (2020-03-06T23:11:13Z) - Gaussianization Flows [113.79542218282282]
We propose a new type of normalizing flow model that enables both efficient computation of likelihoods and efficient inversion for sample generation.
Because of this guaranteed expressivity, they can capture multimodal target distributions without compromising the efficiency of sample generation.
arXiv Detail & Related papers (2020-03-04T08:15:06Z) - Stochastic Normalizing Flows [52.92110730286403]
We introduce stochastic normalizing flows for maximum likelihood estimation and variational inference (VI) using stochastic differential equations (SDEs).
Using the theory of rough paths, the underlying Brownian motion is treated as a latent variable and approximated, enabling efficient training of neural SDEs.
These SDEs can be used for constructing efficient chains to sample from the underlying distribution of a given dataset.
arXiv Detail & Related papers (2020-02-21T20:47:55Z) - Misspecification-robust likelihood-free inference in high dimensions [13.934999364767918]
We introduce an extension of the popular Bayesian optimisation based approach to approximate discrepancy functions in a probabilistic manner.
Our approach achieves computational scalability for higher dimensional parameter spaces by using separate acquisition functions and discrepancies for each parameter.
The method successfully performs computationally efficient inference in a 100-dimensional space on canonical examples and compares favourably to existing modularised ABC methods.
arXiv Detail & Related papers (2020-02-21T16:06:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.