On the Normalizing Constant of the Continuous Categorical Distribution
- URL: http://arxiv.org/abs/2204.13290v1
- Date: Thu, 28 Apr 2022 05:06:12 GMT
- Title: On the Normalizing Constant of the Continuous Categorical Distribution
- Authors: Elliott Gordon-Rodriguez, Gabriel Loaiza-Ganem, Andres Potapczynski,
John P. Cunningham
- Abstract summary: A novel family of simplex-supported distributions has been discovered: the continuous categorical.
In spite of this mathematical simplicity, our understanding of the normalizing constant remains far from complete.
We present theoretical and methodological advances that can, in turn, help to enable broader applications of the continuous categorical distribution.
- Score: 24.015934908123928
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Probability distributions supported on the simplex enjoy a wide range of
applications across statistics and machine learning. Recently, a novel family
of such distributions has been discovered: the continuous categorical. This
family enjoys remarkable mathematical simplicity; its density function
resembles that of the Dirichlet distribution, but with a normalizing constant
that can be written in closed form using elementary functions only. In spite of
this mathematical simplicity, our understanding of the normalizing constant
remains far from complete. In this work, we characterize the numerical behavior
of the normalizing constant and we present theoretical and methodological
advances that can, in turn, help to enable broader applications of the
continuous categorical distribution. Our code is available at
https://github.com/cunningham-lab/cb_and_cc/.
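To make the closed form concrete: writing the continuous categorical density as $p(x; \lambda) \propto \prod_{k=1}^K \lambda_k^{x_k}$ on the simplex, the normalizing constant satisfies, for distinct $\lambda_k$, $1/C(\lambda) = \sum_k \lambda_k / \prod_{j \neq k} \log(\lambda_k / \lambda_j)$. Below is a minimal Python sketch of a naive evaluation of this formula; it also exhibits the catastrophic cancellation near ties among the $\lambda_k$, which is the kind of numerical behavior the paper characterizes. This is an illustrative implementation, not the authors' code (see the linked repository for that).

```python
import math

def cc_normalizer(lam):
    """Closed-form normalizing constant C(lam) of the continuous
    categorical p(x; lam) = C(lam) * prod_k lam_k^{x_k} on the simplex.

    Uses the elementary closed form
        1 / C(lam) = sum_k lam_k / prod_{j != k} log(lam_k / lam_j),
    which requires all lam_k to be distinct. When two lam_k are close,
    individual terms blow up while the sum stays finite, so the naive
    evaluation suffers catastrophic cancellation.
    """
    K = len(lam)
    total = 0.0
    for k in range(K):
        denom = 1.0
        for j in range(K):
            if j != k:
                denom *= math.log(lam[k] / lam[j])
        total += lam[k] / denom
    return 1.0 / total

# K = 2 sanity check: the integral of lam1^x * lam2^(1-x) over [0, 1]
# is (lam1 - lam2) / log(lam1 / lam2).
lam1, lam2 = 0.7, 0.3
assert abs(1.0 / cc_normalizer([lam1, lam2])
           - (lam1 - lam2) / math.log(lam1 / lam2)) < 1e-9

# Instability: with nearly equal parameters the terms nearly cancel,
# and the result is dominated by rounding error.
print(cc_normalizer([0.5, 0.25, 0.25 + 1e-9]))
```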
Related papers
- Non-asymptotic approximations for Pearson's chi-square statistic and its
application to confidence intervals for strictly convex functions of the
probability weights of discrete distributions [0.0]
We develop a non-asymptotic local normal approximation for multinomial probabilities.
We apply our results to find confidence intervals for the negative entropy of discrete distributions.
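As a baseline for the kind of interval the paper refines: below is the classical asymptotic (delta-method) plug-in interval for the negative entropy $\sum_i p_i \log p_i$ of a multinomial sample. The paper's contribution is a non-asymptotic replacement for the normal-approximation step, so treat this sketch only as the textbook construction it improves upon.

```python
import math

def negative_entropy_ci(counts, z=1.96):
    """Asymptotic (delta-method) confidence interval for the negative
    entropy sum_i p_i log p_i of a discrete distribution, from
    multinomial counts. Classical baseline only; the paper replaces
    the asymptotic normal step with a non-asymptotic bound.
    """
    n = sum(counts)
    p_hat = [c / n for c in counts if c > 0]  # drop empty cells (log 0)
    neg_ent = sum(p * math.log(p) for p in p_hat)
    second = sum(p * math.log(p) ** 2 for p in p_hat)
    var = (second - neg_ent ** 2) / n  # delta-method variance of the plug-in
    half = z * math.sqrt(max(var, 0.0))
    return neg_ent - half, neg_ent + half

print(negative_entropy_ci([30, 50, 20]))
```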
arXiv Detail & Related papers (2023-09-05T01:18:48Z)
- A Heavy-Tailed Algebra for Probabilistic Programming [53.32246823168763]
We propose a systematic approach for analyzing the tails of random variables.
We show how this approach can be used during the static analysis (before drawing samples) pass of a probabilistic programming language compiler.
Our empirical results confirm that inference algorithms that leverage our heavy-tailed algebra attain superior performance across a number of density modeling and variational inference tasks.
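As a toy illustration of what an "algebra of tails" can look like: for independent regularly varying random variables, the tail index of a sum is the minimum of the two indices, so a static analysis can propagate indices through a program without drawing samples. The representation and rule below are assumptions for illustration; the paper's algebra is considerably richer.

```python
from dataclasses import dataclass

@dataclass
class Tail:
    """Tail-index abstraction: P(|X| > t) ~ t^{-alpha}, with
    alpha = inf standing in for lighter-than-polynomial tails.
    A toy stand-in for the paper's static tail analysis.
    """
    alpha: float  # power-law tail index

    def __add__(self, other):
        # Sum of independent regularly varying variables: the heavier
        # (smaller-index) tail dominates.
        return Tail(min(self.alpha, other.alpha))

cauchy = Tail(1.0)        # Cauchy: tail index 1
student_t3 = Tail(3.0)    # Student-t with 3 dof: tail index 3
gaussian = Tail(float("inf"))

print((cauchy + gaussian).alpha)        # 1.0: the Cauchy tail dominates
print((student_t3 + student_t3).alpha)  # 3.0
```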
arXiv Detail & Related papers (2023-06-15T16:37:36Z)
- $\omega$PAP Spaces: Reasoning Denotationally About Higher-Order, Recursive Probabilistic and Differentiable Programs [64.25762042361839]
$\omega$PAP spaces are spaces for reasoning denotationally about expressive differentiable and probabilistic programming languages.
Our semantics is general enough to assign meanings to most practical probabilistic and differentiable programs.
We establish the almost-everywhere differentiability of probabilistic programs' trace density functions.
arXiv Detail & Related papers (2023-02-21T12:50:05Z)
- Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models [38.26333732364642]
We present $\textit{ev-softmax}$, a sparse normalization function that preserves the multimodality of probability distributions.
We evaluate our method on a variety of generative models, including variational autoencoders and auto-regressive architectures.
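A sketch of one plausible sparse normalization in this spirit, assuming a below-mean truncation rule; this threshold is an assumption for illustration, and the paper defines its own truncation rule and gradient treatment for $\textit{ev-softmax}$.

```python
import numpy as np

def sparse_softmax(z):
    """Generic sparse normalization in the spirit of ev-softmax:
    zero out logits below their mean, then softmax over the survivors.
    NOTE: the below-mean threshold is an assumption for illustration;
    the paper's ev-softmax uses its own truncation rule. Unlike
    temperature-scaled softmax, thresholding can keep several
    well-separated modes while assigning exact zeros elsewhere.
    """
    z = np.asarray(z, dtype=float)
    keep = z >= z.mean()                        # support selection
    w = np.where(keep, np.exp(z - z.max()), 0.0)
    return w / w.sum()

# Mass is split across the two high-scoring modes; the rest is exactly zero.
print(sparse_softmax([4.0, 3.9, -2.0, -3.0]))
```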
arXiv Detail & Related papers (2021-10-27T05:32:25Z)
- Integrable Nonparametric Flows [5.9774834479750805]
We introduce a method for reconstructing an infinitesimal normalizing flow given only an infinitesimal change to a probability distribution.
This reverses the conventional task of normalizing flows.
We discuss potential applications to problems in quantum Monte Carlo and machine learning.
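To make the inverse problem concrete: a flow $\dot{x} = f(x)$ transports a density $p_t$ through the continuity equation, so reconstructing a flow from a prescribed infinitesimal change $\delta p$ (with $\int \delta p = 0$) amounts to solving a linear PDE for the velocity field. This is our reading of the abstract, not the paper's exact formulation:

```latex
% Continuity equation for the flow \dot{x} = f(x):
\partial_t p_t = -\nabla \cdot (p_t f)
% Hence an infinitesimal change \delta p is realized by any f with
\nabla \cdot (p f) = -\delta p
% e.g., under the potential ansatz f = \nabla \varphi, solve the elliptic equation
\nabla \cdot (p \nabla \varphi) = -\delta p .
```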
arXiv Detail & Related papers (2020-12-03T16:19:52Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- Contextuality scenarios arising from networks of stochastic processes [68.8204255655161]
An empirical model is said to be contextual if its distributions cannot be obtained by marginalizing a joint distribution over X.
We present a different and classical source of contextual empirical models: the interaction among many processes.
The statistical behavior of the network in the long run makes the empirical model generically contextual and even strongly contextual.
arXiv Detail & Related papers (2020-06-22T16:57:52Z)
- Stein Variational Inference for Discrete Distributions [70.19352762933259]
We propose a simple yet general framework that transforms discrete distributions to equivalent piecewise continuous distributions.
Our method outperforms traditional algorithms such as Gibbs sampling and discontinuous Hamiltonian Monte Carlo.
We demonstrate that our method provides a promising tool for learning ensembles of binarized neural networks (BNNs).
In addition, such a transform can be employed straightforwardly in the gradient-free kernelized Stein discrepancy to perform goodness-of-fit (GOF) tests on discrete distributions.
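A simplified version of the discrete-to-continuous transform, for a distribution over $\{0, \dots, K-1\}$: spread each atom uniformly over a unit interval, yielding a piecewise-constant density whose samples recover the discrete ones by flooring. The paper's construction is more general (arbitrary supports and base distributions); this sketch only conveys the idea.

```python
import math

def make_piecewise_density(p):
    """Turn a discrete distribution p over {0, ..., K-1} into an
    'equivalent' piecewise-constant density on [0, K): rho(x) = p[floor(x)].
    Flooring a sample of rho recovers a sample of p, so continuous
    samplers (e.g., SVGD) can target rho instead of p. Simplified
    illustration of the transform, not the paper's full construction.
    """
    K = len(p)
    def rho(x):
        return p[int(math.floor(x))] if 0.0 <= x < K else 0.0
    return rho

rho = make_piecewise_density([0.2, 0.5, 0.3])
print(rho(1.7))  # 0.5 -> any x in [1, 2) carries the mass of outcome 1
```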
arXiv Detail & Related papers (2020-03-01T22:45:41Z)
- Generalized Sliced Distances for Probability Distributions [47.543990188697734]
We introduce a broad family of probability metrics, coined Generalized Sliced Probability Metrics (GSPMs).
GSPMs are rooted in the generalized Radon transform and come with a unique geometric interpretation.
We consider GSPM-based gradient flows for generative modeling applications and show that under mild assumptions, the gradient flow converges to the global optimum.
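For intuition, the simplest member of the sliced family uses linear projections, i.e., the Monte Carlo sliced 2-Wasserstein distance sketched below; GSPMs generalize the projection step by replacing lines with the defining functions of a generalized Radon transform. This is a baseline instance, not the paper's full construction.

```python
import numpy as np

def sliced_w2(X, Y, n_proj=100, rng=None):
    """Monte Carlo sliced 2-Wasserstein distance between two samples
    X, Y of equal shape (n, d). Linear slices are the simplest instance
    of the generalized slicing behind GSPMs.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)          # random direction on the sphere
        # 1-D Wasserstein-2 between projections = quantile (sorted) matching
        px, py = np.sort(X @ theta), np.sort(Y @ theta)
        total += np.mean((px - py) ** 2)
    return np.sqrt(total / n_proj)

X = np.random.default_rng(0).normal(size=(256, 3))
Y = np.random.default_rng(1).normal(loc=1.0, size=(256, 3))
print(sliced_w2(X, Y))
```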
arXiv Detail & Related papers (2020-02-28T04:18:00Z)
- The continuous categorical: a novel simplex-valued exponential family [23.983555024375306]
We show that standard distributional choices for simplex-valued data suffer from a number of limitations, including bias and numerical issues.
We resolve these limitations by introducing a novel exponential family of distributions for modeling simplex-valued data.
Unlike the Dirichlet and other typical choices, the continuous categorical results in a well-behaved probabilistic loss function.
arXiv Detail & Related papers (2020-02-20T04:28:02Z)
- Generalized mean shift with triangular kernel profile [5.381004207943597]
The Mean Shift algorithm is a popular way to find the modes of probability density functions that take a specific kernel-based shape.
We show that a novel Mean Shift variant adapted to triangular kernel profiles can be derived and proved to converge after a finite number of iterations.
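A minimal 1-D sketch of the classical Mean Shift update with triangular kernel weights $w_i = \max(0, 1 - |x - x_i|/h)$; the paper's generalized variant and its finite-step convergence proof go beyond this illustration.

```python
import numpy as np

def mean_shift_triangular(data, x0, h=1.0, tol=1e-8, max_iter=500):
    """Mean Shift mode seeking with a triangular kernel
    K(u) = max(0, 1 - |u| / h). Each step moves the query point to the
    kernel-weighted mean of the data. A minimal 1-D sketch of the
    classical update, not the paper's generalized variant.
    """
    x = float(x0)
    for _ in range(max_iter):
        w = np.maximum(0.0, 1.0 - np.abs(data - x) / h)  # triangular weights
        if w.sum() == 0.0:
            break  # no data within the kernel's support
        x_new = float(np.dot(w, data) / w.sum())
        if abs(x_new - x) < tol:
            break
        x = x_new
    return x

data = np.concatenate([np.random.default_rng(0).normal(-2, 0.3, 200),
                       np.random.default_rng(1).normal(2, 0.3, 200)])
print(mean_shift_triangular(data, x0=1.0, h=1.0))  # converges near the mode at +2
```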
arXiv Detail & Related papers (2020-01-07T16:46:32Z)