Evidential Softmax for Sparse Multimodal Distributions in Deep
Generative Models
- URL: http://arxiv.org/abs/2110.14182v1
- Date: Wed, 27 Oct 2021 05:32:25 GMT
- Title: Evidential Softmax for Sparse Multimodal Distributions in Deep
Generative Models
- Authors: Phil Chen, Masha Itkina, Ransalu Senanayake, Mykel J. Kochenderfer
- Abstract summary: We present $\textit{ev-softmax}$, a sparse normalization function that preserves the multimodality of probability distributions.
We evaluate our method on a variety of generative models, including variational autoencoders and auto-regressive architectures.
- Score: 38.26333732364642
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many applications of generative models rely on the marginalization of their
high-dimensional output probability distributions. Normalization functions that
yield sparse probability distributions can make exact marginalization more
computationally tractable. However, sparse normalization functions usually
require alternative loss functions for training since the log-likelihood is
undefined for sparse probability distributions. Furthermore, many sparse
normalization functions often collapse the multimodality of distributions. In
this work, we present $\textit{ev-softmax}$, a sparse normalization function
that preserves the multimodality of probability distributions. We derive its
properties, including its gradient in closed-form, and introduce a continuous
family of approximations to $\textit{ev-softmax}$ that have full support and
can be trained with probabilistic loss functions such as negative
log-likelihood and Kullback-Leibler divergence. We evaluate our method on a
variety of generative models, including variational autoencoders and
auto-regressive architectures. Our method outperforms existing dense and sparse
normalization techniques in distributional accuracy. We demonstrate that
$\textit{ev-softmax}$ successfully reduces the dimensionality of probability
distributions while maintaining multimodality.
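The abstract characterizes ev-softmax only at a high level, so the sketch below (PyTorch assumed) illustrates the general recipe rather than the paper's exact construction: a hard-thresholded softmax that assigns exact zeros to low-scoring entries while keeping several modes, plus a smooth full-support surrogate whose gate sharpness `alpha` keeps every probability strictly positive so it can be trained with negative log-likelihood or KL divergence. The mean-of-logits threshold and the sigmoid gate are illustrative assumptions, not the paper's definition.
```python
import torch

def sparse_softmax(logits: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Hard-thresholded softmax: entries whose logit falls below the mean logit
    get exactly zero probability; the rest are renormalized. Illustrative only --
    the paper's ev-softmax may use a different thresholding rule."""
    mean = logits.mean(dim=dim, keepdim=True)
    keep = (logits >= mean).to(logits.dtype)
    shifted = logits - logits.max(dim=dim, keepdim=True).values  # numerical stability
    weights = torch.exp(shifted) * keep
    return weights / weights.sum(dim=dim, keepdim=True)

def smooth_sparse_softmax(logits: torch.Tensor, alpha: float = 10.0, dim: int = -1) -> torch.Tensor:
    """Full-support surrogate: a sigmoid gate replaces the hard threshold, so the
    log-likelihood stays finite; as alpha grows, the output approaches the
    hard-thresholded version."""
    mean = logits.mean(dim=dim, keepdim=True)
    gate = torch.sigmoid(alpha * (logits - mean))
    shifted = logits - logits.max(dim=dim, keepdim=True).values
    weights = torch.exp(shifted) * gate
    return weights / weights.sum(dim=dim, keepdim=True)
```
The hard version yields exact zeros, so marginalization only needs to visit the surviving entries; the smooth version preserves full support for likelihood-based training and can be annealed toward the sparse one.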
Related papers
- MultiMax: Sparse and Multi-Modal Attention Learning [60.49318008131978]
SoftMax is a ubiquitous ingredient of modern machine learning algorithms.
We show that sparsity can be achieved by a family of SoftMax variants, but they often require an alternative loss function and do not preserve multi-modality.
We propose MultiMax, which adaptively modulates the output distribution according to input entry range.
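For concreteness, here is a minimal PyTorch sketch of sparsemax, the best-known member of this family of sparse SoftMax variants (this is sparsemax itself, not MultiMax). Because it can place exactly zero probability on the observed class, the log-likelihood becomes undefined and an alternative loss is required, which is the issue the summary alludes to.
```python
import torch

def sparsemax(z: torch.Tensor) -> torch.Tensor:
    """Sparsemax (Martins & Astudillo, 2016) for a 1-D logit vector: the Euclidean
    projection of z onto the probability simplex, which can assign exact zeros."""
    z_sorted, _ = torch.sort(z, descending=True)
    k = torch.arange(1, z.numel() + 1, dtype=z.dtype, device=z.device)
    cumsum = torch.cumsum(z_sorted, dim=0)
    # largest k with 1 + k * z_(k) > sum of the k largest entries
    support = 1 + k * z_sorted > cumsum
    k_z = int(support.nonzero().max()) + 1
    tau = (cumsum[k_z - 1] - 1) / k_z
    return torch.clamp(z - tau, min=0.0)

p = sparsemax(torch.tensor([2.0, 1.0, 0.1]))  # tensor([1., 0., 0.]): exact zeros
```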
arXiv Detail & Related papers (2024-06-03T10:51:43Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- A Pseudo-Semantic Loss for Autoregressive Models with Logical Constraints [87.08677547257733]
Neuro-symbolic AI bridges the gap between purely symbolic and neural approaches to learning.
We show how to maximize the likelihood of a symbolic constraint w.r.t. the neural network's output distribution.
We also evaluate our approach on Sudoku and shortest-path prediction cast as autoregressive generation.
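As a toy illustration of constraint likelihood (not the paper's pseudo-semantic loss, which is tailored to autoregressive distributions), the probability of a simple constraint such as "at least one output variable is true" has a closed form under a factorized Bernoulli output distribution and can be maximized by gradient descent.
```python
import torch

def constraint_log_prob(probs: torch.Tensor) -> torch.Tensor:
    """Log-probability that at least one of several independent Bernoulli output
    variables is true (a hypothetical toy constraint): P = 1 - prod_i (1 - p_i)."""
    log_p_all_false = torch.log1p(-probs).sum()
    return torch.log1p(-torch.exp(log_p_all_false))

logits = torch.randn(5, requires_grad=True)
loss = -constraint_log_prob(torch.sigmoid(logits))  # penalty added to the task loss
loss.backward()  # gradients push the outputs toward satisfying the constraint
```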
arXiv Detail & Related papers (2023-12-06T20:58:07Z)
- Normalizing flow sampling with Langevin dynamics in the latent space [12.91637880428221]
Normalizing flows (NFs) use a continuous generator to map a simple latent (e.g., Gaussian) distribution towards an empirical target distribution associated with a training data set.
Since standard NFs implement differentiable maps, they may suffer from pathological behaviors when targeting complex distributions.
This paper proposes a new Markov chain Monte Carlo algorithm to sample from the target distribution in the latent domain before transporting it back to the target domain.
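A minimal sketch of the general idea, assuming hypothetical `flow` and `latent_log_prob` callables: run unadjusted Langevin dynamics in the latent domain, then push the samples through the flow's generator. The paper's actual MCMC scheme (e.g., any Metropolis correction) is not reproduced here.
```python
import torch

def latent_langevin_sample(flow, latent_log_prob, z0, step_size=1e-2, n_steps=200):
    """Unadjusted Langevin dynamics in the latent space of a normalizing flow.
    latent_log_prob(z) should return the (possibly unnormalized) log-density of
    the target pulled back to the latent domain; flow(z) maps latents to data."""
    z = z0.detach().clone()
    for _ in range(n_steps):
        z.requires_grad_(True)
        grad = torch.autograd.grad(latent_log_prob(z).sum(), z)[0]
        # z_{t+1} = z_t + eps * grad log p(z_t) + sqrt(2 eps) * noise
        z = (z + step_size * grad + (2 * step_size) ** 0.5 * torch.randn_like(z)).detach()
    return flow(z)
```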
arXiv Detail & Related papers (2023-05-20T09:31:35Z)
- Ensemble Multi-Quantiles: Adaptively Flexible Distribution Prediction for Uncertainty Quantification [4.728311759896569]
We propose a novel, succinct, and effective approach for distribution prediction to quantify uncertainty in machine learning.
It incorporates adaptively flexible distribution prediction of $\mathbb{P}(\mathbf{y}|\mathbf{X}=x)$ in regression tasks.
On extensive regression tasks from UCI datasets, we show that EMQ achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-11-26T11:45:32Z)
- Matching Normalizing Flows and Probability Paths on Manifolds [57.95251557443005]
Continuous Normalizing Flows (CNFs) are generative models that transform a prior distribution to a model distribution by solving an ordinary differential equation (ODE).
We propose to train CNFs by minimizing probability path divergence (PPD), a novel family of divergences between the probability density path generated by the CNF and a target probability density path.
We show that CNFs learned by minimizing PPD achieve state-of-the-art results in likelihoods and sample quality on existing low-dimensional manifold benchmarks.
arXiv Detail & Related papers (2022-07-11T08:50:19Z)
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
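One plausible reading of this summary, sketched below with assumed tensor shapes (B, C, H, W): treat each sample's channel-wise feature mean and standard deviation as random variables whose spread is estimated across the batch, and re-normalize the features with perturbed statistics. This is an illustrative sketch, not necessarily the paper's exact formulation.
```python
import torch

def perturb_feature_statistics(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Perturb per-sample channel statistics with Gaussian noise whose scale is
    the batch-level variation of those statistics, then re-normalize. x: (B, C, H, W)."""
    mu = x.mean(dim=(2, 3), keepdim=True)              # (B, C, 1, 1)
    sigma = x.std(dim=(2, 3), keepdim=True) + eps
    mu_scale = mu.std(dim=0, keepdim=True)             # uncertainty across the batch
    sigma_scale = sigma.std(dim=0, keepdim=True)
    new_mu = mu + torch.randn_like(mu) * mu_scale
    new_sigma = sigma + torch.randn_like(sigma) * sigma_scale
    return (x - mu) / sigma * new_sigma + new_mu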
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
- Probabilistic Kolmogorov-Arnold Network [1.4732811715354455]
The present paper proposes a method for estimating probability distributions of the outputs in the case of aleatoric uncertainty.
The suggested approach covers input-dependent probability distributions of the outputs, as well as the variation of the distribution type with the inputs.
Although the method is applicable to any regression model, the present paper combines it with KANs, since the specific structure of KANs leads to the computationally efficient construction of models.
arXiv Detail & Related papers (2021-04-04T23:49:15Z)
- A method to integrate and classify normal distributions [0.0]
We present results and open-source software that compute the probability of a normal distribution over any domain, in any number of dimensions, with any parameters.
We demonstrate these tools with vision research applications of detecting occluding objects in natural scenes, and detecting camouflage.
arXiv Detail & Related papers (2020-12-23T05:45:41Z)
- Exploring Maximum Entropy Distributions with Evolutionary Algorithms [0.0]
We show how to evolve numerically the maximum entropy probability distributions for a given set of constraints.
An evolutionary algorithm can obtain approximations to some well-known analytical results.
We explain why many of the distributions are symmetrical and continuous, but some are not.
arXiv Detail & Related papers (2020-02-05T19:52:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.