Mixed Variational Flows for Discrete Variables
- URL: http://arxiv.org/abs/2308.15613v3
- Date: Mon, 26 Feb 2024 14:55:23 GMT
- Title: Mixed Variational Flows for Discrete Variables
- Authors: Gian Carlo Diluvi, Benjamin Bloem-Reddy, Trevor Campbell
- Abstract summary: We develop a variational flow family for discrete distributions without any continuous embedding.
First, we develop a measure-preserving and discrete (MAD) invertible map that leaves the discrete target invariant, and then create a mixed variational flow (MAD Mix) based on that map.
We also develop an extension to MAD Mix that handles joint discrete and continuous models.
- Score: 14.00384446902181
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variational flows allow practitioners to learn complex continuous
distributions, but approximating discrete distributions remains a challenge.
Current methodologies typically embed the discrete target in a continuous space
- usually via continuous relaxation or dequantization - and then apply a
continuous flow. These approaches involve a surrogate target that may not
capture the original discrete target, might have biased or unstable gradients,
and can create a difficult optimization problem. In this work, we develop a
variational flow family for discrete distributions without any continuous
embedding. First, we develop a measure-preserving and discrete (MAD) invertible
map that leaves the discrete target invariant, and then create a mixed
variational flow (MAD Mix) based on that map. Our family provides access to
i.i.d. sampling and density evaluation with virtually no tuning effort. We also
develop an extension to MAD Mix that handles joint discrete and continuous
models. Our experiments suggest that MAD Mix produces more reliable
approximations than continuous-embedding flows while being significantly faster
to train.
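To make the idea concrete, below is a minimal NumPy sketch of one way to build a measure-preserving, invertible map on a discrete state of the kind the abstract describes: each discrete value is paired with a uniform auxiliary variable, "uniformized" onto [0, 1) through the target CDF, shifted by an irrational rotation, and mapped back. This is an illustrative construction under stated assumptions, not necessarily the authors' exact MAD map; the function name `mad_step` and the shift constant are assumptions.

```python
import numpy as np

def mad_step(x, u, pmf, shift=np.sqrt(2) % 1.0):
    """One measure-preserving update of (x, u), where x ~ pmf (a length-K
    vector of strictly positive probabilities for one discrete coordinate,
    e.g. a conditional of the target) and u ~ Uniform[0, 1).

    (x, u) is mapped to a point xi in [0, 1) via the CDF of pmf, shifted by
    an irrational rotation, and mapped back.  The joint law pmf(x)*Uniform(u)
    is left invariant, and the map is invertible (apply the negated shift
    to undo it)."""
    cdf = np.concatenate(([0.0], np.cumsum(pmf)))
    xi = cdf[x] + u * pmf[x]               # uniformize the discrete state
    xi = (xi + shift) % 1.0                # ergodic shift on [0, 1)
    x_new = int(np.searchsorted(cdf, xi, side="right") - 1)
    x_new = min(x_new, len(pmf) - 1)       # guard against floating-point edge cases
    u_new = (xi - cdf[x_new]) / pmf[x_new]
    return x_new, u_new
```

A mixed flow in the spirit of the abstract would then average the pushforward of a simple reference distribution over repeated applications of such a map, which is what gives i.i.d. sampling and density evaluation with essentially no gradient-based tuning.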
Related papers
- (De)-regularized Maximum Mean Discrepancy Gradient Flow [27.70783952195201]
We introduce a (de)-regularization of the Maximum Mean Discrepancy (DrMMD) and its Wasserstein gradient flow.
DrMMD flow can simultaneously guarantee near-global convergence for a broad class of targets in both continuous and discrete time.
Our numerical scheme uses an adaptive de-regularization schedule throughout the flow to optimally trade off between discretization errors and deviations from the $\chi^2$ regime.
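For context, here is a small illustrative sketch of a plain (unregularized) MMD particle flow with a Gaussian kernel: particles are moved along the negative gradient of the squared MMD to a set of target samples. The (de)-regularization and the adaptive schedule that define DrMMD are not reproduced; the function names, kernel bandwidth, and step size are assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """k(a, b) = exp(-||a - b||^2 / (2 sigma^2)) for all pairs of rows."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def mmd_flow_step(X, Y, sigma=1.0, step=0.1):
    """One explicit-Euler step of an MMD gradient flow: move particles X
    along the negative gradient of MMD^2(X, Y) toward target samples Y."""
    n, m = len(X), len(Y)
    Kxx = gaussian_kernel(X, X, sigma)           # (n, n)
    Kxy = gaussian_kernel(X, Y, sigma)           # (n, m)
    diff_xx = X[:, None, :] - X[None, :, :]      # (n, n, d)
    diff_xy = X[:, None, :] - Y[None, :, :]      # (n, m, d)
    grad = (2 / n**2) * (Kxx[:, :, None] * (-diff_xx / sigma**2)).sum(1) \
         - (2 / (n * m)) * (Kxy[:, :, None] * (-diff_xy / sigma**2)).sum(1)
    return X - step * grad
```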
arXiv Detail & Related papers (2024-09-23T12:57:42Z) - Marginalization Consistent Mixture of Separable Flows for Probabilistic Irregular Time Series Forecasting [4.714246221974192]
We develop a novel probabilistic irregular time series forecasting model, Marginalization Consistent Mixtures of Separable Flows (moses).
moses outperforms other state-of-the-art marginalization-consistent models and performs on par with ProFITi but, unlike ProFITi, guarantees marginalization consistency.
arXiv Detail & Related papers (2024-06-11T13:28:43Z) - Adversarial Schrödinger Bridge Matching [66.39774923893103]
Iterative Markovian Fitting (IMF) procedure alternates between Markovian and reciprocal projections of continuous-time processes.
We propose a novel Discrete-time IMF (D-IMF) procedure in which learning of processes is replaced by learning just a few transition probabilities in discrete time.
We show that our D-IMF procedure can provide the same quality of unpaired domain translation as the IMF, using only several generation steps instead of hundreds.
arXiv Detail & Related papers (2024-05-23T11:29:33Z) - Unified Discrete Diffusion for Categorical Data [37.56355078250024]
We present a series of mathematical simplifications of the variational lower bound that enable more accurate and easy-to-optimize training for discrete diffusion.
We derive a simple formulation for backward denoising that enables exact and accelerated sampling, and importantly, an elegant unification of discrete-time and continuous-time discrete diffusion.
arXiv Detail & Related papers (2024-02-06T04:42:36Z) - Multi-scale Diffusion Denoised Smoothing [79.95360025953931]
Randomized smoothing has become one of the few tangible approaches that offer adversarial robustness to models at scale.
We present scalable methods to address the current trade-off between certified robustness and accuracy in denoised smoothing.
Our experiments show that the proposed multi-scale smoothing scheme combined with diffusion fine-tuning enables strong certified robustness at high noise levels.
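As a point of reference, the sketch below shows plain randomized-smoothing prediction at a single noise level (the standard Monte-Carlo majority vote); the multi-scale scheme and diffusion fine-tuning from the paper are not shown, and the classifier interface is an assumption.

```python
import numpy as np

def smoothed_predict(classifier, x, sigma, n_samples=1000, rng=None):
    """Monte-Carlo estimate of the smoothed classifier
    g(x) = argmax_c P(classifier(x + noise) = c), noise ~ N(0, sigma^2 I).
    `classifier` maps a batch of inputs to non-negative integer class labels."""
    rng = rng or np.random.default_rng()
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    labels = classifier(x[None] + noise)      # (n_samples,) predicted classes
    counts = np.bincount(labels)              # votes per class
    return int(np.argmax(counts))             # majority-vote prediction
```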
arXiv Detail & Related papers (2023-10-25T17:11:21Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
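For orientation, here is a minimal sketch of vanilla AIS along a geometric annealing path; the constant-rate schedule and the $\alpha$-divergence machinery from the paper are not included, and the callback names (`sample_p0`, `mcmc_step`) are assumptions.

```python
import numpy as np

def ais(log_p0, log_p1, sample_p0, mcmc_step, betas, n_particles, rng):
    """Vanilla Annealed Importance Sampling along the geometric path
    log pi_b = (1 - b) * log_p0 + b * log_p1, for b in `betas` (0 -> 1).
    `mcmc_step(x, log_pi, rng)` is any transition kernel leaving pi_b
    invariant (e.g. one random-walk Metropolis sweep).  Returns the final
    particles and their log importance weights."""
    x = sample_p0(n_particles, rng)          # start from the tractable base
    logw = np.zeros(n_particles)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # incremental weight: log pi_b(x) - log pi_{b_prev}(x)
        logw += (b - b_prev) * (log_p1(x) - log_p0(x))
        # move the particles under the new intermediate target
        x = mcmc_step(x, lambda z, b=b: (1 - b) * log_p0(z) + b * log_p1(z), rng)
    return x, logw
```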
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Blackout Diffusion: Generative Diffusion Models in Discrete-State Spaces [0.0]
We develop a theoretical formulation for arbitrary discrete-state Markov processes in the forward diffusion process.
As an example, we introduce "Blackout Diffusion", which learns to produce samples from an empty image instead of from noise.
arXiv Detail & Related papers (2023-05-18T16:24:12Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Discrete Denoising Flows [87.44537620217673]
We introduce a new discrete flow-based model for categorical random variables: Discrete Denoising Flows (DDFs).
In contrast with other discrete flow-based models, our model can be locally trained without introducing gradient bias.
We show that DDFs outperform Discrete Flows on modeling a toy example, binary MNIST and Cityscapes segmentation maps, measured in log-likelihood.
arXiv Detail & Related papers (2021-07-24T14:47:22Z) - Contrastive learning of strong-mixing continuous-time stochastic processes [53.82893653745542]
Contrastive learning is a family of self-supervised methods where a model is trained to solve a classification task constructed from unlabeled data.
We show that a properly constructed contrastive learning task can be used to estimate the transition kernel for small-to-mid-range intervals in the diffusion case.
arXiv Detail & Related papers (2021-03-03T23:06:47Z) - Reliable Categorical Variational Inference with Mixture of Discrete Normalizing Flows [10.406659081400354]
Variational approximations are increasingly based on gradient-based optimization of expectations estimated by sampling.
Continuous relaxations, such as the Gumbel-Softmax for categorical distributions, enable gradient-based optimization, but do not define a valid probability mass for discrete observations.
In practice, selecting the amount of relaxation is difficult and one needs to optimize an objective that does not align with the desired one.
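For reference, this is a minimal sketch of the Gumbel-Softmax (Concrete) relaxation mentioned above: Gumbel noise is added to the logits and a temperature-controlled softmax replaces the argmax, which makes the sample differentiable but places it on the simplex rather than on the discrete support.

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    """Sample a Gumbel-Softmax (Concrete) relaxation of a categorical draw:
    add Gumbel(0, 1) noise to the logits and apply a tempered softmax.
    Lower tau -> closer to one-hot, but gradients get noisier."""
    g = rng.gumbel(size=logits.shape)        # Gumbel(0, 1) noise
    y = (logits + g) / tau
    y = np.exp(y - y.max())                  # numerically stable softmax
    return y / y.sum()                       # point on the probability simplex

# e.g. gumbel_softmax(np.log([0.1, 0.2, 0.7]), 0.5, np.random.default_rng(0))
```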
arXiv Detail & Related papers (2020-06-28T10:39:39Z)