On the Generative Utility of Cyclic Conditionals
- URL: http://arxiv.org/abs/2106.15962v1
- Date: Wed, 30 Jun 2021 10:23:45 GMT
- Title: On the Generative Utility of Cyclic Conditionals
- Authors: Chang Liu, Haoyue Tang, Tao Qin, Jintao Wang, Tie-Yan Liu
- Abstract summary: We study whether and how we can model a joint distribution $p(x,z)$ using two conditional models $p(x|z)$ and $q(z|x)$ that form a cycle.
We propose the CyGen framework for cyclic-conditional generative modeling, including methods to enforce compatibility and use the determined distribution to fit and generate data.
- Score: 103.1624347008042
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study whether and how we can model a joint distribution $p(x,z)$ using two
conditional models $p(x|z)$ and $q(z|x)$ that form a cycle. This is motivated
by the observation that deep generative models, in addition to a likelihood
model $p(x|z)$, often also use an inference model $q(z|x)$ for data
representation, but they rely on a usually uninformative prior distribution
$p(z)$ to define a joint distribution, which may cause problems such as posterior
collapse and manifold mismatch. To explore the possibility of modeling a joint
distribution using only $p(x|z)$ and $q(z|x)$, we study their compatibility and
determinacy, corresponding to the existence and uniqueness of a joint
distribution whose conditional distributions coincide with them. We develop a
general theory for novel and operable equivalence criteria for compatibility,
and sufficient conditions for determinacy. Based on the theory, we propose the
CyGen framework for cyclic-conditional generative modeling, including methods
to enforce compatibility and use the determined distribution to fit and
generate data. With the prior constraint removed, CyGen better fits data and
captures more representative features, supported by experiments showing better
generation and downstream classification performance.
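The notions of compatibility and determinacy in the abstract can be illustrated in a much simpler setting than the one the paper treats. The sketch below is not the CyGen method; it assumes finite $x$ and $z$ and strictly positive conditional tables, where a joint exists exactly when the ratio $p(x|z)/q(z|x)$ factorizes as $\pi(x)/\pi(z)$, and that joint is then unique. The function name `joint_from_conditionals` and the array layout are hypothetical choices made for this illustration.

```python
import numpy as np

def joint_from_conditionals(p_x_given_z, q_z_given_x, tol=1e-8):
    """Return the joint pi[x, z] if the two conditionals are compatible, else None.

    Assumes finite x and z and strictly positive tables (an assumption of this sketch).
    p_x_given_z[x, z] = p(x|z), so every column sums to 1.
    q_z_given_x[x, z] = q(z|x), so every row sums to 1.
    """
    p = np.asarray(p_x_given_z, dtype=float)
    q = np.asarray(q_z_given_x, dtype=float)

    # If a joint exists, p(x|z) / q(z|x) = pi(x) / pi(z), so any column
    # of the ratio is proportional to the marginal pi(x).
    ratio = p / q
    marg_x = ratio[:, 0] / ratio[:, 0].sum()

    # Candidate joint: it has q(z|x) as its z-given-x conditional by construction.
    joint = marg_x[:, None] * q

    # Compatibility holds iff its x-given-z conditional reproduces p(x|z).
    marg_z = joint.sum(axis=0)
    if np.allclose(joint / marg_z, p, atol=tol):
        return joint
    return None

# Conditionals derived from one joint are compatible and recover it.
joint_true = np.array([[0.1, 0.2], [0.3, 0.4]])
p = joint_true / joint_true.sum(axis=0)
q = joint_true / joint_true.sum(axis=1, keepdims=True)
recovered = joint_from_conditionals(p, q)

# p says x is independent of z, while q depends on x: no joint exists.
incompatible = joint_from_conditionals(
    np.full((2, 2), 0.5), np.array([[0.9, 0.1], [0.1, 0.9]])
)
```

In this finite, strictly positive case the candidate joint is the only one that can have $q(z|x)$ as a conditional and the given column as its $x$-marginal shape, which is why a single reconstruction followed by one check suffices.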
Related papers
- A Sharp Convergence Theory for The Probability Flow ODEs of Diffusion Models [45.60426164657739]
We develop non-asymptotic convergence theory for a diffusion-based sampler.
We prove that $d/\varepsilon$ iterations are sufficient to approximate the target distribution to within $\varepsilon$ total-variation distance.
Our results also characterize how $\ell_2$ score estimation errors affect the quality of the data generation processes.
arXiv Detail & Related papers (2024-08-05T09:02:24Z)
- Diffusion models as plug-and-play priors [98.16404662526101]
We consider the problem of inferring high-dimensional data $\mathbf{x}$ in a model that consists of a prior $p(\mathbf{x})$ and an auxiliary constraint $c(\mathbf{x},\mathbf{y})$.
The structure of diffusion models allows us to perform approximate inference by iterating differentiation through the fixed denoising network enriched with different amounts of noise.
arXiv Detail & Related papers (2022-06-17T21:11:36Z)
- Convergence for score-based generative modeling with polynomial complexity [9.953088581242845]
We prove the first convergence guarantees for the core mechanic behind score-based generative modeling.
Compared to previous works, we do not incur error that grows exponentially in time or that suffers from a curse of dimensionality.
We show that a predictor-corrector gives better convergence than using either portion alone.
arXiv Detail & Related papers (2022-06-13T14:57:35Z)
- $p$-Generalized Probit Regression and Scalable Maximum Likelihood Estimation via Sketching and Coresets [74.37849422071206]
We study the $p$-generalized probit regression model, which is a generalized linear model for binary responses.
We show how the maximum likelihood estimator for $p$-generalized probit regression can be approximated efficiently up to a factor of $(1+\varepsilon)$ on large data.
arXiv Detail & Related papers (2022-03-25T10:54:41Z)
- Any Variational Autoencoder Can Do Arbitrary Conditioning [7.96091289659041]
Posterior Matching enables any Variational Autoencoder to perform arbitrary conditioning without modification to the VAE itself.
We find that Posterior Matching achieves performance that is comparable or superior to current state-of-the-art methods for a variety of tasks.
arXiv Detail & Related papers (2022-01-28T20:48:44Z)
- Universal and data-adaptive algorithms for model selection in linear contextual bandits [52.47796554359261]
We consider the simplest non-trivial instance of model-selection: distinguishing a simple multi-armed bandit problem from a linear contextual bandit problem.
We introduce new algorithms that explore in a data-adaptive manner and provide guarantees of the form $\mathcal{O}(d^{\alpha} T^{1-\alpha})$.
Our approach extends to model selection among nested linear contextual bandits under some additional assumptions.
arXiv Detail & Related papers (2021-11-08T18:05:35Z)
- PSD Representations for Effective Probability Models [117.35298398434628]
We show that a recently proposed class of positive semi-definite (PSD) models for non-negative functions is particularly suited to this end.
We characterize both approximation and generalization capabilities of PSD models, showing that they enjoy strong theoretical guarantees.
Our results open the way to applications of PSD models to density estimation, decision theory and inference.
arXiv Detail & Related papers (2021-06-30T15:13:39Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), in which we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
- Probability Link Models with Symmetric Information Divergence [1.5749416770494706]
Two general classes of link models are proposed.
The first model links two survival functions and is applicable to models such as the proportional odds and change point.
The second model links two cumulative probability distribution functions.
arXiv Detail & Related papers (2020-08-10T19:49:51Z)