Arbitrary Conditional Distributions with Energy
- URL: http://arxiv.org/abs/2102.04426v1
- Date: Mon, 8 Feb 2021 18:36:26 GMT
- Title: Arbitrary Conditional Distributions with Energy
- Authors: Ryan R. Strauss, Junier B. Oliva
- Abstract summary: A more general and useful problem is arbitrary conditional density estimation.
We propose a novel method, Arbitrary Conditioning with Energy (ACE), that can simultaneously estimate the distribution $p(\mathbf{x}_u \mid \mathbf{x}_o)$.
We also simplify the learning problem by only learning one-dimensional conditionals, from which more complex distributions can be recovered during inference.
- Score: 11.081460215563633
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modeling distributions of covariates, or density estimation, is a core
challenge in unsupervised learning. However, the majority of work only
considers the joint distribution, which has limited relevance to practical
situations. A more general and useful problem is arbitrary conditional density
estimation, which aims to model any possible conditional distribution over a
set of covariates, reflecting the more realistic setting of inference based on
prior knowledge. We propose a novel method, Arbitrary Conditioning with Energy
(ACE), that can simultaneously estimate the distribution $p(\mathbf{x}_u \mid
\mathbf{x}_o)$ for all possible subsets of features $\mathbf{x}_u$ and
$\mathbf{x}_o$. ACE uses an energy function to specify densities, bypassing the
architectural restrictions imposed by alternative methods and the biases
imposed by tractable parametric distributions. We also simplify the learning
problem by only learning one-dimensional conditionals, from which more complex
distributions can be recovered during inference. Empirically, we show that ACE
achieves state-of-the-art for arbitrary conditional and marginal likelihood
estimation and for tabular data imputation.
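The last two ideas in the abstract, an energy function and one-dimensional conditionals, can be made concrete with a short worked equation. This is a sketch under standard assumptions, not necessarily the exact parameterization used by the authors: the energy $E_\theta$, the normalizer $Z_\theta$, and the ordering $u_1, \dots, u_k$ of the unobserved indices are illustrative notation introduced here.

```latex
% One-dimensional conditional for a fixed target feature x_i,
% defined through an (assumed) energy function E_\theta:
p_\theta(x_i \mid \mathbf{x}_o)
  = \frac{\exp\!\big(-E_\theta(x_i; \mathbf{x}_o)\big)}{Z_\theta(\mathbf{x}_o)},
\qquad
Z_\theta(\mathbf{x}_o) = \int \exp\!\big(-E_\theta(x; \mathbf{x}_o)\big)\, dx .

% Chain rule: any multi-dimensional conditional factors into
% one-dimensional conditionals, for any ordering u_1, ..., u_k of u:
p(\mathbf{x}_u \mid \mathbf{x}_o)
  = \prod_{j=1}^{k} p\big(x_{u_j} \mid \mathbf{x}_o, x_{u_1}, \dots, x_{u_{j-1}}\big),
\qquad k = |u| .
```

Because each factor is one-dimensional, its normalizer is a one-dimensional integral, which is much easier to approximate than a normalizer over all of $\mathbf{x}_u$. Each factor conditions on the originally observed features plus the unobserved features that precede it in the ordering.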
Related papers
- Gradual Domain Adaptation via Manifold-Constrained Distributionally Robust Optimization [0.4732176352681218]
This paper addresses the challenge of gradual domain adaptation within a class of manifold-constrained data distributions.
We propose a methodology rooted in Distributionally Robust Optimization (DRO) with an adaptive Wasserstein radius.
Our bounds rely on a newly introduced compatibility measure, which fully characterizes the error propagation dynamics along the sequence.
arXiv Detail & Related papers (2024-10-17T22:07:25Z)
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependencies for general score-mismatched diffusion samplers.
We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions.
This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z)
- Non-asymptotic bounds for forward processes in denoising diffusions: Ornstein-Uhlenbeck is hard to beat [49.1574468325115]
This paper presents explicit non-asymptotic bounds on the forward diffusion error in total variation (TV).
We parametrise multi-modal data distributions in terms of the distance $R$ to their furthest modes and consider forward diffusions with additive and multiplicative noise.
arXiv Detail & Related papers (2024-08-25T10:28:31Z)
- Score-based generative models are provably robust: an uncertainty quantification perspective [4.396860522241307]
We show that score-based generative models (SGMs) are provably robust to the multiple sources of error in practical implementation.
Our primary tool is the Wasserstein uncertainty propagation (WUP) theorem.
We show how errors due to (a) finite sample approximation, (b) early stopping, (c) score-matching objective choice, (d) score function parametrization, and (e) reference distribution choice, impact the quality of the generative model.
arXiv Detail & Related papers (2024-05-24T17:50:17Z)
- Approximating a RUM from Distributions on k-Slates [88.32814292632675]
We obtain a polynomial-time algorithm that finds the RUM that best approximates the given distribution on average.
Our theoretical result can also be made practical: we obtain a heuristic that is effective and scales to real-world datasets.
arXiv Detail & Related papers (2023-05-22T17:43:34Z)
- Simple Binary Hypothesis Testing under Local Differential Privacy and Communication Constraints [8.261182037130407]
We study simple binary hypothesis testing under both local differential privacy (LDP) and communication constraints.
We qualify our results as either minimax optimal or instance optimal.
arXiv Detail & Related papers (2023-01-09T18:36:49Z)
- On counterfactual inference with unobserved confounding [36.18241676876348]
Given an observational study with $n$ independent but heterogeneous units, our goal is to learn the counterfactual distribution for each unit.
We introduce a convex objective that pools all $n$ samples to jointly learn all $n$ parameter vectors.
We derive sufficient conditions for compactly supported distributions to satisfy the logarithmic Sobolev inequality.
arXiv Detail & Related papers (2022-11-14T04:14:37Z)
- Diffusion models as plug-and-play priors [98.16404662526101]
We consider the problem of inferring high-dimensional data $\mathbf{x}$ in a model that consists of a prior $p(\mathbf{x})$ and an auxiliary constraint $c(\mathbf{x},\mathbf{y})$.
The structure of diffusion models allows us to perform approximate inference by iterating differentiation through the fixed denoising network enriched with different amounts of noise.
arXiv Detail & Related papers (2022-06-17T21:11:36Z)
- Predicting conditional probability distributions of redshifts of Active Galactic Nuclei using Hierarchical Correlation Reconstruction [0.8702432681310399]
This article applies the Hierarchical Correlation Reconstruction (HCR) approach to inexpensively predict conditional probability distributions.
We get interpretable models, with coefficients describing the contributions of features to conditional moments.
This article extends the original approach, especially by using Canonical Correlation Analysis (CCA) for feature optimization and l1 "lasso" regularization.
arXiv Detail & Related papers (2022-06-13T14:28:53Z)
- Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process (MRP).
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z)
- On the Generative Utility of Cyclic Conditionals [103.1624347008042]
We study whether and how we can model a joint distribution $p(x,z)$ using two conditional models $p(x|z)$ and $q(z|x)$ that form a cycle.
We propose the CyGen framework for cyclic-conditional generative modeling, including methods to enforce compatibility and use the determined distribution to fit and generate data.
arXiv Detail & Related papers (2021-06-30T10:23:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.