Learning with Stochastic Orders
- URL: http://arxiv.org/abs/2205.13684v1
- Date: Fri, 27 May 2022 00:08:03 GMT
- Title: Learning with Stochastic Orders
- Authors: Carles Domingo-Enrich, Yair Schiff, Youssef Mroueh
- Abstract summary: Learning high-dimensional distributions is often done with explicit likelihood modeling or implicit modeling via integral probability metrics (IPMs).
We introduce the Choquet-Toland distance between probability measures, which can be used as a drop-in replacement for IPMs.
We also introduce the Variational Dominance Criterion (VDC) to learn probability measures with dominance constraints.
- Score: 25.795107089736295
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning high-dimensional distributions is often done with explicit
likelihood modeling or implicit modeling via minimizing integral probability
metrics (IPMs). In this paper, we expand this learning paradigm to stochastic
orders, namely, the convex or Choquet order between probability measures.
Towards this end, we introduce the Choquet-Toland distance between probability
measures, that can be used as a drop-in replacement for IPMs. We also introduce
the Variational Dominance Criterion (VDC) to learn probability measures with
dominance constraints, that encode the desired stochastic order between the
learned measure and a known baseline. We analyze both quantities and show that
they suffer from the curse of dimensionality and propose surrogates via input
convex maxout networks (ICMNs), that enjoy parametric rates. Finally, we
provide a min-max framework for learning with stochastic orders and validate it
experimentally on synthetic and high-dimensional image generation, with
promising results. The code is available at
https://github.com/yair-schiff/stochastic-orders-ICMN
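The surrogates mentioned in the abstract are built from input convex maxout networks (ICMNs): critics constrained to be convex in their input, so that optimizing over them probes the convex (Choquet) order while enjoying parametric rates. Below is a minimal, hypothetical sketch of such a network; the class name, layer sizes, and the softplus reparameterization for non-negative weights are illustrative assumptions, not the authors' released implementation (see the repository linked above for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class InputConvexMaxoutNet(nn.Module):
    """Minimal sketch of an input convex maxout network (ICMN).

    Convexity in the input x is preserved because (i) each layer is a maxout
    over functions that are affine in x plus a non-negative combination of the
    previous (convex) activations, and (ii) maxima and non-negative sums of
    convex functions are convex. Names and sizes are illustrative only.
    """

    def __init__(self, dim_in: int, hidden: int = 64, pieces: int = 4, depth: int = 2):
        super().__init__()
        self.pieces = pieces
        # Direct affine maps from the input (unconstrained weights are fine).
        self.from_x = nn.ModuleList(
            nn.Linear(dim_in, hidden * pieces) for _ in range(depth)
        )
        # Maps from the previous hidden layer; kept non-negative via softplus.
        self.from_z = nn.ParameterList(
            nn.Parameter(0.1 * torch.randn(hidden * pieces, hidden))
            for _ in range(depth - 1)
        )
        # Non-negative output weights keep the final scalar convex in x.
        self.out = nn.Parameter(0.1 * torch.randn(hidden))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = None
        for k, lin in enumerate(self.from_x):
            pre = lin(x)
            if z is not None:
                pre = pre + z @ F.softplus(self.from_z[k - 1]).t()
            # Maxout over `pieces` affine pieces: convex, piecewise-linear in x.
            z = pre.view(x.shape[0], -1, self.pieces).amax(dim=-1)
        return z @ F.softplus(self.out)  # shape: (batch,)
```

In the min-max framework described in the abstract, a critic of roughly this form would be maximized over while the generative model is trained to minimize the resulting discrepancy.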
Related papers
- A Method of Moments Embedding Constraint and its Application to Semi-Supervised Learning [2.8266810371534152]
Discriminative deep learning models with a linear+softmax final layer have a problem.
The latent space only models the conditional probabilities $p(Y|X)$, not the full joint distribution $p(Y,X)$.
This exacerbates model over-confidence, which impacts many problems such as hallucinations, confounding biases, and dependence on large datasets.
arXiv Detail & Related papers (2024-04-27T18:41:32Z)
- Diffusion models for probabilistic programming [56.47577824219207]
Diffusion Model Variational Inference (DMVI) is a novel method for automated approximate inference in probabilistic programming languages (PPLs).
DMVI is easy to implement, allows hassle-free inference in PPLs without the drawbacks of, e.g., variational inference using normalizing flows, and does not impose any constraints on the underlying neural network model.
arXiv Detail & Related papers (2023-11-01T12:17:05Z)
- Learning Distributions via Monte-Carlo Marginalization [9.131712404284876]
We propose a novel method to learn intractable distributions from their samples.
Monte-Carlo Marginalization (MCMarg) is proposed to address this intractability.
The proposed approach is a powerful tool to learn complex distributions and the entire process is differentiable.
arXiv Detail & Related papers (2023-08-11T19:08:06Z)
- ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models [69.50316788263433]
We propose ProbVLM, a probabilistic adapter that estimates probability distributions for the embeddings of pre-trained vision-language models.
We quantify the calibration of embedding uncertainties in retrieval tasks and show that ProbVLM outperforms other methods.
We present a novel technique for visualizing the embedding distributions using a large-scale pre-trained latent diffusion model.
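As a rough illustration of what a probabilistic adapter on top of frozen embeddings can look like, here is a minimal sketch; the diagonal-Gaussian head, class name, and layer sizes are assumptions made for exposition and are not ProbVLM's released code.

```python
import torch
import torch.nn as nn


class ProbabilisticAdapter(nn.Module):
    """Illustrative sketch only: a small head on top of a frozen encoder's
    embedding that predicts per-dimension distribution parameters (here a
    diagonal Gaussian mean and variance), leaving the backbone untouched."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.mean = nn.Linear(hidden, dim)
        self.log_var = nn.Linear(hidden, dim)

    def forward(self, frozen_embedding: torch.Tensor):
        h = self.body(frozen_embedding)
        return self.mean(h), self.log_var(h).exp()  # (mean, variance)
```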
arXiv Detail & Related papers (2023-07-01T18:16:06Z)
- Online Probabilistic Model Identification using Adaptive Recursive MCMC [8.465242072268019]
We suggest the Adaptive Recursive Markov Chain Monte Carlo (ARMCMC) method.
It eliminates the shortcomings of conventional online techniques while computing the entire probability density function of model parameters.
We demonstrate our approach using parameter estimation in a soft bending actuator and the Hunt-Crossley dynamic model.
arXiv Detail & Related papers (2022-10-23T02:06:48Z)
- A Non-isotropic Probabilistic Take on Proxy-based Deep Metric Learning [49.999268109518255]
Proxy-based Deep Metric Learning (DML) learns by embedding images close to their class representatives (proxies).
In addition, proxy-based DML struggles to learn class-internal structures.
We introduce non-isotropic probabilistic proxy-based DML to address both issues.
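For context, the deterministic proxy-based objective that such probabilistic variants build on looks roughly like the following Proxy-NCA-style sketch; this illustrates the standard baseline setup only, not the paper's non-isotropic method.

```python
import torch
import torch.nn.functional as F


def proxy_nca_loss(embeddings: torch.Tensor,
                   labels: torch.Tensor,
                   proxies: torch.Tensor) -> torch.Tensor:
    """Proxy-NCA-style sketch of the deterministic baseline: each image
    embedding is pulled toward its class proxy and pushed away from the
    proxies of all other classes."""
    emb = F.normalize(embeddings, dim=1)
    prox = F.normalize(proxies, dim=1)
    logits = emb @ prox.t()                  # cosine similarity to every proxy
    return F.cross_entropy(logits, labels)   # attract own proxy, repel the rest
```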
arXiv Detail & Related papers (2022-07-08T09:34:57Z)
- SIXO: Smoothing Inference with Twisted Objectives [8.049531918823758]
We introduce SIXO, a method that learns targets that approximate the smoothing distributions.
We then use SMC with these learned targets to define a variational objective for model and proposal learning.
arXiv Detail & Related papers (2022-06-13T07:46:35Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with distributionally robust optimization (DRO) using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Sparse Communication via Mixed Distributions [29.170302047339174]
We build theoretical foundations for "mixed random variables".
Our framework suggests two strategies for representing and sampling mixed random variables.
We experiment with both approaches on an emergent communication benchmark.
arXiv Detail & Related papers (2021-08-05T14:49:03Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
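To make the AR-CSM parameterization above concrete, here is a minimal sketch in which each dimension gets a small network for the derivative of its univariate log-conditional; the architecture and names are illustrative assumptions rather than the paper's implementation.

```python
import torch
import torch.nn as nn


class ARConditionalScores(nn.Module):
    """Illustrative sketch: one small MLP per dimension i maps (x_1, ..., x_i)
    to d/dx_i log p(x_i | x_{<i}), i.e. the univariate conditional score."""

    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.nets = nn.ModuleList(
            nn.Sequential(nn.Linear(i + 1, hidden), nn.Tanh(), nn.Linear(hidden, 1))
            for i in range(dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Returns per-dimension conditional scores with shape (batch, dim).
        scores = [net(x[:, : i + 1]) for i, net in enumerate(self.nets)]
        return torch.cat(scores, dim=1)
```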
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.