A Unified Framework for Multi-distribution Density Ratio Estimation
- URL: http://arxiv.org/abs/2112.03440v1
- Date: Tue, 7 Dec 2021 01:23:20 GMT
- Title: A Unified Framework for Multi-distribution Density Ratio Estimation
- Authors: Lantao Yu, Yujia Jin, Stefano Ermon
- Abstract summary: Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
- Score: 101.67420298343512
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Binary density ratio estimation (DRE), the problem of estimating the ratio
$p_1/p_2$ given their empirical samples, provides the foundation for many
state-of-the-art machine learning algorithms such as contrastive representation
learning and covariate shift adaptation. In this work, we consider a
generalized setting where given samples from multiple distributions $p_1,
\ldots, p_k$ (for $k > 2$), we aim to efficiently estimate the density ratios
between all pairs of distributions. Such a generalization leads to important
new applications such as estimating statistical discrepancy among multiple
random variables like multi-distribution $f$-divergence, and bias correction
via multiple importance sampling. We then develop a general framework from the
perspective of Bregman divergence minimization, where each strictly convex
multivariate function induces a proper loss for multi-distribution DRE.
Moreover, we rederive the theoretical connection between multi-distribution
density ratio estimation and class probability estimation, justifying the use
of any strictly proper scoring rule composite with a link function for
multi-distribution DRE. We show that our framework leads to methods that
strictly generalize their counterparts in binary DRE, as well as new methods
that show comparable or superior performance on various downstream tasks.
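The class-probability-estimation route admits a compact illustration: train a k-class classifier on the pooled samples and read pairwise ratios off its posteriors. Below is a minimal sketch (not the authors' code) using multinomial logistic regression; it assumes equal sample sizes per distribution, so the class prior is uniform and each ratio reduces to a ratio of posterior probabilities.

```python
# Minimal sketch: all pairwise density ratios among k distributions via
# multi-class probability estimation. With equal samples per class,
# p_i(x)/p_j(x) = P(y=i|x)/P(y=j|x). Names here are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_multi_dre(samples):
    """samples: list of k arrays, each of shape (n, d), drawn from p_1..p_k."""
    X = np.vstack(samples)
    y = np.concatenate([np.full(len(s), i) for i, s in enumerate(samples)])
    return LogisticRegression(max_iter=1000).fit(X, y)

def pairwise_ratio(clf, x, i, j):
    """Estimate p_i(x) / p_j(x) from the classifier's posteriors."""
    probs = clf.predict_proba(np.atleast_2d(x))
    return probs[:, i] / probs[:, j]

rng = np.random.default_rng(0)
samples = [rng.normal(loc=m, size=(500, 2)) for m in (-1.0, 0.0, 1.0)]
clf = fit_multi_dre(samples)
print(pairwise_ratio(clf, np.zeros(2), 0, 2))  # estimate of p_1/p_3 at the origin
```

A linear model restricts the ratio family; the framework above justifies swapping in any model trained with a strictly proper scoring rule composed with a link function.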
Related papers
- Improving Distribution Alignment with Diversity-based Sampling [0.0]
Domain shifts are ubiquitous in machine learning and can substantially degrade a model's performance when it is deployed on real-world data.
This paper proposes to improve these estimates by inducing diversity in each sampled minibatch.
It simultaneously balances the data and reduces the variance of the gradients, thereby enhancing the model's generalisation ability.
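As a rough stand-in for diversity-inducing selection (the paper's sampler may differ), greedy farthest-point selection picks each next example to be maximally distant from those already chosen:

```python
# Greedy farthest-point selection for a diverse mini-batch (illustrative;
# assumes batch_size <= len(X)).
import numpy as np

def diverse_batch(X, batch_size, rng):
    chosen = [int(rng.integers(len(X)))]
    dist = np.linalg.norm(X - X[chosen[0]], axis=1)
    while len(chosen) < batch_size:
        nxt = int(np.argmax(dist))               # farthest from the chosen set
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(X - X[nxt], axis=1))
    return np.array(chosen)
```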
arXiv Detail & Related papers (2024-10-05T17:26:03Z)
- Multiple importance sampling for stochastic gradient estimation [33.42221341526944]
We introduce a theoretical and practical framework for efficient importance sampling of mini-batch samples for gradient estimation.
To handle noisy gradients, our framework dynamically evolves the importance distribution during training by utilizing a self-adaptive metric.
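For orientation, a generic importance-sampled mini-batch selection looks like the sketch below (the paper's self-adaptive distribution update is not shown); drawing index i with probability q_i and weighting by 1/(N q_i) keeps the mean-gradient estimate unbiased.

```python
# Generic importance-sampled mini-batch selection (illustrative only).
import numpy as np

def sample_minibatch(scores, batch_size, rng):
    """scores: nonnegative per-example importance scores of length N."""
    q = scores / scores.sum()
    idx = rng.choice(len(q), size=batch_size, p=q)
    weights = 1.0 / (len(q) * q[idx])   # unbiasing weights for the mean gradient
    return idx, weights
```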
arXiv Detail & Related papers (2024-07-22T10:28:56Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
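For reference, the textbook single-source inverse-propensity-score estimator of the average treatment effect, the building block the collaborative variant extends, can be written as:

```python
# Textbook single-source IPW estimate of the average treatment effect.
import numpy as np

def ipw_ate(y, t, e):
    """y: outcomes; t: 0/1 treatment indicators; e: estimated P(t=1 | x)."""
    return np.mean(t * y / e - (1 - t) * y / (1 - e))
```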
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Multiple Hypothesis Dropout: Estimating the Parameters of Multi-Modal Output Distributions [22.431244647796582]
This paper presents a Mixture of Multiple-Output functions (MoM) approach using a novel variant of dropout, Multiple Hypothesis Dropout.
Experiments on supervised learning problems illustrate that our approach outperforms existing solutions for reconstructing multimodal output distributions.
Additional studies on unsupervised learning problems show that estimating the parameters of latent posterior distributions within a discrete autoencoder significantly improves codebook efficiency, sample quality, precision and recall.
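The paper's Multiple Hypothesis Dropout is not reproduced here; as a generic point of comparison, a winner-takes-all loss over several output heads is a common way to fit multi-modal targets:

```python
# Winner-takes-all loss over multiple hypotheses (generic, for comparison).
import torch

def wta_loss(preds, target):
    """preds: (batch, n_hypotheses, dim); target: (batch, dim)."""
    errs = ((preds - target.unsqueeze(1)) ** 2).sum(dim=-1)  # (batch, n_hyp)
    return errs.min(dim=1).values.mean()  # gradient flows to the best head only
```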
arXiv Detail & Related papers (2023-12-18T22:20:11Z)
- Estimating the Density Ratio between Distributions with High Discrepancy using Multinomial Logistic Regression [21.758330613138778]
We show that state-of-the-art density ratio estimators perform poorly when the two distributions are well separated.
We present an alternative method that leverages multi-class classification for density ratio estimation.
arXiv Detail & Related papers (2023-05-01T15:10:56Z)
- Causal Balancing for Domain Generalization [95.97046583437145]
We propose a balanced mini-batch sampling strategy to reduce the domain-specific spurious correlations in observed training distributions.
We provide an identifiability guarantee of the source of spuriousness and show that our proposed approach provably samples from a balanced, spurious-free distribution.
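A generic group-balanced sampler gives the flavor (the paper balances over an inferred spurious factor rather than observed group labels):

```python
# Draw an equal number of examples from each group (illustrative).
import numpy as np

def balanced_batch(groups, per_group, rng):
    """groups: array of group ids, one per example."""
    picks = []
    for g in np.unique(groups):
        pool = np.flatnonzero(groups == g)
        picks.append(rng.choice(pool, size=per_group, replace=len(pool) < per_group))
    return np.concatenate(picks)
```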
arXiv Detail & Related papers (2022-06-10T17:59:11Z)
- Density Ratio Estimation via Infinitesimal Classification [85.08255198145304]
We propose DRE-infty, a divide-and-conquer approach that reduces density ratio estimation (DRE) to a series of easier subproblems.
Inspired by Monte Carlo methods, we smoothly interpolate between the two distributions via an infinite continuum of intermediate bridge distributions.
We show that our approach performs well on downstream tasks such as mutual information estimation and energy-based modeling on complex, high-dimensional datasets.
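The telescoping idea can be checked analytically: with unnormalized geometric bridges p_t ∝ p_0^{1-t} p_1^t between two Gaussians, the adjacent log ratios sum exactly to the direct log ratio (a toy check of my construction, not the paper's learned estimator):

```python
# Toy check of telescoping through geometric bridge distributions.
import numpy as np
from scipy.stats import norm

p0, p1 = norm(loc=-2.0), norm(loc=2.0)
ts = np.linspace(0.0, 1.0, 11)   # endpoints plus 9 intermediate bridges
x = 0.5

def log_bridge(t):
    return (1 - t) * p0.logpdf(x) + t * p1.logpdf(x)  # unnormalized bridge

chained = sum(log_bridge(ts[i]) - log_bridge(ts[i + 1]) for i in range(len(ts) - 1))
print(chained, p0.logpdf(x) - p1.logpdf(x))  # equal by telescoping
```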
arXiv Detail & Related papers (2021-11-22T06:26:29Z)
- Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in a principled way for the adaptive integration of different modalities and produces a trustworthy regression result.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
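The Normal-Inverse-Gamma building block has a closed-form uncertainty decomposition (the standard evidential-regression formulas; MoNIG's mixture and fusion machinery is not shown):

```python
# Closed-form moments of a NIG(gamma, nu, alpha, beta) posterior, alpha > 1.
def nig_moments(gamma, nu, alpha, beta):
    pred_mean = gamma                      # E[mu], the regression prediction
    aleatoric = beta / (alpha - 1)         # E[sigma^2], data uncertainty
    epistemic = beta / (nu * (alpha - 1))  # Var[mu], model uncertainty
    return pred_mean, aleatoric, epistemic
```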
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
- A General Method for Robust Learning from Batches [56.59844655107251]
We consider a general framework of robust learning from batches, and determine the limits of both classification and distribution estimation over arbitrary, including continuous, domains.
We derive the first robust, computationally efficient learning algorithms for piecewise-interval classification, and for piecewise-polynomial, monotone, log-concave, and Gaussian-mixture distribution estimation.
arXiv Detail & Related papers (2020-02-25T18:53:25Z)