Estimating the Density Ratio between Distributions with High Discrepancy
using Multinomial Logistic Regression
- URL: http://arxiv.org/abs/2305.00869v1
- Date: Mon, 1 May 2023 15:10:56 GMT
- Title: Estimating the Density Ratio between Distributions with High Discrepancy
using Multinomial Logistic Regression
- Authors: Akash Srivastava, Seungwook Han, Kai Xu, Benjamin Rhodes, Michael U.
Gutmann
- Abstract summary: We show that the state-of-the-art density ratio estimators perform poorly on well-separated cases.
We present an alternative method that leverages multi-class classification for density ratio estimation.
- Score: 21.758330613138778
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Functions of the ratio of the densities $p/q$ are widely used in machine
learning to quantify the discrepancy between the two distributions $p$ and $q$.
For high-dimensional distributions, binary classification-based density ratio
estimators have shown great promise. However, when densities are well
separated, estimating the density ratio with a binary classifier is
challenging. In this work, we show that the state-of-the-art density ratio
estimators perform poorly on well-separated cases and demonstrate that this is
due to distribution shifts between training and evaluation time. We present an
alternative method that leverages multi-class classification for density ratio
estimation and does not suffer from distribution shift issues. The method uses
a set of auxiliary densities $\{m_k\}_{k=1}^K$ and trains a multi-class
logistic regression to classify the samples from $p, q$, and $\{m_k\}_{k=1}^K$
into $K+2$ classes. We show that if these auxiliary densities are constructed
such that they overlap with $p$ and $q$, then a multi-class logistic regression
allows for estimating $\log p/q$ on the domain of any of the $K+2$
distributions and resolves the distribution shift problems of the current
state-of-the-art methods. We compare our method to state-of-the-art density
ratio estimators on both synthetic and real datasets and demonstrate its
superior performance on the tasks of density ratio estimation, mutual
information estimation, and representation learning. Code:
https://www.blackswhan.com/mdre/
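To make the construction concrete, here is a minimal, self-contained sketch (ours, not the released code linked above) on a toy 1-D problem: samples from two well-separated Gaussians $p$ and $q$ and from a single broad auxiliary Gaussian $m$ (i.e. $K = 1$) are classified into three classes, and $\log p(x)/q(x)$ is read off as the difference between the $p$- and $q$-class logits, since with equal per-class sample sizes the prior terms cancel. The Gaussian setup and the quadratic features are assumptions made for this sketch.

```python
# Minimal illustrative sketch (not the authors' released code): density ratio
# estimation with a multinomial (3-class) logistic regression on a toy 1-D
# problem. The Gaussian setup and quadratic features are assumptions made here.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000

x_p = rng.normal(-4.0, 1.0, n)   # samples from p
x_q = rng.normal(+4.0, 1.0, n)   # samples from q (well separated from p)
x_m = rng.normal(0.0, 4.0, n)    # samples from one broad auxiliary density m

x = np.concatenate([x_p, x_q, x_m])
y = np.repeat([0, 1, 2], n)      # class labels: 0 = p, 1 = q, 2 = m

# With features [x, x^2] every pairwise Gaussian log-ratio is linear in the
# features, so the multinomial model is well specified on this toy problem.
feats = np.stack([x, x**2], axis=1)
clf = LogisticRegression(max_iter=1000).fit(feats, y)  # lbfgs fits a multinomial model

def log_ratio(xs):
    """Estimate log p(x)/q(x) as the p-class logit minus the q-class logit;
    with equal per-class sample sizes the prior terms cancel."""
    f = np.stack([xs, xs**2], axis=1)
    logits = clf.decision_function(f)   # shape (len(xs), 3)
    return logits[:, 0] - logits[:, 1]

xs = np.linspace(-6.0, 6.0, 7)
print(np.round(log_ratio(xs), 2))      # estimated log p/q
print(np.round(-8.0 * xs, 2))          # analytic log p/q for N(-4,1) vs N(4,1)
```

The same recipe extends to $K$ auxiliary densities and a $(K{+}2)$-way classifier; the point of the toy setup is that a direct binary $p$-vs-$q$ classifier sees essentially no overlapping samples here, whereas every class overlaps the broad auxiliary $m$.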
Related papers
- Instance-Optimal Private Density Estimation in the Wasserstein Distance [37.58527481568219]
Estimating the density of a distribution from samples is a fundamental problem in statistics.
We study differentially private density estimation in the Wasserstein distance.
arXiv Detail & Related papers (2024-06-27T22:51:06Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Data Structures for Density Estimation [66.36971978162461]
Given $k$ candidate distributions $v_1, \dots, v_k$ and a sublinear (in $n$) number of samples from an unknown distribution $p$, our main result is the first data structure that identifies a $v_i$ close to $p$ in time sublinear in $k$.
We also give an improved version of the algorithm of Acharya et al. that reports $v_i$ in time linear in $k$.
arXiv Detail & Related papers (2023-06-20T06:13:56Z)
- A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
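For context (a standard formulation from the binary DRE literature, not quoted from this paper): for a strictly convex $f$, minimizing $\int q(x)\,[\,f'(r(x))\,r(x) - f(r(x))\,]\,dx - \int p(x)\,f'(r(x))\,dx$ over models $r$ recovers the true ratio $r^*(x) = p(x)/q(x)$; for example, $f(t) = (t-1)^2/2$ yields the least-squares importance fitting objective $\tfrac{1}{2}\int q\,r^2 - \int p\,r$ up to a constant.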
arXiv Detail & Related papers (2021-12-07T01:23:20Z)
- Density Ratio Estimation via Infinitesimal Classification [85.08255198145304]
We propose DRE-infty, a divide-and-conquer approach that reduces density ratio estimation (DRE) to a series of easier subproblems.
Inspired by Monte Carlo methods, we smoothly interpolate between the two distributions via an infinite continuum of intermediate bridge distributions.
We show that our approach performs well on downstream tasks such as mutual information estimation and energy-based modeling on complex, high-dimensional datasets.
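The bridge construction rests on a telescoping identity (standard, stated here with our own indexing convention $p_0 = q$, $p_K = p$): $\log \frac{p(x)}{q(x)} = \sum_{k=0}^{K-1} \log \frac{p_{k+1}(x)}{p_k(x)}$, where each consecutive pair overlaps far more than $p$ and $q$ do; letting the number of bridges grow gives the continuum form $\log \frac{p(x)}{q(x)} = \int_0^1 \partial_t \log p_t(x)\, dt$ with $p_0 = q$ and $p_1 = p$.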
arXiv Detail & Related papers (2021-11-22T06:26:29Z)
- Rates of convergence for density estimation with generative adversarial networks [19.71040653379663]
We prove an oracle inequality for the Jensen-Shannon (JS) divergence between the underlying density $\mathsf{p}^*$ and the GAN estimate.
We show that the JS-divergence between the GAN estimate and $\mathsf{p}^*$ decays as fast as $(\log n/n)^{2\beta/(2\beta + d)}$.
arXiv Detail & Related papers (2021-01-30T09:59:14Z)
- $(f,\Gamma)$-Divergences: Interpolating between $f$-Divergences and Integral Probability Metrics [6.221019624345409]
We develop a framework for constructing information-theoretic divergences that subsume both $f$-divergences and integral probability metrics (IPMs).
We show that they can be expressed as a two-stage mass-redistribution/mass-transport process.
Using statistical learning as an example, we demonstrate their advantage in training generative adversarial networks (GANs) for heavy-tailed, not-absolutely continuous sample distributions.
arXiv Detail & Related papers (2020-11-11T18:17:09Z)
- TraDE: Transformers for Density Estimation [101.20137732920718]
TraDE is a self-attention-based architecture for auto-regressive density estimation.
We present a suite of tasks such as regression using generated samples, out-of-distribution detection, and robustness to noise in the training data.
arXiv Detail & Related papers (2020-04-06T07:32:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.