Estimating Unbounded Density Ratios: Applications in Error Control under Covariate Shift
- URL: http://arxiv.org/abs/2504.01031v1
- Date: Sat, 29 Mar 2025 11:35:39 GMT
- Title: Estimating Unbounded Density Ratios: Applications in Error Control under Covariate Shift
- Authors: Shuntuo Xu, Zhou Yu, Jian Huang
- Abstract summary: We study density ratio estimators using loss functions based on least squares and logistic regression. We establish upper bounds on estimation errors with standard minimax optimal rates, up to logarithmic factors. Our results accommodate density ratio functions with unbounded domains and ranges.
- Score: 17.924340241624204
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The density ratio is an important metric for evaluating the relative likelihood of two probability distributions, with extensive applications in statistics and machine learning. However, existing estimation theories for density ratios often depend on stringent regularity conditions, mainly focusing on density ratio functions with bounded domains and ranges. In this paper, we study density ratio estimators using loss functions based on least squares and logistic regression. We establish upper bounds on estimation errors with standard minimax optimal rates, up to logarithmic factors. Our results accommodate density ratio functions with unbounded domains and ranges. We apply our results to nonparametric regression and conditional flow models under covariate shift and identify the tail properties of the density ratio as crucial for error control across domains affected by covariate shift. We provide sufficient conditions under which loss correction is unnecessary and demonstrate effective generalization capabilities of a source estimator to any suitable target domain. Our simulation experiments support these theoretical findings, indicating that the source estimator can outperform those derived from loss correction methods, even when the true density ratio is known.
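The logistic-regression route to density ratio estimation mentioned in the abstract can be sketched as follows: train a probabilistic classifier to distinguish samples from the two distributions, and read the ratio off the learned logit. The sketch below is a minimal illustration, not the paper's estimator: the two Gaussians, the linear logit, and the learning rate are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples from the two distributions (illustrative choice: two unit-variance
# Gaussians with different means).
x_p = rng.normal(0.0, 1.0, 2000)   # "numerator" samples, label y = 1
x_q = rng.normal(1.0, 1.0, 2000)   # "denominator" samples, label y = 0

# With equal sample sizes, the Bayes-optimal logit f satisfies
# p(x)/q(x) = exp(f(x)), so a fitted classifier yields a ratio estimate.
X = np.concatenate([x_p, x_q])
y = np.concatenate([np.ones_like(x_p), np.zeros_like(x_q)])
feats = np.stack([X, np.ones_like(X)], axis=1)  # linear logit: w0*x + w1

w = np.zeros(2)
for _ in range(2000):                           # plain gradient descent
    p_hat = 1.0 / (1.0 + np.exp(-feats @ w))
    grad = feats.T @ (p_hat - y) / len(y)
    w -= 0.5 * grad

def ratio(x):
    """Estimated density ratio p(x)/q(x)."""
    return np.exp(w[0] * x + w[1])
```

For these two Gaussians the true log-ratio is 0.5 - x, so the estimate should exceed 1 left of x = 0.5 and fall below 1 to the right; note the linear logit is exactly well-specified here, which will not hold in general.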
Related papers
- Semiparametric conformal prediction [79.6147286161434]
We construct a conformal prediction set accounting for the joint correlation structure of the vector-valued non-conformity scores.
We flexibly estimate the joint cumulative distribution function (CDF) of the scores.
Our method yields desired coverage and competitive efficiency on a range of real-world regression problems.
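As background for the conformal prediction snippet above, the generic split-conformal recipe for regression looks like the following. This is a plain textbook sketch, not the semiparametric construction of the cited paper; the linear model, noise level, and miscoverage level alpha = 0.1 are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic regression data: y = x + noise.
x = rng.uniform(-2, 2, 1000)
y = x + 0.2 * rng.normal(size=x.size)

# Fit a simple least-squares line on the first half of the data.
a, b = np.polyfit(x[:500], y[:500], 1)

# Calibrate on the held-out half using absolute residuals as
# non-conformity scores.
cal_x, cal_y = x[500:], y[500:]
scores = np.abs(cal_y - (a * cal_x + b))
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n)

def interval(x_new):
    """Split-conformal prediction interval at x_new."""
    pred = a * x_new + b
    return pred - q, pred + q
```

Under exchangeability of calibration and test points, intervals built this way cover the true response with probability at least 1 - alpha.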
arXiv Detail & Related papers (2024-11-04T14:29:02Z) - Regulating Model Reliance on Non-Robust Features by Smoothing Input Marginal Density [93.32594873253534]
Trustworthy machine learning requires meticulous regulation of model reliance on non-robust features.
We propose a framework to delineate and regulate such features by attributing model predictions to the input.
arXiv Detail & Related papers (2024-07-05T09:16:56Z) - Binary Losses for Density Ratio Estimation [2.512309434783062]
Estimating the ratio of two probability densities from a finite number of observations is a central machine learning problem.
In this work, we characterize all loss functions that result in density ratio estimators with small error.
We obtain a simple recipe for constructing loss functions with certain properties, such as those that prioritize an accurate estimation of large density ratio values.
arXiv Detail & Related papers (2024-07-01T15:24:34Z) - Overcoming Saturation in Density Ratio Estimation by Iterated Regularization [11.244546184962996]
We show that a class of kernel methods for density ratio estimation suffers from error saturation.
We introduce iterated regularization in density ratio estimation to achieve fast error rates.
arXiv Detail & Related papers (2024-02-21T16:02:14Z) - Double Debiased Covariate Shift Adaptation Robust to Density-Ratio Estimation [7.8856737627153874]
We propose a doubly robust estimator for covariate shift adaptation via importance weighting.
Our estimator reduces the bias arising from the density ratio estimation errors.
Notably, our estimator remains consistent if either the density ratio estimator or the regression function is consistent.
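The importance-weighting idea underlying the snippet above can be illustrated with a small sketch: reweight source-domain losses by the density ratio so the fit targets the shifted covariate distribution. Everything here is an illustrative assumption (Gaussian source and target covariates, a known ratio, a quadratic basis); the cited paper's doubly robust estimator additionally corrects for errors in an *estimated* ratio.

```python
import numpy as np

rng = np.random.default_rng(1)

# Covariate shift: source x ~ N(0,1), target x ~ N(1,1); the regression
# function f(x) = sin(x) is shared across domains.
x_src = rng.normal(0.0, 1.0, 3000)
y_src = np.sin(x_src) + 0.1 * rng.normal(size=x_src.size)

def dens_ratio(x):
    """Target-over-source density ratio for N(1,1) vs N(0,1),
    assumed known here for illustration; in practice it is estimated."""
    return np.exp(x - 0.5)

# Importance-weighted least squares on a quadratic basis.
Phi = np.stack([np.ones_like(x_src), x_src, x_src**2], axis=1)
W = dens_ratio(x_src)
beta = np.linalg.solve(Phi.T @ (W[:, None] * Phi), Phi.T @ (W * y_src))

def predict(x):
    """Fitted regression function, tuned to the target domain."""
    return beta[0] + beta[1] * x + beta[2] * x**2
```

The weights shift the effective training distribution toward the target, so the quadratic fit is most accurate near the target mean x = 1 rather than the source mean x = 0.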
arXiv Detail & Related papers (2023-10-25T13:38:29Z) - Adaptive learning of density ratios in RKHS [3.047411947074805]
Estimating the ratio of two probability densities from finitely many observations is a central problem in machine learning and statistics.
We analyze a large class of density ratio estimation methods that minimize a regularized Bregman divergence between the true density ratio and a model in a reproducing kernel Hilbert space.
arXiv Detail & Related papers (2023-07-30T08:18:39Z) - Anomaly Detection with Variance Stabilized Density Estimation [49.46356430493534]
We present a variance-stabilized density estimation problem for maximizing the likelihood of the observed samples.
To obtain a reliable anomaly detector, we introduce a spectral ensemble of autoregressive models for learning the variance-stabilized distribution.
We have conducted an extensive benchmark with 52 datasets, demonstrating that our method leads to state-of-the-art results.
arXiv Detail & Related papers (2023-06-01T11:52:58Z) - Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing.
We study the case where probability distributions are not known a priori but need to be estimated from data.
arXiv Detail & Related papers (2022-08-29T16:16:22Z) - MCD: Marginal Contrastive Discrimination for conditional density estimation [0.0]
Marginal Contrastive Discrimination (MCD) reformulates the conditional density function as the product of two factors: the marginal density of the target variable and a ratio of density functions.
Our benchmark reveals that our method significantly outperforms existing methods in practice on most density models and regression datasets.
arXiv Detail & Related papers (2022-06-03T14:22:29Z) - Featurized Density Ratio Estimation [82.40706152910292]
In our work, we propose to leverage an invertible generative model to map the two distributions into a common feature space prior to estimation.
This featurization brings the densities closer together in latent space, sidestepping pathological scenarios where the learned density ratios in input space can be arbitrarily inaccurate.
At the same time, the invertibility of our feature map guarantees that the ratios computed in feature space are equivalent to those in input space.
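The invariance claim above has a short change-of-variables justification: under an invertible map T, the Jacobian factors in the two push-forward densities cancel, so p_Z(T(x))/q_Z(T(x)) = p_X(x)/q_X(x). A one-dimensional numeric check, with two Gaussians and T(x) = tanh(x) as an illustrative (not paper-specified) choice:

```python
import numpy as np

def norm_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def T(x):
    """Invertible feature map (illustrative choice)."""
    return np.tanh(x)

def dT(x):
    """Its derivative, i.e. the Jacobian in one dimension."""
    return 1.0 / np.cosh(x) ** 2

x = 0.3
ratio_input = norm_pdf(x, 0, 1) / norm_pdf(x, 1, 1)

# Change of variables: the push-forward density of Z = T(X) at z = T(x)
# is p_Z(T(x)) = p_X(x) / |T'(x)|, and likewise for q. The Jacobians cancel
# in the ratio; both ratios equal exp(0.5 - x) = exp(0.2) for these Gaussians.
pz = norm_pdf(x, 0, 1) / dT(x)
qz = norm_pdf(x, 1, 1) / dT(x)
ratio_feature = pz / qz
```

This is exactly why featurized estimation is "free": any ratio learned in feature space transfers back to input space without correction.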
arXiv Detail & Related papers (2021-07-05T18:30:26Z) - TraDE: Transformers for Density Estimation [101.20137732920718]
TraDE is a self-attention-based architecture for auto-regressive density estimation.
We present a suite of tasks such as regression using generated samples, out-of-distribution detection, and robustness to noise in the training data.
arXiv Detail & Related papers (2020-04-06T07:32:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.