Statistical Estimation Under Distribution Shift: Wasserstein
Perturbations and Minimax Theory
- URL: http://arxiv.org/abs/2308.01853v2
- Date: Tue, 10 Oct 2023 00:40:47 GMT
- Title: Statistical Estimation Under Distribution Shift: Wasserstein
Perturbations and Minimax Theory
- Authors: Patrick Chao, Edgar Dobriban
- Abstract summary: We focus on Wasserstein distribution shifts, where every data point may undergo a slight perturbation.
We consider perturbations that are either independent or coordinated joint shifts across data points.
We analyze several important statistical problems, including location estimation, linear regression, and non-parametric density estimation.
- Score: 24.540342159350015
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Distribution shifts are a serious concern in modern statistical learning as
they can systematically change the properties of the data away from the truth.
We focus on Wasserstein distribution shifts, where every data point may undergo
a slight perturbation, as opposed to the Huber contamination model where a
fraction of observations are outliers. We consider perturbations that are
either independent or coordinated joint shifts across data points. We analyze
several important statistical problems, including location estimation, linear
regression, and non-parametric density estimation. Under a squared loss for
mean estimation and prediction error in linear regression, we find the exact
minimax risk, a least favorable perturbation, and show that the sample mean and
least squares estimators are respectively optimal. For other problems, we
provide nearly optimal estimators and precise finite-sample bounds. We also
introduce several tools for bounding the minimax risk under general
distribution shifts, not just for Wasserstein perturbations, such as a
smoothing technique for location families, and generalizations of classical
tools including least favorable sequences of priors, the modulus of continuity,
as well as Le Cam's, Fano's, and Assouad's methods.
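As a quick illustration of the setting, here is a hedged toy simulation (not the authors' code): under a coordinated joint shift, moving every data point by eps in the same direction is a Wasserstein perturbation of size eps, and the squared-error risk of the sample mean degrades from 1/n to roughly 1/n + eps^2, consistent with the abstract's claim that the sample mean remains optimal while the minimax risk necessarily grows.

```python
# Toy simulation of mean estimation under a coordinated Wasserstein shift.
# Assumptions for illustration: theta = 0, unit-variance Gaussian data, and
# a worst-case-style constant shift of size eps applied to every point.
import numpy as np

rng = np.random.default_rng(0)
theta, n, eps, trials = 0.0, 100, 0.5, 20000

X = rng.normal(theta, 1.0, size=(trials, n))   # clean samples
X_shifted = X + eps                            # every point moved by eps

mse_clean = ((X.mean(axis=1) - theta) ** 2).mean()
mse_shift = ((X_shifted.mean(axis=1) - theta) ** 2).mean()
print(f"clean MSE   = {mse_clean:.4f}  (1/n = {1 / n:.4f})")
print(f"shifted MSE = {mse_shift:.4f}  (1/n + eps^2 = {1 / n + eps**2:.4f})")
```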
Related papers
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining prediction intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile-regression-based interval construction that removes the arbitrary constraint of tying the interval endpoints to fixed quantile levels.
We demonstrate that this added flexibility results in intervals with improvements in desirable qualities such as width.
arXiv Detail & Related papers (2024-06-05T13:36:38Z)
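A minimal sketch of the standard quantile-regression interval construction that RQR relaxes, via scikit-learn's pinball (quantile) loss; the data-generating process below is an arbitrary assumption for illustration, not from the paper.

```python
# Standard quantile-regression prediction intervals: fit the 5% and 95%
# conditional quantiles separately and use them as interval endpoints.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

def sample(n):
    X = rng.uniform(-3, 3, size=(n, 1))
    # Asymmetric, heteroscedastic noise: the regime motivating RQR.
    y = np.sin(X[:, 0]) + rng.exponential(0.3 + 0.2 * np.abs(X[:, 0]))
    return X, y

X, y = sample(2000)
lo = GradientBoostingRegressor(loss="quantile", alpha=0.05).fit(X, y)
hi = GradientBoostingRegressor(loss="quantile", alpha=0.95).fit(X, y)

X_test, y_test = sample(500)
inside = (lo.predict(X_test) <= y_test) & (y_test <= hi.predict(X_test))
width = (hi.predict(X_test) - lo.predict(X_test)).mean()
print(f"coverage = {inside.mean():.2f}, mean width = {width:.2f}")
```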
- Conformal inference for regression on Riemannian Manifolds [49.7719149179179]
We investigate prediction sets for regression scenarios when the response variable, denoted by $Y$, resides in a manifold, and the covariate, denoted by $X$, lies in Euclidean space.
We prove the almost sure convergence of the empirical version of these regions on the manifold to their population counterparts.
arXiv Detail & Related papers (2023-10-12T10:56:25Z)
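A minimal split-conformal sketch in the Euclidean special case; the paper extends such prediction sets to manifold-valued responses. The linear model and data below are assumptions for illustration only.

```python
# Split conformal prediction: fit on one half, use the other half's
# absolute residuals to calibrate an interval with finite-sample coverage.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n = 1000
X = rng.normal(size=(n, 2))
y = X @ np.array([1.0, -2.0]) + rng.normal(scale=0.5, size=n)

fit, cal = slice(0, n // 2), slice(n // 2, n)
model = LinearRegression().fit(X[fit], y[fit])
scores = np.abs(y[cal] - model.predict(X[cal]))   # conformity scores

alpha = 0.1                                       # target 90% coverage
m = n - n // 2                                    # calibration size
q = np.sort(scores)[int(np.ceil((m + 1) * (1 - alpha))) - 1]

x_new = np.array([[0.3, -0.7]])                   # hypothetical test point
mu = model.predict(x_new)[0]
print(f"90% prediction interval: [{mu - q:.2f}, {mu + q:.2f}]")
```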
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) arises in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
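A toy numpy check of the averaging motivation above (not the paper's BCE training procedure): the variance of an average of independent estimates shrinks like 1/k, but a shared bias never averages out.

```python
# Averaging k estimates of the same unknown: unbiased estimates improve
# like 1/k, while a common bias leaves an MSE floor of bias^2.
import numpy as np

rng = np.random.default_rng(3)
theta, bias, trials = 1.0, 0.3, 20000

for k in (1, 10, 100):
    est_u = theta + rng.normal(size=(trials, k)).mean(axis=1)  # unbiased
    est_b = est_u + bias                                       # shared bias
    mse_u = ((est_u - theta) ** 2).mean()
    mse_b = ((est_b - theta) ** 2).mean()
    print(f"k={k:4d}  unbiased MSE={mse_u:.4f}  biased MSE={mse_b:.4f}")
# Unbiased MSE tracks 1/k; biased MSE plateaus near bias^2 = 0.09.
```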
- Non asymptotic estimation lower bounds for LTI state space models with Cramér-Rao and van Trees [1.14219428942199]
We study the estimation problem for linear time-invariant (LTI) state-space models with Gaussian excitation of an unknown covariance.
We provide non-asymptotic lower bounds for the expected estimation error and the mean square estimation risk of the least squares estimator.
Our results extend and improve existing lower bounds to lower bounds in expectation on the mean square estimation risk.
arXiv Detail & Related papers (2021-09-17T15:00:25Z)
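A hedged toy check of the Cramér-Rao machinery in the simplest Gaussian location model, not the paper's LTI state-space setting: for X_1, ..., X_n ~ N(theta, sigma^2) the bound for unbiased estimators is sigma^2 / n, and the sample mean attains it.

```python
# Compare the empirical MSE of the sample mean with the Cramér-Rao
# lower bound sigma^2 / n for unbiased estimators of a Gaussian mean.
import numpy as np

rng = np.random.default_rng(4)
theta, sigma, n, trials = 2.0, 1.5, 50, 20000

samples = rng.normal(theta, sigma, size=(trials, n))
mse = ((samples.mean(axis=1) - theta) ** 2).mean()

print(f"empirical MSE of sample mean: {mse:.5f}")
print(f"Cramér-Rao bound sigma^2/n:   {sigma**2 / n:.5f}")
```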
- Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandits, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
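A toy simulation of the anomaly that motivates online debiasing (illustrative only, not the paper's estimator): with greedy adaptive sampling over two identical arms, the per-arm sample means come out negatively biased.

```python
# Greedy two-armed sampling with equal true means of 0: arms that look
# bad early get abandoned, so arm sample means are biased downward.
import numpy as np

rng = np.random.default_rng(5)
trials, horizon = 2000, 100
arm_means = []
for _ in range(trials):
    counts, sums = np.zeros(2), np.zeros(2)
    for _ in range(horizon):
        if counts.min() == 0:                 # pull each arm once first
            arm = int(counts.argmin())
        else:                                 # then act greedily
            arm = int((sums / counts).argmax())
        counts[arm] += 1
        sums[arm] += rng.normal(0.0, 1.0)     # true mean is 0 for both arms
    arm_means.append(sums / counts)

print("average arm sample means:", np.mean(arm_means, axis=0))
# Both entries are negative although the true means are exactly 0.
```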
- Near-Optimal Linear Regression under Distribution Shift [63.87137348308034]
We show that linear minimax estimators are within an absolute constant of the minimax risk even among nonlinear estimators for various source/target distributions.
arXiv Detail & Related papers (2021-06-23T00:52:50Z)
- SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z)
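A hedged illustration of the phenomenon behind SLOE (not the SLOE estimator itself): when the dimension is a non-vanishing fraction of the sample size, the logistic MLE systematically inflates coefficient magnitudes, which is what corrected inference must account for. The parameter choices below are assumptions for the demo.

```python
# High-dimensional logistic regression: the near-unregularized MLE
# overestimates the true coefficients when p/n is not small.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n, p = 1000, 200                            # p/n = 0.2
beta = np.zeros(p)
beta[:20] = 2.0                             # strong signal coordinates
X = rng.normal(size=(n, p)) / np.sqrt(p)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta)))

# Large C approximates the unregularized MLE.
mle = LogisticRegression(C=1e6, max_iter=5000).fit(X, y)
ratio = mle.coef_[0, :20] / beta[:20]
print(f"mean fitted/true ratio on signal coords: {ratio.mean():.2f}")
# Typically noticeably above 1, illustrating the MLE's inflation bias.
```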
- Fundamental Limits of Ridge-Regularized Empirical Risk Minimization in High Dimensions [41.7567932118769]
Empirical Risk Minimization algorithms are widely used in a variety of estimation and prediction tasks.
In this paper, we characterize for the first time the fundamental limits on the statistical accuracy of convex ERM for inference.
arXiv Detail & Related papers (2020-06-16T04:27:38Z)
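A minimal ridge-regularized ERM sketch; the paper derives exact high-dimensional limits rather than simulations, so this is only an assumed toy model showing the risk curve that such theory characterizes.

```python
# Test risk of ridge regression as the penalty varies: too little
# regularization overfits, too much underfits.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(7)
n, p = 200, 150
beta = rng.normal(size=p) / np.sqrt(p)
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)
X_te = rng.normal(size=(2000, p))
y_te = X_te @ beta + rng.normal(size=2000)

for lam in (1e-3, 1e-1, 1.0, 10.0, 100.0):
    risk = ((Ridge(alpha=lam).fit(X, y).predict(X_te) - y_te) ** 2).mean()
    print(f"lambda={lam:8.3f}  test risk={risk:.3f}")
```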
- On lower bounds for the bias-variance trade-off [0.0]
It is a common phenomenon that for high-dimensional statistical models, rate-optimal estimators balance squared bias and variance.
We propose a general strategy to obtain lower bounds on the variance of any estimator with bias smaller than a prespecified bound.
This shows to what extent the bias-variance trade-off is unavoidable and allows one to quantify the loss of performance for methods that do not obey it.
arXiv Detail & Related papers (2020-05-30T14:07:43Z)
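A toy decomposition that makes the trade-off concrete (an illustrative sketch under an assumed Gaussian location model, unrelated to the paper's proofs): shrinking the sample mean trades variance for squared bias.

```python
# Bias-variance decomposition for the shrinkage estimator c * xbar of a
# Gaussian mean: variance scales like c^2 / n, squared bias like
# ((1 - c) * theta)^2, and the best c balances the two.
import numpy as np

rng = np.random.default_rng(8)
theta, n, trials = 0.5, 20, 50000
xbar = rng.normal(theta, 1.0, size=(trials, n)).mean(axis=1)

for c in (1.0, 0.9, 0.7, 0.5):
    est = c * xbar
    bias2 = (est.mean() - theta) ** 2
    var = est.var()
    print(f"c={c:.1f}  bias^2={bias2:.4f}  var={var:.4f}  MSE={bias2 + var:.4f}")
```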