Prediction-Powered Inference with Inverse Probability Weighting
- URL: http://arxiv.org/abs/2508.10149v1
- Date: Wed, 13 Aug 2025 19:25:38 GMT
- Title: Prediction-Powered Inference with Inverse Probability Weighting
- Authors: Jyotishka Datta, Nicholas G. Polson
- Abstract summary: Prediction-powered inference (PPI) is a recent framework for valid statistical inference with partially labeled data. We show that PPI can be extended to handle informative labeling by replacing its unweighted bias-correction term with an inverse probability weighted (IPW) version.
- Score: 0.4987670632802289
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Prediction-powered inference (PPI) is a recent framework for valid statistical inference with partially labeled data, combining model-based predictions on a large unlabeled set with bias correction from a smaller labeled subset. We show that PPI can be extended to handle informative labeling by replacing its unweighted bias-correction term with an inverse probability weighted (IPW) version, using the classical Horvitz–Thompson or Hájek forms. This connection unites design-based survey sampling ideas with modern prediction-assisted inference, yielding estimators that remain valid when labeling probabilities vary across units. We consider the common setting where the inclusion probabilities are not known but estimated from a correctly specified model. In simulations, the performance of IPW-adjusted PPI with estimated propensities closely matches the known-probability case, retaining both nominal coverage and the variance-reduction benefits of PPI.
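The IPW-adjusted bias correction described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `ppi_ipw_mean`, its argument layout, and the choice to normalize the Horvitz–Thompson form by the combined labeled-plus-unlabeled pool size are assumptions made for the example.

```python
import numpy as np

def ppi_ipw_mean(f_unlabeled, y_lab, f_lab, pi_lab, form="hajek"):
    """PPI mean estimate with an inverse-probability-weighted bias correction.

    f_unlabeled : model predictions on the large unlabeled pool
    y_lab, f_lab: labels and predictions on the smaller labeled subset
    pi_lab      : labeling (inclusion) probabilities, known or estimated
    """
    resid = y_lab - f_lab   # prediction errors observed on labeled units
    w = 1.0 / pi_lab        # inverse probability weights
    if form == "hajek":
        # Hajek: self-normalized weighted mean of the residuals
        correction = np.sum(w * resid) / np.sum(w)
    elif form == "horvitz-thompson":
        # Horvitz-Thompson: weighted residual total divided by the
        # population size (here assumed to be labeled + unlabeled units)
        N = len(f_unlabeled) + len(y_lab)
        correction = np.sum(w * resid) / N
    else:
        raise ValueError(f"unknown form: {form}")
    return np.mean(f_unlabeled) + correction
```

When all labeling probabilities are equal, the Hájek form reduces to the standard unweighted PPI bias correction; with unequal probabilities, units that were more likely to be labeled are down-weighted, which is what restores validity under informative labeling.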
Related papers
- Generalized Prediction-Powered Inference, with Application to Binary Classifier Evaluation [0.0]
We generalize PPI to any regular asymptotically linear estimator. We show that PPI does not achieve the semi-parametric efficiency lower bound outside of very restrictive and unrealistic scenarios. We exploit connections to that literature to propose modified PPI estimators.
arXiv Detail & Related papers (2026-02-10T22:11:26Z) - DistDF: Time-Series Forecasting Needs Joint-Distribution Wasserstein Alignment [92.70019102733453]
Training time-series forecast models requires aligning the conditional distribution of model forecasts with that of the label sequence. We propose DistDF, which achieves alignment by alternatively minimizing a discrepancy between the conditional forecast and label distributions.
arXiv Detail & Related papers (2025-10-28T16:09:59Z) - Conformal Inference for Open-Set and Imbalanced Classification [17.863428471982967]
This paper presents a conformal prediction method for classification in highly imbalanced and open-set settings. Existing approaches require a finite, known label space and typically involve random sample splitting. We compute and integrate into our predictions a new family of conformal p-values that can test whether a new data point belongs to a previously unseen class.
arXiv Detail & Related papers (2025-10-14T23:19:06Z) - Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation [62.2436697657307]
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. We propose a method called Stratified Prediction-Powered Inference (StratPPI). We show that the basic PPI estimates can be considerably improved by employing simple data stratification strategies.
arXiv Detail & Related papers (2024-06-06T17:37:39Z) - Measuring Stochastic Data Complexity with Boltzmann Influence Functions [12.501336941823627]
Estimating uncertainty of a model's prediction on a test point is a crucial part of ensuring reliability and calibration under distribution shifts.
We propose IF-COMP, a scalable and efficient approximation of the pNML distribution that linearizes the model with a temperature-scaled Boltzmann influence function.
We experimentally validate IF-COMP on uncertainty calibration, mislabel detection, and OOD detection tasks, where it consistently matches or beats strong baseline methods.
arXiv Detail & Related papers (2024-06-04T20:01:39Z) - Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting [55.17761802332469]
Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and test data by adapting a given model w.r.t. any test sample.
Prior methods perform backpropagation for each test sample, resulting in unbearable optimization costs to many applications.
We propose an Efficient Anti-Forgetting Test-Time Adaptation (EATA) method which develops an active sample selection criterion to identify reliable and non-redundant samples.
arXiv Detail & Related papers (2024-03-18T05:49:45Z) - Adaptive Conformal Prediction by Reweighting Nonconformity Score [0.0]
We use a Quantile Regression Forest (QRF) to learn the distribution of nonconformity scores and utilize the QRF's weights to assign more importance to samples with residuals similar to the test point.
Our approach enjoys an assumption-free finite sample marginal and training-conditional coverage, and under suitable assumptions, it also ensures conditional coverage.
arXiv Detail & Related papers (2023-03-22T16:42:19Z) - Probabilistic Conformal Prediction Using Conditional Random Samples [73.26753677005331]
PCP is a predictive inference algorithm that estimates a target variable by a discontinuous predictive set.
It is efficient and compatible with either explicit or implicit conditional generative models.
arXiv Detail & Related papers (2022-06-14T03:58:03Z) - Bayes in Wonderland! Predictive supervised classification inference hits unpredictability [1.8814209805277506]
We show the convergence of the sBpc and mBpc under de Finetti type of exchangeability.
We also provide a parameter estimation of the generative model giving rise to the partition exchangeable sequence.
arXiv Detail & Related papers (2021-12-03T12:34:52Z) - Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z) - Evaluating probabilistic classifiers: Reliability diagrams and score decompositions revisited [68.8204255655161]
We introduce the CORP approach, which generates provably statistically Consistent, Optimally binned, and Reproducible reliability diagrams in an automated way.
CORP is based on non-parametric isotonic regression and implemented via the Pool-adjacent-violators (PAV) algorithm.
arXiv Detail & Related papers (2020-08-07T08:22:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.