Prediction-Powered Inference
- URL: http://arxiv.org/abs/2301.09633v4
- Date: Thu, 9 Nov 2023 17:48:20 GMT
- Title: Prediction-Powered Inference
- Authors: Anastasios N. Angelopoulos, Stephen Bates, Clara Fannjiang, Michael I.
Jordan, Tijana Zrnic
- Abstract summary: Prediction-powered inference is a framework for performing valid statistical inference when an experimental dataset is supplemented with predictions from a machine-learning system.
The framework yields simple algorithms for computing provably valid confidence intervals for quantities such as means, quantiles, and linear and logistic regression coefficients.
Prediction-powered inference could enable researchers to draw valid and more data-efficient conclusions using machine learning.
- Score: 68.97619568620709
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prediction-powered inference is a framework for performing valid statistical
inference when an experimental dataset is supplemented with predictions from a
machine-learning system. The framework yields simple algorithms for computing
provably valid confidence intervals for quantities such as means, quantiles,
and linear and logistic regression coefficients, without making any assumptions
on the machine-learning algorithm that supplies the predictions. Furthermore,
more accurate predictions translate to smaller confidence intervals.
Prediction-powered inference could enable researchers to draw valid and more
data-efficient conclusions using machine learning. The benefits of
prediction-powered inference are demonstrated with datasets from proteomics,
astronomy, genomics, remote sensing, census analysis, and ecology.
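To make the construction concrete, below is a minimal sketch of a prediction-powered confidence interval for a mean, using a standard CLT-based form; the function and variable names are illustrative and not taken from the paper's reference implementation.

```python
import numpy as np
from scipy.stats import norm

def ppi_mean_ci(y_labeled, yhat_labeled, yhat_unlabeled, alpha=0.05):
    """Prediction-powered CI for E[Y] from n labeled points and N >> n predictions."""
    n, N = len(y_labeled), len(yhat_unlabeled)

    # Point estimate: average prediction on the unlabeled data, debiased by the
    # "rectifier" -- the average prediction error measured on the labeled data.
    rectifier = np.mean(yhat_labeled - y_labeled)
    theta_pp = np.mean(yhat_unlabeled) - rectifier

    # The two terms are computed on independent samples, so their variances add.
    var = (np.var(yhat_unlabeled, ddof=1) / N
           + np.var(yhat_labeled - y_labeled, ddof=1) / n)

    half_width = norm.ppf(1 - alpha / 2) * np.sqrt(var)
    return theta_pp - half_width, theta_pp + half_width
```

Only the rectifier term depends on the labeled data, so a more accurate model shrinks that variance term and, with it, the interval; this is how better predictions translate to tighter confidence intervals.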
Related papers
- Do We Really Even Need Data? [2.3749120526936465]
Researchers increasingly use predictions from pre-trained algorithms as outcome variables.
Standard tools for inference can misrepresent the association between independent variables and the outcome of interest when the true, unobserved outcome is replaced by a predicted value.
arXiv Detail & Related papers (2024-01-14T23:19:21Z)
- PPI++: Efficient Prediction-Powered Inference [31.403415618169433]
We present PPI++: a methodology for estimation and inference based on a small labeled dataset and a typically much larger dataset of machine-learning predictions.
The methods automatically adapt to the quality of available predictions, yielding easy-to-compute confidence sets.
PPI++ builds on prediction-powered inference (PPI), which targets the same problem setting, improving its computational and statistical efficiency.
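As a rough illustration of that adaptation, the sketch below power-tunes the mean estimate using the textbook variance-minimizing choice of the tuning parameter; this is a simplified reading with illustrative names, not the paper's exact procedure.

```python
import numpy as np

def power_tuned_mean(y_labeled, yhat_labeled, yhat_unlabeled):
    """Point estimate of E[Y] with a data-driven weight on the predictions."""
    n, N = len(y_labeled), len(yhat_unlabeled)

    # Variance-minimizing tuning parameter: close to 1 when predictions track the
    # labels well, close to 0 when they are uninformative (clipped here for simplicity).
    cov = np.cov(y_labeled, yhat_labeled)[0, 1]
    var_f = np.var(np.concatenate([yhat_labeled, yhat_unlabeled]), ddof=1)
    lam = np.clip(cov / (var_f * (1 + n / N)), 0.0, 1.0)

    # lam = 0 recovers the classical sample mean of the labels;
    # lam = 1 recovers the unweighted prediction-powered estimate.
    return lam * np.mean(yhat_unlabeled) + np.mean(y_labeled - lam * yhat_labeled)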
arXiv Detail & Related papers (2023-11-02T17:59:04Z)
- Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
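For context, a plain split-conformal interval looks like the sketch below; it assumes exchangeable calibration and test data, which is exactly what breaks in the designed-experiment setting the paper addresses, where a weighted variant is needed instead. Names are illustrative.

```python
import numpy as np

def split_conformal_interval(y_calib, yhat_calib, yhat_test, alpha=0.1):
    """Standard split-conformal interval from absolute residuals on a calibration set.
    Valid under exchangeability only; it does NOT handle the feedback-induced shift
    that arises when a prediction model chooses which points to test next."""
    resid = np.abs(y_calib - yhat_calib)
    n = len(resid)

    # Conformal quantile with the finite-sample (n + 1) correction.
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    qhat = np.quantile(resid, level, method="higher")
    return yhat_test - qhat, yhat_test + qhat
```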
arXiv Detail & Related papers (2022-02-08T02:59:12Z)
- On the Relation between Prediction and Imputation Accuracy under Missing Covariates [0.0]
Recent research has observed an increasing trend towards the use of modern machine-learning algorithms for imputation.
arXiv Detail & Related papers (2021-12-09T23:30:44Z)
- Reliable Probability Intervals For Classification Using Inductive Venn Predictors Based on Distance Learning [2.66512000865131]
We use the Inductive Venn Predictors framework for computing probability intervals regarding the correctness of each prediction in real-time.
We propose an approach based on distance metric learning to compute informative probability intervals in applications involving high-dimensional inputs.
arXiv Detail & Related papers (2021-10-07T00:51:43Z)
- Hessian-based toolbox for reliable and interpretable machine learning in physics [58.720142291102135]
We present a toolbox for interpretability and reliability that is agnostic of the model architecture.
It provides a notion of the influence of the input data on the prediction at a given test point, an estimation of the uncertainty of the model predictions, and an agnostic score for the model predictions.
Our work opens the road to the systematic use of interpretability and reliability methods in ML applied to physics and, more generally, science.
arXiv Detail & Related papers (2021-08-04T16:32:59Z)
- Test-time Collective Prediction [73.74982509510961]
Multiple parties, each holding a pre-trained model, want to jointly make predictions on future test points.
Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters.
We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
- Cross-Validation and Uncertainty Determination for Randomized Neural Networks with Applications to Mobile Sensors [0.0]
Extreme learning machines provide an attractive and efficient method for supervised learning under limited computing resources and for green machine learning.
Results on supervised learning with such networks and on regression methods are discussed in terms of consistency and bounds on the generalization and prediction error.
arXiv Detail & Related papers (2021-01-06T12:28:06Z)
- Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions.
We make robust and efficient counterfactual predictions for both individual and average treatment effects.
The algorithm shows competitive performance with the state-of-the-art on real world and synthetic data.
arXiv Detail & Related papers (2020-10-15T16:39:26Z)
- Balance-Subsampled Stable Prediction [55.13512328954456]
We propose a novel balance-subsampled stable prediction (BSSP) algorithm based on the theory of fractional factorial design.
A design-theoretic analysis shows that the proposed method can reduce the confounding effects among predictors induced by the distribution shift.
Numerical experiments on both synthetic and real-world data sets demonstrate that our BSSP algorithm significantly outperforms the baseline methods for stable prediction across unknown test data.
arXiv Detail & Related papers (2020-06-08T07:01:38Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.