Federated Prediction-Powered Inference from Decentralized Data
- URL: http://arxiv.org/abs/2409.01730v1
- Date: Tue, 3 Sep 2024 09:14:18 GMT
- Title: Federated Prediction-Powered Inference from Decentralized Data
- Authors: Ping Luo, Xiaoge Deng, Ziqing Wen, Tao Sun, Dongsheng Li
- Abstract summary: Prediction-Powered Inference (PPI) has been proposed to ensure statistical validity despite the unreliability of machine-learning predictions used as auxiliary data.
The Fed-PPI framework involves training local models on private data, aggregating them through Federated Learning (FL), and deriving confidence intervals using PPI.
- Score: 40.84399531998246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In various domains, the increasing application of machine learning allows researchers to access inexpensive predictive data, which can be utilized as auxiliary data for statistical inference. Although such data are often unreliable compared to gold-standard datasets, Prediction-Powered Inference (PPI) has been proposed to ensure statistical validity despite the unreliability. However, the challenge of 'data silos' arises when the private gold-standard datasets are non-shareable for model training, leading to less accurate predictive models and invalid inferences. In this paper, we introduce the Federated Prediction-Powered Inference (Fed-PPI) framework, which addresses this challenge by enabling decentralized experimental data to contribute to statistically valid conclusions without sharing private information. The Fed-PPI framework involves training local models on private data, aggregating them through Federated Learning (FL), and deriving confidence intervals using PPI computation. The proposed framework is evaluated through experiments, demonstrating its effectiveness in producing valid confidence intervals.
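The three steps named in the abstract (local training, FL aggregation, PPI confidence intervals) can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a one-dimensional linear model, a single round of sample-size-weighted parameter averaging in place of a full FL procedure, and the standard PPI estimator for a population mean. The function names (`fit_local`, `fed_avg`, `ppi_mean_ci`), the choice to have clients share only residual summary statistics, and the simulated data are all hypothetical.

```python
import random
from statistics import NormalDist, fmean, variance


def fit_local(xs, ys):
    """Least-squares slope and intercept fitted on one client's private data."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def fed_avg(params, sizes):
    """Sample-size-weighted average of client parameters (one averaging round)."""
    total = sum(sizes)
    return tuple(
        sum(p[k] * n for p, n in zip(params, sizes)) / total for k in range(2)
    )


def ppi_mean_ci(model, clients_calib, x_unlabeled, alpha=0.1):
    """PPI confidence interval for E[Y] using the aggregated model.

    Assumption: each client reports only the count, sum, and sum of squares
    of its residuals f(x) - y, so raw labels never leave the client.
    """
    slope, intercept = model
    n, s, ss = 0, 0.0, 0.0
    for xs, ys in clients_calib:
        res = [slope * x + intercept - y for x, y in zip(xs, ys)]
        n += len(res)
        s += sum(res)
        ss += sum(r * r for r in res)
    rectifier = s / n  # estimated bias of the predictions
    rect_var = max(0.0, (ss - n * rectifier ** 2) / (n - 1))
    preds = [slope * x + intercept for x in x_unlabeled]
    theta = fmean(preds) - rectifier  # bias-corrected point estimate
    se = (variance(preds) / len(preds) + rect_var / n) ** 0.5
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return theta - z * se, theta + z * se


# Simulated example (hypothetical data): three clients, true E[Y] = 1.
rng = random.Random(0)


def draw(m):
    xs = [rng.gauss(0, 1) for _ in range(m)]
    ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


train = [draw(100) for _ in range(3)]  # used only for model fitting
calib = [draw(100) for _ in range(3)]  # held out for the rectifier
model = fed_avg([fit_local(*c) for c in train], [len(c[0]) for c in train])
x_unl = [rng.gauss(0, 1) for _ in range(5000)]
lo, hi = ppi_mean_ci(model, calib, x_unl)
print(f"90% PPI interval for E[Y]: [{lo:.3f}, {hi:.3f}]")
```

The sketch keeps the labeled data used for the rectifier separate from the data used to fit the local models, since the usual PPI guarantee assumes the predictor is independent of the labeled sample used for bias correction.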
Related papers
- Beyond Conformal Predictors: Adaptive Conformal Inference with Confidence Predictors [0.0]
Conformal prediction requires exchangeable data to ensure valid prediction sets at a user-specified significance level.
Adaptive conformal inference (ACI) was introduced to address this limitation.
We show that ACI does not require the use of conformal predictors; instead, it can be implemented with the more general confidence predictors.
arXiv Detail & Related papers (2024-09-23T21:02:33Z)
- Uncertainty Quantification of Data Shapley via Statistical Inference [20.35973700939768]
The emergence of data markets underscores the growing importance of data valuation.
Within the machine learning landscape, Data Shapley stands out as a widely embraced method for data valuation.
This paper establishes the relationship between Data Shapley and infinite-order U-statistics.
arXiv Detail & Related papers (2024-07-28T02:54:27Z)
- Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation [62.2436697657307]
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data.
We propose a method called Stratified Prediction-Powered Inference (StratPPI).
We show that the basic PPI estimates can be considerably improved by employing simple data stratification strategies.
arXiv Detail & Related papers (2024-06-06T17:37:39Z)
- Efficient Conformal Prediction under Data Heterogeneity [79.35418041861327]
Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification.
Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples.
This work introduces a new efficient approach to CP that produces provably valid confidence sets for fairly general non-exchangeable data distributions.
arXiv Detail & Related papers (2023-12-25T20:02:51Z)
- Conditional Density Estimations from Privacy-Protected Data [0.0]
We propose simulation-based inference methods from privacy-protected datasets.
We illustrate our methods on discrete time-series data under an infectious disease model and with ordinary linear regression models.
arXiv Detail & Related papers (2023-10-19T14:34:17Z)
- Uncertainty-guided Source-free Domain Adaptation [77.3844160723014]
Source-free domain adaptation (SFDA) aims to adapt a classifier to an unlabelled target data set by only using a pre-trained source model.
We propose quantifying the uncertainty in the source model predictions and utilizing it to guide the target adaptation.
arXiv Detail & Related papers (2022-08-16T08:03:30Z)
- Data-SUITE: Data-centric identification of in-distribution incongruous examples [81.21462458089142]
Data-SUITE is a data-centric framework to identify incongruous regions of in-distribution (ID) data.
We empirically validate Data-SUITE's performance and coverage guarantees.
arXiv Detail & Related papers (2022-02-17T18:58:31Z)
- Federated Estimation of Causal Effects from Observational Data [19.657789891394504]
We present a novel framework for causal inference with federated data sources.
We assess and integrate local causal effects from different private data sources without centralizing them.
arXiv Detail & Related papers (2021-05-31T08:06:00Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.