Data-Free Evaluation of User Contributions in Federated Learning
- URL: http://arxiv.org/abs/2108.10623v1
- Date: Tue, 24 Aug 2021 10:17:03 GMT
- Title: Data-Free Evaluation of User Contributions in Federated Learning
- Authors: Hongtao Lv, Zhenzhe Zheng, Tie Luo, Fan Wu, Shaojie Tang, Lifeng Hua,
Rongfei Jia, Chengfei Lv
- Abstract summary: Federated learning (FL) trains a machine learning model on mobile devices in a distributed manner using each device's private data and computing resources.
We propose a method called Pairwise Correlated Agreement (PCA) based on the idea of peer prediction to evaluate user contribution in FL without a test dataset.
We then apply PCA to designing (1) a new federated learning algorithm called Fed-PCA, and (2) a new incentive mechanism that guarantees truthfulness.
- Score: 31.181141140071592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) trains a machine learning model on mobile devices in
a distributed manner using each device's private data and computing resources.
A critical issue is to evaluate individual users' contributions so that (1)
users' effort in model training can be compensated with proper incentives and
(2) malicious and low-quality users can be detected and removed. The
state-of-the-art solutions require a representative test dataset for the
evaluation purpose, but such a dataset is often unavailable and hard to
synthesize. In this paper, we propose a method called Pairwise Correlated
Agreement (PCA) based on the idea of peer prediction to evaluate user
contribution in FL without a test dataset. PCA achieves this using the
statistical correlation of the model parameters uploaded by users. We then
apply PCA to designing (1) a new federated learning algorithm called Fed-PCA,
and (2) a new incentive mechanism that guarantees truthfulness. We evaluate the
performance of PCA and Fed-PCA using the MNIST dataset and a large industrial
product recommendation dataset. The results demonstrate that our Fed-PCA
outperforms the canonical FedAvg algorithm and other baseline methods in
accuracy, and at the same time, PCA effectively incentivizes users to behave
truthfully.
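For context, the FedAvg baseline named in the abstract aggregates client models by averaging their parameters, weighted by each client's local sample count. The following is a minimal sketch of that aggregation step in plain Python. The function name `fedavg` and the flat-list parameter representation are illustrative assumptions, and this is not the paper's Fed-PCA, whose weighting rule the abstract does not specify.

```python
from typing import List, Sequence


def fedavg(client_params: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample counts.

    Hypothetical helper for illustration only: each client's model is
    assumed to be flattened into a list of floats of equal length.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_params[0])
    if any(len(params) != dim for params in client_params):
        raise ValueError("all parameter vectors must have the same length")

    aggregated = [0.0] * dim
    for params, size in zip(client_params, client_sizes):
        weight = size / total  # client's share of all training samples
        for i, value in enumerate(params):
            aggregated[i] += weight * value
    return aggregated


# Two clients holding 1 and 3 samples respectively.
global_params = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)
```

Replacing the sample-count weights with per-user contribution scores is one way an evaluation method could feed into aggregation, but that is an assumption here, not a description of Fed-PCA.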
Related papers
- Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation [62.2436697657307]
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data.
We propose a method called Stratified Prediction-Powered Inference (StratPPI).
We show that the basic PPI estimates can be considerably improved by employing simple data stratification strategies.
arXiv Detail & Related papers (2024-06-06T17:37:39Z)
- Bayesian Prediction-Powered Inference [62.2436697657307]
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data.
We propose a framework for PPI based on Bayesian inference that allows researchers to develop new task-appropriate PPI methods easily.
arXiv Detail & Related papers (2024-05-09T18:08:58Z)
- Fair Streaming Principal Component Analysis: Statistical and Algorithmic Viewpoint [38.86637435197192]
We present a new theoretical and practical approach to fair Principal Component Analysis (PCA) using a new notion called probably approximately fair and optimal (PAFO) learnability.
We then provide its statistical guarantee in terms of PAFO-learnability, which is the first of its kind in the fair PCA literature.
Lastly, we verify the efficacy and memory efficiency of our algorithm on real-world datasets.
arXiv Detail & Related papers (2023-10-28T05:09:30Z)
- Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks [58.469818546042696]
We study the sample efficiency of OPE with human preference and establish a statistical guarantee for it.
By appropriately selecting the size of a ReLU network, we show that one can leverage any low-dimensional manifold structure in the Markov decision process.
arXiv Detail & Related papers (2023-10-16T16:27:06Z)
- Uncertainty-Aware Instance Reweighting for Off-Policy Learning [63.31923483172859]
We propose an Uncertainty-aware Inverse Propensity Score estimator (UIPS) for improved off-policy learning.
Experiment results on synthetic and three real-world recommendation datasets demonstrate the advantageous sample efficiency of the proposed UIPS estimator.
arXiv Detail & Related papers (2023-03-11T11:42:26Z)
- Efficient fair PCA for fair representation learning [21.990310743597174]
We propose a conceptually simple approach that allows for an analytic solution similar to standard PCA and can be kernelized.
Our methods have the same complexity as standard PCA, or kernel PCA, and run much faster than existing methods for fair PCA based on semidefinite programming or manifold optimization.
arXiv Detail & Related papers (2023-02-26T13:34:43Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- FAST-PCA: A Fast and Exact Algorithm for Distributed Principal Component Analysis [12.91948651812873]
Principal Component Analysis (PCA) is a fundamental data preprocessing tool in the world of machine learning.
This paper proposes a distributed PCA algorithm called FAST-PCA (Fast and exAct diSTributed PCA).
arXiv Detail & Related papers (2021-08-27T16:10:59Z)
- Federated Robustness Propagation: Sharing Adversarial Robustness in Federated Learning [98.05061014090913]
Federated learning (FL) emerges as a popular distributed learning schema that learns from a set of participating users without requiring raw data to be shared.
While adversarial training (AT) provides a sound solution for centralized learning, extending its usage to FL users has imposed significant challenges.
We show that existing FL techniques cannot effectively propagate adversarial robustness among non-iid users.
We propose a simple yet effective propagation approach that transfers robustness through carefully designed batch-normalization statistics.
arXiv Detail & Related papers (2021-06-18T15:52:33Z)
- Estimation of Individual Device Contributions for Incentivizing Federated Learning [8.426678774799859]
Federated learning (FL) is an emerging technique used to train a machine-learning model collaboratively using the data and computation resources of mobile devices.
This paper proposes a computation- and communication-efficient method of estimating a participating device's contribution level.
arXiv Detail & Related papers (2020-09-20T07:03:27Z)
- Privacy Preserving PCA for Multiparty Modeling [21.33430578478244]
PPPCA can accomplish multiparty cooperative execution of PCA while keeping plaintext data local.
The output of PPPCA can be sent directly to the data consumer to build any machine learning model.
Results show that the accuracy of a model built upon PPPCA is the same as that of a model built with PCA on centralized data.
arXiv Detail & Related papers (2020-02-06T04:16:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.