Improving Fairness for Data Valuation in Federated Learning
- URL: http://arxiv.org/abs/2109.09046v1
- Date: Sun, 19 Sep 2021 02:39:59 GMT
- Title: Improving Fairness for Data Valuation in Federated Learning
- Authors: Zhenan Fan, Huang Fang, Zirui Zhou, Jian Pei, Michael P. Friedlander,
Changxin Liu, Yong Zhang
- Abstract summary: We propose a new measure called completed federated Shapley value to improve the fairness of federated Shapley value.
It is shown under mild conditions that the matrix of all possible contributions by different subsets of the data owners is approximately low-rank, by leveraging concepts and tools from optimization.
- Score: 39.61504568047234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning is an emerging decentralized machine learning scheme that
allows multiple data owners to work collaboratively while ensuring data
privacy. The success of federated learning depends largely on the participation
of data owners. To sustain and encourage data owners' participation, it is
crucial to fairly evaluate the quality of the data provided by the data owners
and reward them correspondingly. Federated Shapley value, recently proposed by
Wang et al. [Federated Learning, 2020], is a measure for data value under the
framework of federated learning that satisfies many desired properties for data
valuation. However, there are still factors of potential unfairness in the
design of federated Shapley value because two data owners with the same local
data may not receive the same evaluation. We propose a new measure called
completed federated Shapley value to improve the fairness of federated Shapley
value. The design depends on completing a matrix consisting of all the possible
contributions by different subsets of the data owners. It is shown under mild
conditions that this matrix is approximately low-rank by leveraging concepts
and tools from optimization. Both theoretical analysis and empirical evaluation
verify that the proposed measure does improve fairness in many circumstances.
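The fairness issue described in the abstract can be illustrated with a small sketch. The code below computes exact Shapley values by enumerating subsets, then sums per-round Shapley values over only the owners selected in each round. The toy data, the `coverage` utility, the per-round `scale` factor, and the `round_wise_values` helper are all illustrative assumptions; this is a simplified stand-in for a round-wise valuation, not the definition used by Wang et al. or by this paper.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy data: owners A and B hold identical data, C holds different data.
DATA = {"A": {1, 2}, "B": {1, 2}, "C": {3}}


def coverage(owners):
    """Toy utility (an assumption): number of distinct items held by a coalition."""
    items = set()
    for owner in owners:
        items |= DATA[owner]
    return float(len(items))


def shapley_values(players, utility):
    """Exact Shapley values, enumerating every subset of the other players."""
    players = sorted(players)
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - 1 - k) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


def round_wise_values(rounds):
    """Sum per-round Shapley values over the owners selected in each round.

    Each round is (selected_owners, scale); `scale` is a made-up factor that
    models a round's utility shrinking as training progresses.
    """
    totals = {owner: 0.0 for owner in DATA}
    for selected, scale in rounds:
        per_round = shapley_values(selected, lambda s: scale * coverage(s))
        for owner, value in per_round.items():
            totals[owner] += value
    return totals


# Full participation: identical owners A and B receive the same value.
full = shapley_values(DATA, coverage)

# Partial participation: A is selected early, B late, C in both rounds.
rounds = [({"A", "C"}, 1.0), ({"B", "C"}, 0.5)]
totals = round_wise_values(rounds)

print("full participation:", full)
print("round-wise totals: ", totals)
```

With full participation, A and B receive equal values, as the symmetry property of the Shapley value requires. In the round-wise setting, A ends up with about twice B's total even though their data are identical, simply because A happened to be selected in the higher-utility round. Filling in the unobserved (round, subset) contributions, as the abstract proposes through matrix completion, is one way to remove this dependence on selection.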
Related papers
- Data Valuation and Detections in Federated Learning [4.899818550820576]
Federated Learning (FL) enables collaborative model training while preserving the privacy of raw data.
A challenge in this framework is the fair and efficient valuation of data, which is crucial for incentivizing clients to contribute high-quality data in the FL task.
This paper introduces a novel privacy-preserving method for evaluating client contributions and selecting relevant datasets without a pre-specified training algorithm in an FL task.
arXiv Detail & Related papers (2023-11-09T12:01:32Z)
- Shapley Value on Probabilistic Classifiers [6.163093930860032]
In the context of machine learning (ML), data valuation methods aim to equitably measure the contribution of each data point to the utility of an ML model.
Traditional Shapley-based data valuation methods may not effectively distinguish between beneficial and detrimental training data points.
We propose Probabilistic Shapley (P-Shapley) value by constructing a probability-wise utility function.
arXiv Detail & Related papers (2023-06-12T15:09:13Z)
- Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem and can in fact be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
- Fair and efficient contribution valuation for vertical federated learning [49.50442779626123]
Federated learning is a popular technology for training machine learning models on distributed data sources without sharing data.
The Shapley value (SV) is a provably fair contribution valuation metric originating from cooperative game theory.
We propose a contribution valuation metric called vertical federated Shapley value (VerFedSV) based on SV.
arXiv Detail & Related papers (2022-01-07T19:57:15Z)
- GTG-Shapley: Efficient and Accurate Participant Contribution Evaluation in Federated Learning [25.44023017628766]
Federated Learning (FL) bridges the gap between collaborative machine learning and preserving data privacy.
It is essential to fairly evaluate participants' contribution to the performance of the final FL model without exposing their private data.
We propose the Guided Truncation Gradient Shapley approach to address this challenge.
arXiv Detail & Related papers (2021-09-05T12:17:00Z)
- Representative & Fair Synthetic Data [68.8204255655161]
We present a framework to incorporate fairness constraints into the self-supervised learning process.
We generate a representative as well as fair version of the UCI Adult census data set.
We consider representative & fair synthetic data a promising future building block to teach algorithms not on historic worlds, but rather on the worlds that we strive to live in.
arXiv Detail & Related papers (2021-04-07T09:19:46Z)
- A Principled Approach to Data Valuation for Federated Learning [73.19984041333599]
Federated learning (FL) is a popular technique to train machine learning (ML) models on decentralized data sources.
The Shapley value (SV) defines a unique payoff scheme that satisfies many desiderata for a data value notion.
This paper proposes a variant of the SV amenable to FL, which we call the federated Shapley value.
arXiv Detail & Related papers (2020-09-14T04:37:54Z)
- Towards Efficient Data Valuation Based on the Shapley Value [65.4167993220998]
We study the problem of data valuation by utilizing the Shapley value.
The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value.
We propose a repertoire of efficient algorithms for approximating the Shapley value.
arXiv Detail & Related papers (2019-02-27T00:22:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.