Data Sharing Markets
- URL: http://arxiv.org/abs/2107.08630v2
- Date: Tue, 20 Jul 2021 06:31:23 GMT
- Title: Data Sharing Markets
- Authors: Mohammad Rasouli, Michael I. Jordan
- Abstract summary: We study a setup where each agent can be both buyer and seller of data.
We consider two cases: bilateral data exchange (trading data with data) and unilateral data exchange (trading data with money).
- Score: 95.13209326119153
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: With the growing use of distributed machine learning techniques, there is a
growing need for data markets that allow agents to share data with each other.
Nevertheless, data has unique features that separate it from other commodities,
including replicability, cost of sharing, and ability to distort. We study a
setup where each agent can be both buyer and seller of data. For this setup, we
consider two cases: bilateral data exchange (trading data with data) and
unilateral data exchange (trading data with money). We model bilateral sharing
as a network formation game and show the existence of a strongly stable outcome
under the top agents property by allowing limited complementarity. We propose the
ordered match algorithm, which can find the stable outcome in O(N^2) time (N is the
number of agents). For the unilateral sharing, under the assumption of an additive
cost structure, we construct competitive prices that can implement any social
welfare maximizing outcome. Finally, for this setup when agents have private
information, we propose the mixed-VCG mechanism, which uses zero-cost data
distortion of data sharing with its isolated impact to achieve budget balance
while truthfully implementing socially optimal outcomes to the exact level of
budget imbalance of standard VCG mechanisms. Mixed-VCG uses data distortions as
data money for this purpose. We further relax the zero-cost data distortion
assumption by proposing distorted-mixed-VCG. We also extend our model and
results to data sharing via incremental inquiries and differential privacy
costs.
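The abstract refers to the budget imbalance of standard VCG mechanisms. As a rough illustration only, the sketch below implements the textbook VCG mechanism with Clarke pivot payments over a finite set of outcomes. It is not the paper's mixed-VCG or distorted-mixed-VCG mechanism. The function name `vcg`, the outcome labels, and the valuations are hypothetical, chosen to show a case where the payments collected do not sum to zero.

```python
from typing import Dict, List, Tuple


def vcg(valuations: List[Dict[str, float]]) -> Tuple[str, List[float]]:
    """Textbook VCG with Clarke pivot payments (illustrative, not the paper's mechanism).

    valuations[i][o] is agent i's reported value for outcome o.
    Returns the welfare-maximizing outcome and each agent's payment.
    """
    # Sort outcome labels so that ties are broken deterministically.
    outcomes = sorted(valuations[0])

    def welfare(outcome: str, exclude: int = -1) -> float:
        # Total reported value of an outcome, optionally leaving one agent out.
        return sum(v[outcome] for i, v in enumerate(valuations) if i != exclude)

    chosen = max(outcomes, key=welfare)

    payments = []
    for i in range(len(valuations)):
        # Agent i pays the externality it imposes on the others: the best
        # welfare they could reach without i, minus what they get now.
        best_without_i = max(welfare(o, exclude=i) for o in outcomes)
        payments.append(best_without_i - welfare(chosen, exclude=i))
    return chosen, payments


# Hypothetical example: one agent gains from sharing, two bear a sharing cost.
example = [
    {"share": 5.0, "no_share": 0.0},
    {"share": -2.0, "no_share": 0.0},
    {"share": -2.0, "no_share": 0.0},
]
chosen, payments = vcg(example)
surplus = sum(payments)  # nonzero here, so the budget is not balanced
print(chosen, payments, surplus)
```

In this made-up instance the mechanism collects a positive total payment that is not redistributed to the agents, which is the kind of imbalance the abstract says mixed-VCG addresses by using data distortions as "data money".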
Related papers
- An Instrumental Value for Data Production and its Application to Data Pricing [107.98697414652479]
This paper develops an approach for capturing the instrumental value of data production processes.
We show how they connect to classic notions of information design and signals in information economics.
arXiv Detail & Related papers (2024-12-24T03:53:57Z)
- Wasserstein Markets for Differentially-Private Data [1.4266656344673316]
Data markets provide a means to enable wider access as well as determine the appropriate privacy-utility trade-off.
Existing data market frameworks either require a trusted third party to perform expensive valuations or are unable to capture the nature of data value.
This paper proposes a valuation mechanism based on the Wasserstein distance for differentially-private data, and corresponding procurement mechanisms.
arXiv Detail & Related papers (2024-12-03T17:40:26Z)
- Incentives in Private Collaborative Machine Learning [56.84263918489519]
Collaborative machine learning involves training models on data from multiple parties.
We introduce differential privacy (DP) as an incentive.
We empirically demonstrate the effectiveness and practicality of our approach on synthetic and real-world datasets.
arXiv Detail & Related papers (2024-04-02T06:28:22Z)
- CaPS: Collaborative and Private Synthetic Data Generation from Distributed Sources [5.898893619901382]
We propose a framework for the collaborative and private generation of synthetic data from distributed data holders.
We replace the trusted aggregator with secure multi-party computation protocols and output privacy via differential privacy (DP).
We demonstrate the applicability and scalability of our approach for the state-of-the-art select-measure-generate algorithms MWEM+PGM and AIM.
arXiv Detail & Related papers (2024-02-13T17:26:32Z)
- Mechanisms that Incentivize Data Sharing in Federated Learning [90.74337749137432]
We show how a naive scheme leads to catastrophic levels of free-riding where the benefits of data sharing are completely eroded.
We then introduce accuracy shaping based mechanisms to maximize the amount of data generated by each agent.
arXiv Detail & Related papers (2022-07-10T22:36:52Z)
- Strategic Coalition for Data Pricing in IoT Data Markets [32.38170282930876]
This paper considers a market for trading Internet of Things (IoT) data that is used to train machine learning models.
The data is supplied to the market platform through a network and the price of such data is controlled based on the value it brings to the machine learning model.
arXiv Detail & Related papers (2022-06-15T19:48:10Z)
- VFed-SSD: Towards Practical Vertical Federated Advertising [53.08038962443853]
We propose a semi-supervised split distillation framework VFed-SSD to alleviate the two limitations.
Specifically, we develop a self-supervised task MatchedPair Detection (MPD) to exploit the vertically partitioned unlabeled data.
Our framework provides an efficient federation-enhanced solution for real-time display advertising with minimal deploying cost and significant performance lift.
arXiv Detail & Related papers (2022-05-31T17:45:30Z)
- Spending Privacy Budget Fairly and Wisely [7.975975942400017]
Differentially private (DP) synthetic data generation is a practical method for improving access to data.
One issue inherent to DP is that the "privacy budget" is generally "spent" evenly across features in the data set.
We develop ensemble methods that distribute the privacy budget "wisely" to maximize predictive accuracy of models trained on DP data.
arXiv Detail & Related papers (2022-04-27T13:13:56Z)
- Representative & Fair Synthetic Data [68.8204255655161]
We present a framework to incorporate fairness constraints into the self-supervised learning process.
We generate a representative as well as fair version of the UCI Adult census data set.
We consider representative & fair synthetic data a promising future building block to teach algorithms not on historic worlds, but rather on the worlds that we strive to live in.
arXiv Detail & Related papers (2021-04-07T09:19:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.