Proof-of-Contribution-Based Design for Collaborative Machine Learning on
Blockchain
- URL: http://arxiv.org/abs/2302.14031v1
- Date: Mon, 27 Feb 2023 18:43:11 GMT
- Title: Proof-of-Contribution-Based Design for Collaborative Machine Learning on
Blockchain
- Authors: Baturalp Buyukates and Chaoyang He and Shanshan Han and Zhiyong Fang
and Yupeng Zhang and Jieyi Long and Ali Farahanchi and Salman Avestimehr
- Abstract summary: Our goal is to design a data marketplace for such decentralized collaborative/federated learning applications.
In our design, we utilize a distributed storage infrastructure and an aggregator aside from the project owner and the trainers.
We execute the proposed data market through a blockchain smart contract.
- Score: 23.641069086247573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a project (model) owner that would like to train a model by
utilizing the local private data and compute power of interested data owners,
i.e., trainers. Our goal is to design a data marketplace for such decentralized
collaborative/federated learning applications that simultaneously provides i)
proof-of-contribution based reward allocation so that the trainers are
compensated based on their contributions to the trained model; ii)
privacy-preserving decentralized model training by avoiding any data movement
from data owners; iii) robustness against malicious parties (e.g., trainers
aiming to poison the model); iv) verifiability in the sense that the integrity,
i.e., correctness, of all computations in the data market protocol, including
contribution assessment and outlier detection, is verifiable through
zero-knowledge proofs; and v) efficient and universal design. We propose a
blockchain-based marketplace design to achieve all five objectives mentioned
above. In our design, we utilize a distributed storage infrastructure and an
aggregator aside from the project owner and the trainers. The aggregator is a
processing node that performs certain computations, including assessing trainer
contributions, removing outliers, and updating hyper-parameters. We execute the
proposed data market through a blockchain smart contract. The deployed smart
contract ensures that the project owner cannot evade payment, and honest
trainers are rewarded based on their contributions at the end of training.
Finally, we implement the building blocks of the proposed data market and
demonstrate their applicability in practical scenarios through extensive
experiments.
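As a rough illustration of the reward step described in the abstract, the following Python sketch splits a fixed budget among trainers in proportion to their contribution scores, after excluding trainers flagged as outliers. It is a minimal sketch under stated assumptions, not the paper's protocol: the function name `allocate_rewards`, the proportional rule, and the `flagged` input are all hypothetical, and the abstract does not say how scores are computed or how outliers are detected.

```python
def allocate_rewards(scores, budget, flagged=frozenset()):
    """Split `budget` among trainers in proportion to contribution scores.

    Assumptions (not taken from the paper): `scores` maps a trainer id to a
    non-negative contribution score produced by the aggregator, and `flagged`
    holds the trainers that outlier detection removed. Flagged trainers and
    trainers with a non-positive score receive nothing.
    """
    rewards = {trainer: 0.0 for trainer in scores}
    eligible = {
        trainer: score
        for trainer, score in scores.items()
        if trainer not in flagged and score > 0
    }
    total = sum(eligible.values())
    if total == 0:
        # Nobody contributed; the whole budget stays unallocated.
        return rewards
    for trainer, score in eligible.items():
        rewards[trainer] = budget * score / total
    return rewards


# Hypothetical example: trainer "c" was flagged, so "a" and "b" share the budget.
example = allocate_rewards({"a": 3.0, "b": 1.0, "c": 6.0}, 100.0, flagged={"c"})
```

In the design the abstract describes, a payout of this kind would be enforced by the smart contract at the end of training; the sketch only shows the arithmetic.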
Related papers
- Research on Data Right Confirmation Mechanism of Federated Learning based on Blockchain [0.069060054915724]
Federated learning can solve the privacy protection problem in distributed data mining and machine learning.
This paper proposes a data ownership confirmation mechanism based on blockchain and smart contract.
arXiv Detail & Related papers (2024-09-13T02:02:18Z)
- Enhancing Trust and Privacy in Distributed Networks: A Comprehensive Survey on Blockchain-based Federated Learning [51.13534069758711]
Decentralized approaches like blockchain offer a compelling solution by implementing a consensus mechanism among multiple entities.
Federated Learning (FL) enables participants to collaboratively train models while safeguarding data privacy.
This paper investigates the synergy between blockchain's security features and FL's privacy-preserving model training capabilities.
arXiv Detail & Related papers (2024-03-28T07:08:26Z)
- Blockchain-enabled Trustworthy Federated Unlearning [50.01101423318312]
Federated unlearning is a promising paradigm for protecting the data ownership of distributed clients.
Existing works require central servers to retain the historical model parameters from distributed clients.
This paper proposes a new blockchain-enabled trustworthy federated unlearning framework.
arXiv Detail & Related papers (2024-01-29T07:04:48Z)
- Semantic Information Marketing in The Metaverse: A Learning-Based Contract Theory Framework [68.8725783112254]
We address the problem of designing incentive mechanisms through which a virtual service provider (VSP) hires sensing IoT devices to sell their sensing data.
Due to the limited bandwidth, we propose to use semantic extraction algorithms to reduce the data delivered by the sensing IoT devices.
We propose a novel iterative contract design and use a new variant of multi-agent reinforcement learning (MARL) to solve the modelled multi-dimensional contract problem.
arXiv Detail & Related papers (2023-02-22T15:52:37Z)
- Mechanisms that Incentivize Data Sharing in Federated Learning [90.74337749137432]
We show how a naive scheme leads to catastrophic levels of free-riding where the benefits of data sharing are completely eroded.
We then introduce accuracy shaping based mechanisms to maximize the amount of data generated by each agent.
arXiv Detail & Related papers (2022-07-10T22:36:52Z)
- Leveraging Centric Data Federated Learning Using Blockchain For Integrity Assurance [14.347917009290814]
We propose a data-centric federated learning architecture leveraged by a public blockchain and smart contracts.
Our proposed solution provides a virtual public marketplace where developers, data scientists, and AI engineers can publish their models.
We enhance data quality and integrity through an incentive mechanism that rewards contributors for data contribution and verification.
arXiv Detail & Related papers (2022-06-09T19:06:05Z)
- OmniLytics: A Blockchain-based Secure Data Market for Decentralized Machine Learning [3.9256804549871553]
We propose OmniLytics, a secure data trading marketplace for machine learning applications.
Data owners can contribute their private data to collectively train an ML model requested by some model owners, and get compensated for their data contribution.
OmniLytics enables such model training while simultaneously providing 1) model security against curious data owners; 2) data security against curious model and data owners; 3) resilience to malicious data owners who provide faulty results to poison model training; and 4) resilience to a malicious model owner who intends to evade payment.
arXiv Detail & Related papers (2021-07-12T08:28:15Z)
- Decentralized Federated Learning Preserves Model and Data Privacy [77.454688257702]
We propose a fully decentralized approach that allows knowledge to be shared between trained models.
Students are trained on the output of their teachers via synthetically generated input data.
The results show that an untrained student model, trained on the teacher's output, reaches F1-scores comparable to the teacher's.
arXiv Detail & Related papers (2021-02-01T14:38:54Z)
- 2CP: Decentralized Protocols to Transparently Evaluate Contributivity in Blockchain Federated Learning Environments [9.885896204530878]
We introduce 2CP, a framework comprising two novel protocols for Federated Learning.
The Crowdsource Protocol allows an actor to bring a model forward for training and to use their own data to evaluate the contributions made to it.
The Consortium Protocol gives trainers the same guarantee even when no party owns the initial model and no dataset is available.
arXiv Detail & Related papers (2020-11-15T12:59:56Z)
- Analysis of Models for Decentralized and Collaborative AI on Blockchain [0.0]
We evaluate the use of several models and configurations in order to propose best practices when using the Self-Assessment incentive mechanism.
We compare several factors for each dataset when models are hosted in smart contracts on a public blockchain.
arXiv Detail & Related papers (2020-09-14T21:38:55Z)
- A Principled Approach to Data Valuation for Federated Learning [73.19984041333599]
Federated learning (FL) is a popular technique to train machine learning (ML) models on decentralized data sources.
The Shapley value (SV) defines a unique payoff scheme that satisfies many desiderata for a data value notion.
This paper proposes a variant of the SV amenable to FL, which we call the federated Shapley value.
arXiv Detail & Related papers (2020-09-14T04:37:54Z)
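For readers unfamiliar with the Shapley value mentioned in the last entry, the sketch below computes the classical Shapley value exactly by averaging each player's marginal contribution over all join orders. It illustrates the classical definition only, not the federated variant that paper proposes; the function name `shapley_values` and the toy utility table are hypothetical, and the approach enumerates every permutation, so it is practical only for a handful of players.

```python
from itertools import permutations


def shapley_values(players, utility):
    """Exact classical Shapley values.

    `utility` maps a frozenset of players to a number (for instance, the
    accuracy of a model trained on that coalition's data). Each player's
    value is its marginal contribution averaged over every join order.
    """
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += utility(joined) - utility(coalition)
            coalition = joined
    return {p: totals[p] / len(orders) for p in players}


# Hypothetical utility table for two data owners A and B.
TOY_UTILITY = {
    frozenset(): 0.0,
    frozenset({"A"}): 0.5,
    frozenset({"B"}): 0.25,
    frozenset({"A", "B"}): 1.0,
}

values = shapley_values(["A", "B"], TOY_UTILITY.__getitem__)
# A: (0.5 + 0.75) / 2 = 0.625, B: (0.5 + 0.25) / 2 = 0.375
```

The two values sum to the utility of the full coalition (1.0), which is the efficiency property that makes the Shapley value attractive as a payoff scheme.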