OmniLytics+: A Secure, Efficient, and Affordable Blockchain Data Market for Machine Learning through Off-Chain Processing
- URL: http://arxiv.org/abs/2406.06477v1
- Date: Wed, 17 Apr 2024 14:41:14 GMT
- Title: OmniLytics+: A Secure, Efficient, and Affordable Blockchain Data Market for Machine Learning through Off-Chain Processing
- Authors: Songze Li, Mingzhe Liu, Mengqi Chen
- Abstract summary: We propose OmniLytics+, the first decentralized data market built upon blockchain and smart contract technologies.
The storage and processing overheads are securely offloaded from blockchain verifiers.
Experiments demonstrate the effectiveness of OmniLytics+ in training large ML models in the presence of malicious data owners.
- Score: 10.055818984984
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The rapid development of large machine learning (ML) models requires a massive amount of training data, resulting in booming demand for data sharing and trading through data markets. Traditional centralized data markets suffer from a low level of security, and emerging decentralized platforms face efficiency and privacy challenges. In this paper, we propose OmniLytics+, the first decentralized data market, built upon blockchain and smart contract technologies, to simultaneously achieve 1) data (resp., model) privacy for the data (resp., model) owner; 2) robustness against malicious data owners; 3) efficient data validation and aggregation. Specifically, adopting the zero-knowledge (ZK) rollup paradigm, OmniLytics+ proposes to secret-share encrypted local gradients, computed from the encrypted global model, with a set of untrusted off-chain servers, which collaboratively generate a ZK proof of the validity of the gradient. In this way, the storage and processing overheads are securely offloaded from blockchain verifiers, significantly improving the privacy, efficiency, and affordability over existing rollup solutions. We implement the proposed OmniLytics+ data market as an Ethereum smart contract [41]. Extensive experiments demonstrate the effectiveness of OmniLytics+ in training large ML models in the presence of malicious data owners, and the substantial advantages of OmniLytics+ in gas cost and execution time over baselines.
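The abstract describes data owners secret-sharing their local gradients with a set of off-chain servers. As a rough illustration of the secret-sharing idea only, the following Python sketch uses plain additive sharing over a prime field. It is not the paper's protocol: it omits the encryption of the gradients and the global model, the ZK proof of gradient validity, and the smart contract. The modulus `PRIME`, the fixed-point factor `SCALE`, and all function names are hypothetical choices made for this example.

```python
import secrets

# Hypothetical parameters chosen for illustration; the abstract does not specify them.
PRIME = 2**61 - 1   # field modulus (a Mersenne prime)
SCALE = 10**6       # fixed-point scaling factor for real-valued gradients


def encode(x: float) -> int:
    """Map a real number to a field element using fixed-point encoding."""
    return round(x * SCALE) % PRIME


def decode(v: int) -> float:
    """Map a field element back to a real number (upper half = negatives)."""
    if v > PRIME // 2:
        v -= PRIME
    return v / SCALE


def share(gradient, n_servers):
    """Split every coordinate into n_servers additive shares modulo PRIME."""
    shares = [[] for _ in range(n_servers)]
    for g in gradient:
        parts = [secrets.randbelow(PRIME) for _ in range(n_servers - 1)]
        last = (encode(g) - sum(parts)) % PRIME
        for server_share, part in zip(shares, parts + [last]):
            server_share.append(part)
    return shares


def aggregate_at_server(shares_from_owners):
    """One server adds up, coordinate-wise, the shares it received."""
    return [sum(column) % PRIME for column in zip(*shares_from_owners)]


def reconstruct(server_sums):
    """Combine the per-server sums to recover the aggregated gradient."""
    return [decode(sum(column) % PRIME) for column in zip(*server_sums)]


def secure_sum(gradients, n_servers=3):
    """Aggregate gradients without any single server seeing one in the clear."""
    per_owner = [share(g, n_servers) for g in gradients]
    server_sums = [
        aggregate_at_server([owner[j] for owner in per_owner])
        for j in range(n_servers)
    ]
    return reconstruct(server_sums)
```

Calling `secure_sum([[0.5, -1.25], [1.5, 0.25]])` returns the coordinate-wise sum of the two gradients, while each individual server only ever holds shares that look uniformly random on their own.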
Related papers
- Enhancing Trust and Privacy in Distributed Networks: A Comprehensive Survey on Blockchain-based Federated Learning [51.13534069758711]
Decentralized approaches like blockchain offer a compelling solution by implementing a consensus mechanism among multiple entities.
Federated Learning (FL) enables participants to collaboratively train models while safeguarding data privacy.
This paper investigates the synergy between blockchain's security features and FL's privacy-preserving model training capabilities.
arXiv Detail & Related papers (2024-03-28T07:08:26Z) - zkDFL: An efficient and privacy-preserving decentralized federated learning with zero-knowledge proof [3.517233208696287]
Federated learning (FL) has been widely adopted in various fields of study and business.
Traditional centralized FL systems suffer from serious issues.
We propose a zero-knowledge proof (ZKP)-based aggregator (zkDFL).
arXiv Detail & Related papers (2023-12-01T17:00:30Z) - A Blockchain Solution for Collaborative Machine Learning over IoT [0.31410859223862103]
Federated learning (FL) and blockchain technologies have emerged as promising approaches to address these challenges.
We present a novel IoT solution that combines the incremental learning vector quantization algorithm (XuILVQ) with blockchain technology.
Our proposed architecture addresses the shortcomings of existing blockchain-based FL solutions by reducing computational and communication overheads while maintaining data privacy and security.
arXiv Detail & Related papers (2023-11-23T18:06:05Z) - Blockchain-empowered Federated Learning for Healthcare Metaverses: User-centric Incentive Mechanism with Optimal Data Freshness [66.3982155172418]
We first design a user-centric privacy-preserving framework based on decentralized Federated Learning (FL) for healthcare metaverses.
We then utilize Age of Information (AoI) as an effective data-freshness metric and propose an AoI-based contract theory model under Prospect Theory (PT) to motivate sensing data sharing.
arXiv Detail & Related papers (2023-07-29T12:54:03Z) - Blockchain-Based Federated Learning: Incentivizing Data Sharing and Penalizing Dishonest Behavior [0.0]
This paper proposes a comprehensive framework that integrates data trust in federated learning with InterPlanetary File System, blockchain, and smart contracts.
The proposed model is effective in improving the accuracy of federated learning models while ensuring the security and fairness of the data-sharing process.
The research paper also presents a decentralized federated learning platform that successfully trained a CNN model on the MNIST dataset.
arXiv Detail & Related papers (2023-07-19T23:05:49Z) - Dynamic Datasets and Market Environments for Financial Reinforcement Learning [68.11692837240756]
FinRL-Meta is a library that processes dynamic datasets from real-world markets into gym-style market environments.
We provide examples and reproduce popular research papers as stepping stones for users to design new trading strategies.
We also deploy the library on cloud platforms so that users can visualize their own results and assess the relative performance.
arXiv Detail & Related papers (2023-04-25T22:17:31Z) - Augmented Bilinear Network for Incremental Multi-Stock Time-Series Classification [83.23129279407271]
We propose a method to efficiently retain the knowledge available in a neural network pre-trained on a set of securities.
In our method, the prior knowledge encoded in a pre-trained neural network is maintained by keeping existing connections fixed.
This knowledge is adjusted for the new securities by a set of augmented connections, which are optimized using the new data.
arXiv Detail & Related papers (2022-07-23T18:54:10Z) - APPFLChain: A Privacy Protection Distributed Artificial-Intelligence Architecture Based on Federated Learning and Consortium Blockchain [6.054775780656853]
We propose a new system architecture called APPFLChain.
It is an integrated architecture of a Hyperledger Fabric-based blockchain and a federated-learning paradigm.
Our new system can maintain a high degree of security and privacy as users do not need to share sensitive personal information to the server.
arXiv Detail & Related papers (2022-06-26T05:30:07Z) - OmniLytics: A Blockchain-based Secure Data Market for Decentralized Machine Learning [3.9256804549871553]
We propose OmniLytics, a secure data trading marketplace for machine learning applications.
Data owners can contribute their private data to collectively train an ML model requested by some model owners, and get compensated for their data contribution.
OmniLytics enables such model training while simultaneously providing 1) model security against curious data owners; 2) data security against curious model and data owners; 3) resilience to malicious data owners who provide faulty results to poison model training; and 4) resilience to a malicious model owner who intends to evade the payment.
arXiv Detail & Related papers (2021-07-12T08:28:15Z) - Blockchain Assisted Decentralized Federated Learning (BLADE-FL) with Lazy Clients [124.48732110742623]
We propose a novel framework by integrating blockchain into Federated Learning (FL).
BLADE-FL performs well in terms of privacy preservation, tamper resistance, and effective cooperation of learning.
It gives rise to a new problem of training deficiency, caused by lazy clients who plagiarize others' trained models and add artificial noises to conceal their cheating behaviors.
arXiv Detail & Related papers (2020-12-02T12:18:27Z) - Faster Secure Data Mining via Distributed Homomorphic Encryption [108.77460689459247]
Homomorphic Encryption (HE) has recently been receiving more and more attention for its capability to perform computations over encrypted data.
We propose a novel general distributed HE-based data mining framework as one step towards solving the scaling problem.
We verify the efficiency and effectiveness of our new framework by testing over various data mining algorithms and benchmark data-sets.
arXiv Detail & Related papers (2020-06-17T18:14:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.