Analysis of Models for Decentralized and Collaborative AI on Blockchain
- URL: http://arxiv.org/abs/2009.06756v2
- Date: Tue, 22 Sep 2020 03:14:47 GMT
- Title: Analysis of Models for Decentralized and Collaborative AI on Blockchain
- Authors: Justin D. Harris
- Abstract summary: We evaluate the use of several models and configurations in order to propose best practices when using the Self-Assessment incentive mechanism.
We compare several factors for each dataset when models are hosted in smart contracts on a public blockchain.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning has recently enabled large advances in artificial
intelligence, but these results can be highly centralized. The large datasets
required are generally proprietary; predictions are often sold on a per-query
basis; and published models can quickly become out of date without effort to
acquire more data and maintain them. Published proposals to provide models and
data for free for certain tasks include Microsoft Research's Decentralized and
Collaborative AI on Blockchain. The framework allows participants to
collaboratively build a dataset and use smart contracts to share a continuously
updated model on a public blockchain. The initial proposal gave an overview of
the framework but omitted many details of the models used and of the incentive
mechanisms in real-world scenarios. In this work, we evaluate the use of
several models and configurations in order to propose best practices when using
the Self-Assessment incentive mechanism so that models can remain accurate and
well-intentioned participants who submit correct data have a chance to profit.
We have analyzed simulations for each of three models: Perceptron, Naïve
Bayes, and a Nearest Centroid Classifier, with three different datasets:
predicting a sport with user activity from Endomondo, sentiment analysis on
movie reviews from IMDB, and determining if a news article is fake. We compare
several factors for each dataset when models are hosted in smart contracts on a
public blockchain: their accuracy over time, balances of a good and bad user,
and transaction costs (or gas) for deploying, updating, collecting refunds, and
collecting rewards. A free and open source implementation for the Ethereum
blockchain and simulations written in Python is provided at
https://github.com/microsoft/0xDeCA10B. This version has updated gas costs
using newer optimizations written after the original publication.
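To make the Self-Assessment flow concrete, below is a minimal Python sketch of the mechanism as the abstract describes it: a participant stakes a deposit with each data submission, can reclaim it after a waiting period if the current model still agrees with the submitted label, and submissions the model disagrees with forfeit their stake to a reporter. All names here (SelfAssessmentPerceptron, add_data, refund, report) are hypothetical and simplified; the real Solidity contracts and Python simulations are in the 0xDeCA10B repository linked above, and the full mechanism has additional conditions (e.g., on who may report and how rewards are split).

    import time

    class SelfAssessmentPerceptron:
        """Toy off-chain simulation of a Perceptron hosted in a smart
        contract with the Self-Assessment incentive mechanism."""

        def __init__(self, n_features, min_deposit=1.0, refund_wait=60):
            self.w = [0.0] * n_features      # Perceptron weights
            self.b = 0.0                     # bias term
            self.min_deposit = min_deposit   # stake required per submission
            self.refund_wait = refund_wait   # seconds before refunds open
            self.submissions = {}            # id -> (x, label, sender, deposit, t)
            self.balances = {}               # sender -> simulated balance
            self.next_id = 0                 # monotonic submission counter

        def predict(self, x):
            s = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
            return 1 if s > 0 else 0

        def add_data(self, sender, x, label, deposit):
            """Stake a deposit, record the data, and update the model."""
            assert deposit >= self.min_deposit, "deposit too small"
            self.balances[sender] = self.balances.get(sender, 0.0) - deposit
            sub_id = self.next_id
            self.next_id += 1
            self.submissions[sub_id] = (x, label, sender, deposit, time.time())
            if self.predict(x) != label:     # classic Perceptron update rule
                delta = 1 if label == 1 else -1
                self.w = [wi + delta * xi for wi, xi in zip(self.w, x)]
                self.b += delta
            return sub_id

        def refund(self, sub_id, caller):
            """Submitter reclaims the stake if the model still agrees."""
            x, label, sender, deposit, t = self.submissions[sub_id]
            assert caller == sender, "only the submitter can claim a refund"
            assert time.time() - t >= self.refund_wait, "refund window not open"
            assert self.predict(x) == label, "model now disagrees; no refund"
            del self.submissions[sub_id]
            self.balances[sender] += deposit

        def report(self, sub_id, reporter):
            """A reporter takes the stake behind data the model disagrees with."""
            x, label, sender, deposit, t = self.submissions[sub_id]
            assert self.predict(x) != label, "model agrees; nothing to report"
            del self.submissions[sub_id]
            self.balances[reporter] = self.balances.get(reporter, 0.0) + deposit

On chain, every one of these calls costs gas, which is why the paper compares the transaction costs of deploying, updating, and collecting refunds and rewards across the three model types.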
Related papers
- Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models [146.18107944503436]
Molmo is a new family of VLMs that are state-of-the-art in their class of openness.
Our key innovation is a novel, highly detailed image caption dataset collected entirely from human annotators.
We will be releasing all of our model weights, captioning and fine-tuning data, and source code in the near future.
arXiv Detail & Related papers (2024-09-25T17:59:51Z)
- Blockchain-based Federated Learning with Secure Aggregation in Trusted Execution Environment for Internet-of-Things [20.797220195954065]
This paper proposes a blockchain-based Federated Learning (FL) framework with an Intel Software Guard Extensions (SGX)-based Trusted Execution Environment (TEE) to securely aggregate local models in the Industrial Internet-of-Things (IIoT).
In FL, local models can be tampered with by attackers, so a global model generated from the tampered local models can be erroneous. The proposed framework therefore leverages a blockchain network for secure model aggregation.
Nodes can verify the authenticity of the aggregated model, run a blockchain consensus mechanism to ensure the integrity of the model, and add it to the distributed ledger for tamper-proof storage.
arXiv Detail & Related papers (2023-04-25T15:00:39Z)
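As a rough sketch of the aggregation and verification steps in the entry above (plain federated averaging plus a model digest; the paper's actual protocol runs the aggregation inside an SGX enclave and uses blockchain consensus for verification):

    import hashlib
    import json

    def aggregate(local_models):
        """Federated averaging: element-wise mean of local weight vectors.
        In the paper's design this step would run inside the TEE."""
        n = len(local_models)
        dim = len(local_models[0])
        return [sum(m[i] for m in local_models) / n for i in range(dim)]

    def model_digest(weights):
        """Hash the aggregated model so ledger nodes can check that the
        stored model matches what the aggregator produced."""
        return hashlib.sha256(json.dumps(weights).encode()).hexdigest()

    local_models = [[0.1, 0.2], [0.3, 0.4], [0.2, 0.6]]  # toy local updates
    global_model = aggregate(local_models)
    print(global_model, model_digest(global_model))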
- Proof-of-Contribution-Based Design for Collaborative Machine Learning on Blockchain [23.641069086247573]
Our goal is to design a data marketplace for such decentralized collaborative/federated learning applications.
In our design, we utilize a distributed storage infrastructure and an aggregator in addition to the project owner and the trainers.
We execute the proposed data market through a blockchain smart contract.
arXiv Detail & Related papers (2023-02-27T18:43:11Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Instead, you are given access to a set of expert models and their predictions, alongside some limited information about the dataset used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- Mechanisms that Incentivize Data Sharing in Federated Learning [90.74337749137432]
We show how a naive scheme leads to catastrophic levels of free-riding where the benefits of data sharing are completely eroded.
We then introduce mechanisms based on accuracy shaping to maximize the amount of data generated by each agent.
arXiv Detail & Related papers (2022-07-10T22:36:52Z)
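The free-riding failure is easy to see numerically. The toy payoff model below is our own illustration, not the paper's formal mechanism: under a naive scheme every agent receives the full global model regardless of contribution, so contributing nothing maximizes payoff; under a (crudely simplified) accuracy-shaped scheme the model served back to an agent is capped by its own contribution, so contributing becomes worthwhile.

    def global_accuracy(total_points):
        """Toy learning curve: accuracy grows with the total shared data."""
        return 1.0 - 1.0 / (1.0 + 0.01 * total_points)

    COST_PER_POINT = 0.0005   # an agent's cost of collecting one data point
    OTHERS = 1000             # data points contributed by everyone else

    def payoff_naive(my_points):
        # Naive scheme: everyone receives the full global model.
        return global_accuracy(OTHERS + my_points) - COST_PER_POINT * my_points

    def payoff_shaped(my_points):
        # Crude accuracy shaping: the accuracy served back is capped by a
        # function of the agent's own contribution, so free riders receive
        # a weak model.
        served = min(global_accuracy(OTHERS + my_points),
                     global_accuracy(10 * my_points))
        return served - COST_PER_POINT * my_points

    for name, payoff in [("naive", payoff_naive), ("shaped", payoff_shaped)]:
        best = max(range(0, 2001, 50), key=payoff)
        print(name, "best contribution:", best)   # naive: 0 (free-riding)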
- APPFLChain: A Privacy Protection Distributed Artificial-Intelligence Architecture Based on Federated Learning and Consortium Blockchain [6.054775780656853]
We propose a new system architecture called APPFLChain.
It is an integrated architecture of a Hyperledger Fabric-based blockchain and a federated-learning paradigm.
Our new system can maintain a high degree of security and privacy, as users do not need to share sensitive personal information with the server.
arXiv Detail & Related papers (2022-06-26T05:30:07Z)
- OmniLytics: A Blockchain-based Secure Data Market for Decentralized Machine Learning [3.9256804549871553]
We propose OmniLytics, a secure data trading marketplace for machine learning applications.
Data owners can contribute their private data to collectively train an ML model requested by some model owners and get compensated for their data contributions.
OmniLytics enables such model training while simultaneously providing 1) model security against curious data owners; 2) data security against curious model and data owners; 3) resilience to malicious data owners who provide faulty results to poison model training; and 4) resilience to malicious model owners who intend to evade payment.
arXiv Detail & Related papers (2021-07-12T08:28:15Z)
- Test-time Collective Prediction [73.74982509510961]
Multiple parties in machine learning want to jointly make predictions on future test points.
Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters.
We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
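A minimal sketch of the setting in the entry above, using a generic confidence-weighted ensemble (the paper's mechanism is more sophisticated, but the interface is the point): each agent exposes only a prediction function, never its data or parameters.

    import numpy as np

    class Agent:
        """An agent keeps its model private and exposes only predictions
        plus a self-reported validation accuracy."""
        def __init__(self, weights, val_accuracy):
            self._w = weights                 # private parameters
            self.val_accuracy = val_accuracy

        def predict_proba(self, x):
            p = 1.0 / (1.0 + np.exp(-x @ self._w))   # private logistic model
            return np.array([1.0 - p, p])

    def collective_predict(agents, x):
        """Combine class probabilities, weighting agents by accuracy."""
        weights = np.array([a.val_accuracy for a in agents])
        weights = weights / weights.sum()
        probs = sum(w * a.predict_proba(x) for w, a in zip(weights, agents))
        return int(np.argmax(probs))

    rng = np.random.default_rng(0)
    agents = [Agent(rng.normal(size=3), acc) for acc in (0.9, 0.7, 0.6)]
    print(collective_predict(agents, np.array([0.5, -1.0, 2.0])))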
- 2CP: Decentralized Protocols to Transparently Evaluate Contributivity in Blockchain Federated Learning Environments [9.885896204530878]
We introduce 2CP, a framework comprising two novel protocols for Federated Learning.
The Crowdsource Protocol allows an actor to bring a model forward for training and to use their own data to evaluate the contributions made to it.
The Consortium Protocol gives trainers the same guarantee even when no party owns the initial model and no dataset is available.
arXiv Detail & Related papers (2020-11-15T12:59:56Z)
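One simple way to score contributivity in this spirit is leave-one-out marginal gain on a holdout set, sketched below; this is a generic measure for illustration, not necessarily the exact scoring used by either 2CP protocol.

    import numpy as np

    def fed_avg(updates):
        """Aggregate trainers' weight vectors by simple averaging."""
        return np.mean(updates, axis=0)

    def holdout_accuracy(weights, X, y):
        """Evaluate a toy linear classifier on the evaluator's holdout data."""
        return float(((X @ weights > 0).astype(int) == y).mean())

    def contributivity(updates, X_holdout, y_holdout):
        """Score trainer i by the accuracy drop when its update is left out."""
        full = holdout_accuracy(fed_avg(updates), X_holdout, y_holdout)
        return [full - holdout_accuracy(fed_avg(updates[:i] + updates[i + 1:]),
                                        X_holdout, y_holdout)
                for i in range(len(updates))]

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 4))
    w_true = np.array([1.0, -2.0, 0.5, 0.0])
    y = (X @ w_true > 0).astype(int)
    # Three trainers with increasingly noisy updates of the true weights.
    updates = [w_true + rng.normal(scale=s, size=4) for s in (0.1, 0.5, 3.0)]
    print(contributivity(updates, X, y))  # the noisiest trainer should score lowest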
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
- AvgOut: A Simple Output-Probability Measure to Eliminate Dull Responses [97.50616524350123]
We build dialogue models that are dynamically aware of what utterances or tokens are dull without any feature-engineering.
The first model, MinAvgOut, directly maximizes the diversity score through the output distributions of each batch.
The second model, Label Fine-Tuning (LFT), prepends to the source sequence a label continuously scaled by the diversity score to control the diversity level.
The third model, RL, adopts Reinforcement Learning and treats the diversity score as a reward signal.
arXiv Detail & Related papers (2020-01-15T18:32:06Z)
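The underlying measure is simple to sketch. Below is a minimal re-implementation of the general idea (our own toy version, not the authors' code): average the decoder's output distributions, treat tokens with high average probability as dull, and score a response by how much it avoids them.

    import numpy as np

    VOCAB = ["i", "don't", "know", "movie", "review", "plot", "."]

    def avg_output_distribution(step_distributions):
        """Mean of per-step output distributions; high-mass tokens here are
        the generic ones a dull decoder falls back on."""
        return np.mean(step_distributions, axis=0)

    def diversity_score(token_ids, avg_dist):
        """1 minus the mean dullness of a response's tokens, so responses
        built from overused tokens score low."""
        return 1.0 - float(np.mean([avg_dist[t] for t in token_ids]))

    rng = np.random.default_rng(2)
    # Toy decoder outputs biased toward generic tokens ("i don't know").
    base = np.array([0.30, 0.25, 0.25, 0.05, 0.05, 0.05, 0.05])
    steps = rng.dirichlet(base * 50, size=256)
    avg_dist = avg_output_distribution(steps)

    dull = [VOCAB.index(t) for t in ("i", "don't", "know")]
    rare = [VOCAB.index(t) for t in ("movie", "plot")]
    print(diversity_score(dull, avg_dist), diversity_score(rare, avg_dist))

A score like this can be maximized directly (MinAvgOut), fed in as a scaling label (LFT), or used as a reward signal (RL), matching the three models described above.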
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.