Game of Coding: Sybil Resistant Decentralized Machine Learning with Minimal Trust Assumption
- URL: http://arxiv.org/abs/2410.05540v2
- Date: Thu, 17 Oct 2024 01:35:34 GMT
- Title: Game of Coding: Sybil Resistant Decentralized Machine Learning with Minimal Trust Assumption
- Authors: Hanzaleh Akbari Nodehi, Viveck R. Cadambe, Mohammad Ali Maddah-Ali
- Abstract summary: This paper investigates the implications of increasing the number of nodes in the game of coding framework.
We show that despite the increased flexibility for the adversary with an increasing number of adversarial nodes, having more power is not beneficial for the adversary.
- Score: 20.564198591600647
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Coding theory plays a crucial role in ensuring data integrity and reliability across various domains, from communication to computation and storage systems. However, its reliance on trust assumptions for data recovery poses significant challenges, particularly in emerging decentralized systems where trust is scarce. To address this, the game of coding framework was introduced, offering insights into strategies for data recovery within incentive-oriented environments. The focus of the earliest version of the game of coding was limited to scenarios involving only two nodes. This paper investigates the implications of increasing the number of nodes in the game of coding framework, particularly focusing on scenarios with one honest node and multiple adversarial nodes. We demonstrate that despite the increased flexibility for the adversary with an increasing number of adversarial nodes, having more power is not beneficial for the adversary and is not detrimental to the data collector, making this scheme sybil-resistant. Furthermore, we outline optimal strategies for the data collector in terms of accepting or rejecting the inputs, and characterize the optimal noise distribution for the adversary.
Related papers
- A Secure and Private Distributed Bayesian Federated Learning Design [56.92336577799572]
Distributed Federated Learning (DFL) enables decentralized model training across large-scale systems without a central parameter server. DFL faces three critical challenges: privacy leakage from honest-but-curious neighbors, slow convergence due to the lack of central coordination, and vulnerability to Byzantine adversaries aiming to degrade model accuracy. We propose a novel DFL framework that integrates Byzantine robustness, privacy preservation, and convergence acceleration.
arXiv Detail & Related papers (2026-02-23T16:12:02Z) - Improving Detection of Rare Nodes in Hierarchical Multi-Label Learning [1.4213292010741236]
We propose a weighted loss objective for neural networks that combines node-wise imbalance weighting with focal weighting components. We observe improvements in recall by up to a factor of five on benchmark datasets, along with statistically significant gains in $F_1$ score.
arXiv Detail & Related papers (2026-02-09T18:34:17Z) - Game of Coding: Coding Theory in the Presence of Rational Adversaries, Motivated by Decentralized Machine Learning [16.147310961390534]
Coding theory plays a crucial role in enabling reliable communication, storage, and computation. In some emerging decentralized applications, particularly in decentralized machine learning (DeML), participating nodes are rewarded for accepted contributions. We introduce the game of coding, a novel game-theoretic framework that extends coding theory to trust-minimized settings.
arXiv Detail & Related papers (2026-01-05T18:04:32Z) - FRAG: Toward Federated Vector Database Management for Collaborative and Secure Retrieval-Augmented Generation [1.3824176915623292]
This paper introduces Federated Retrieval-Augmented Generation (FRAG), a novel database management paradigm tailored for the growing needs of retrieval-augmented generation (RAG) systems.
FRAG enables mutually-distrusted parties to collaboratively perform Approximate $k$-Nearest Neighbor (ANN) searches on encrypted query vectors and encrypted data stored in distributed vector databases.
arXiv Detail & Related papers (2024-10-17T06:57:29Z) - Decentralized Federated Anomaly Detection in Smart Grids: A P2P Gossip Approach [0.44328715570014865]
This paper introduces a novel decentralized federated anomaly detection scheme based on two main gossip protocols, namely Random Walk and Epidemic.
Our approach yields a notable 35% improvement in training time compared to conventional Federated Learning.
arXiv Detail & Related papers (2024-07-20T10:45:06Z) - Privacy-Preserving Distributed Learning for Residential Short-Term Load Forecasting [11.185176107646956]
Power system load data can inadvertently reveal the daily routines of residential users, posing a risk to their property security.
We introduce a Markovian Switching-based distributed training framework, the convergence of which is substantiated through rigorous theoretical analysis.
Case studies employing real-world power system load data validate the efficacy of our proposed algorithm.
arXiv Detail & Related papers (2024-02-02T16:39:08Z) - Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may instead stem from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z) - Unsupervised Finetuning [80.58625921631506]
We propose two strategies to combine source and target data into unsupervised finetuning.
The motivation of the former strategy is to add a small portion of source data back to occupy their pretrained representation space.
The motivation of the latter strategy is to increase the data density and help learn a more compact representation.
arXiv Detail & Related papers (2021-10-18T17:57:05Z) - Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
Local exchange of estimates, which are generated from private data, can allow inference of the data itself.
Perturbations are generally chosen independently at every agent, resulting in a significant performance loss.
We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible.
arXiv Detail & Related papers (2020-10-23T10:35:35Z) - Information Obfuscation of Graph Neural Networks [96.8421624921384]
We study the problem of protecting sensitive attributes by information obfuscation when learning with graph structured data.
We propose a framework to locally filter out pre-determined sensitive attributes via adversarial training with the total variation and the Wasserstein distance.
arXiv Detail & Related papers (2020-09-28T17:55:04Z) - Robust Machine Learning via Privacy/Rate-Distortion Theory [34.28921458311185]
Robust machine learning formulations have emerged to address the prevalent vulnerability of deep neural networks to adversarial examples.
Our work draws the connection between optimal robust learning and the privacy-utility tradeoff problem, which is a generalization of the rate-distortion problem.
This information-theoretic perspective sheds light on the fundamental tradeoff between robustness and clean data performance.
arXiv Detail & Related papers (2020-07-22T21:34:59Z) - Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)