DICE: Data Influence Cascade in Decentralized Learning
- URL: http://arxiv.org/abs/2507.06931v1
- Date: Wed, 09 Jul 2025 15:13:44 GMT
- Title: DICE: Data Influence Cascade in Decentralized Learning
- Authors: Tongtian Zhu, Wenhao Li, Can Wang, Fengxiang He
- Abstract summary: We develop a framework to estimate \textbf{D}ata \textbf{I}nfluence \textbf{C}ascad\textbf{E} (DICE) in a decentralized environment. DICE lays the foundations for applications including selecting suitable collaborators and identifying malicious behaviors.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decentralized learning offers a promising approach to crowdsource data consumption and computational workloads across geographically distributed compute nodes interconnected through peer-to-peer networks, accommodating exponentially increasing demand. However, proper incentives are still absent, considerably discouraging participation. Our vision is that a fair incentive mechanism relies on fair attribution of contributions to participating nodes, which faces non-trivial challenges arising from localized connections that make influence ``cascade'' through a decentralized network. To overcome this, we design the first method to estimate \textbf{D}ata \textbf{I}nfluence \textbf{C}ascad\textbf{E} (DICE) in a decentralized environment. Theoretically, the framework derives tractable approximations of influence cascades over arbitrary neighbor hops, suggesting that the influence cascade is determined by an interplay of data, communication topology, and the curvature of the loss landscape. DICE also lays the foundations for applications including selecting suitable collaborators and identifying malicious behaviors. Project page is available at https://raiden-zhu.github.io/blog/2025/DICE/.
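To make the cascade idea concrete, here is a minimal NumPy sketch of a first-order influence cascade, assuming a DICE-style estimate in which a training point's one-hop influence on a neighbor is the inner product of its training gradient with the neighbor's validation gradient, weighted by the gossip mixing weight. Function names and the mixing-matrix convention are illustrative, not the paper's actual formulation.

```python
import numpy as np

def one_hop_influence(grad_train_i, grad_val_j, w_ij, lr):
    """First-order estimate of how a training point on node i changes
    node j's validation loss after one gossip step: a local step moves
    node i's model by -lr * grad_train_i, and mixing forwards a w_ij
    fraction of that step to node j."""
    return -lr * w_ij * float(grad_train_i @ grad_val_j)

def cascade_influence(train_grads, val_grads, W, lr, hops=2):
    """Extend the estimate over multiple neighbor hops by powering the
    gossip mixing matrix W (n x n, doubly stochastic); entry (i, j) of
    W^hops is the weight with which node i's step reaches node j."""
    reach = np.linalg.matrix_power(W, hops)
    n = W.shape[0]
    influence = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            influence[i, j] = one_hop_influence(
                train_grads[i], val_grads[j], reach[i, j], lr)
    return influence  # influence[i, j]: effect of node i's data on node j
```

The abstract's mention of loss-landscape curvature suggests the actual DICE estimate also carries Hessian-dependent correction terms that this first-order sketch omits.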
Related papers
- Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration [66.43954501171292]
We introduce Catalyst Acceleration and propose an accelerated Decentralized Federated Learning algorithm called DFedCata.
DFedCata consists of two main components: the Moreau envelope function, which addresses parameter inconsistencies, and Nesterov's extrapolation step, which accelerates the aggregation phase.
Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions.
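A rough sketch of the two components named above, under the assumption that the Moreau envelope enters as a proximal term on the local objective and that Nesterov extrapolation is applied to the gossip-averaged iterate; mu, beta, and the update order are illustrative rather than DFedCata's exact schedule.

```python
import numpy as np

def local_step_moreau(theta, anchor, grad_fn, lr, mu):
    """Gradient step on the proximal (Moreau-envelope style) surrogate
    f(theta) + (mu / 2) * ||theta - anchor||^2, which pulls drifting
    local parameters back toward a shared anchor to curb inconsistency."""
    return theta - lr * (grad_fn(theta) + mu * (theta - anchor))

def aggregate_with_extrapolation(mixed, mixed_prev, beta):
    """Nesterov-style extrapolation of the gossip-averaged model to
    accelerate the aggregation phase (beta is a momentum weight)."""
    return mixed + beta * (mixed - mixed_prev)
```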
arXiv Detail & Related papers (2024-10-09T06:17:16Z) - Privacy Preserving Semi-Decentralized Mean Estimation over Intermittently-Connected Networks [59.43433767253956]
We consider the problem of privately estimating the mean of vectors distributed across different nodes of an unreliable wireless network.
In a semi-decentralized setup, nodes can collaborate with their neighbors to compute a local consensus, which they relay to a central server.
We study the tradeoff between collaborative relaying and privacy leakage due to the data sharing among nodes.
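A minimal sketch of the setup described above, assuming Gaussian noise is added before relaying and that a weight matrix captures the local consensus; the noise mechanism and weighting are illustrative, not the paper's exact scheme.

```python
import numpy as np

def private_semi_decentralized_mean(x, A, sigma, rng=None):
    """x: (n, d) local vectors; A: (n, n) relaying weights, where row i
    holds the weights node i applies to its neighbors' noisy vectors
    (zero for nodes whose links are down); sigma: privacy noise scale."""
    rng = rng or np.random.default_rng(0)
    noisy = x + sigma * rng.standard_normal(x.shape)  # peer-to-peer noise
    consensus = A @ noisy                  # local consensus at each node
    return consensus.mean(axis=0)          # server combines relayed values
```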
arXiv Detail & Related papers (2024-06-06T06:12:15Z) - Decentralized Directed Collaboration for Personalized Federated Learning [39.29794569421094]
We concentrate on Decentralized Personalized Federated Learning (DPFL), which performs personalized model training in a distributed manner.
We propose a directed collaboration framework by incorporating \textbf{D}ecentralized \textbf{Fed}erated \textbf{P}artial \textbf{G}radient \textbf{P}ush (DFedPGP).
arXiv Detail & Related papers (2024-05-28T06:52:19Z) - Impact of network topology on the performance of Decentralized Federated Learning [4.618221836001186]
Decentralized machine learning is gaining momentum, addressing infrastructure challenges and privacy concerns.
This study investigates the interplay between network structure and learning performance using three network topologies and six data distribution methods.
We highlight the challenges in transferring knowledge from peripheral to central nodes, attributed to a dilution effect during model aggregation.
arXiv Detail & Related papers (2024-02-28T11:13:53Z) - DSCom: A Data-Driven Self-Adaptive Community-Based Framework for Influence Maximization in Social Networks [3.97535858363999]
We reformulate the problem on the attributed network and leverage the node attributes to estimate the closeness between connected nodes.
Specifically, we propose a machine learning-based framework, named DSCom, to address this problem.
In contrast to previous theoretical works, we carefully design empirical experiments with parameterized diffusion models based on real-world social networks.
arXiv Detail & Related papers (2023-11-18T14:03:43Z) - Sparse Decentralized Federated Learning [35.32297764027417]
Decentralized Federated Learning (DFL) enables collaborative model training without a central server but faces challenges in efficiency, stability, and trustworthiness. We introduce a sparsity constraint on the shared model, leading to Sparse DFL (SDFL), and propose a novel algorithm, CEPS. Numerical experiments validate the effectiveness of the proposed algorithm in improving communication efficiency while maintaining a high level of trustworthiness.
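The sparsity constraint could, for instance, be realized by a top-k mask on the shared update; the following is a generic sketch of that idea, not the CEPS algorithm itself.

```python
import numpy as np

def top_k_sparsify(update, k):
    """Zero out all but the k largest-magnitude entries of a model
    update before sharing it with neighbors, reducing communication."""
    if k >= update.size:
        return update.copy()
    keep = np.argpartition(np.abs(update), -k)[-k:]  # top-k indices
    sparse = np.zeros_like(update)
    sparse[keep] = update[keep]
    return sparse
```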
arXiv Detail & Related papers (2023-08-31T12:22:40Z) - CAFIN: Centrality Aware Fairness inducing IN-processing for Unsupervised Representation Learning on Graphs [10.042608422528392]
We propose CAFIN, a centrality-aware fairness-inducing framework to tune the representations generated by existing frameworks.
We deploy it on GraphSAGE and showcase its efficacy on two downstream tasks - Node Classification and Link Prediction.
arXiv Detail & Related papers (2023-04-10T05:40:09Z) - Collaborative Mean Estimation over Intermittently Connected Networks with Peer-To-Peer Privacy [86.61829236732744]
This work considers the problem of Distributed Mean Estimation (DME) over networks with intermittent connectivity.
The goal is to learn a global statistic over the data samples localized across distributed nodes with the help of a central server.
We study the tradeoff between collaborative relaying and privacy leakage due to the additional data sharing among nodes.
arXiv Detail & Related papers (2023-02-28T19:17:03Z) - Byzantine-Robust Decentralized Learning via ClippedGossip [61.03711813598128]
We propose a ClippedGossip algorithm for Byzantine-robust consensus optimization.
We demonstrate the encouraging empirical performance of ClippedGossip under a large number of attacks.
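The core step can be sketched as gossip averaging in which each neighbor's deviation from the local model is norm-clipped before being applied, which limits what a Byzantine neighbor can inject; the clipping radius tau and the weight convention here are illustrative.

```python
import numpy as np

def clipped_gossip_step(theta_i, neighbor_models, weights, tau):
    """One robust gossip step: move toward each neighbor's model, but
    clip each neighbor's deviation to norm at most tau.

    theta_i         : local model, float array of shape (d,)
    neighbor_models : list of neighbor models, each shape (d,)
    weights         : matching gossip weights summing to at most 1
    tau             : clipping radius bounding any single neighbor's pull
    """
    out = theta_i.copy()
    for w, theta_j in zip(weights, neighbor_models):
        diff = theta_j - theta_i
        norm = np.linalg.norm(diff)
        if norm > tau:
            diff = diff * (tau / norm)  # clip the deviation, not the model
        out += w * diff
    return out
```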
arXiv Detail & Related papers (2022-02-03T12:04:36Z) - Blockchain Assisted Decentralized Federated Learning (BLADE-FL): Performance Analysis and Resource Allocation [119.19061102064497]
We propose a decentralized FL framework by integrating blockchain into FL, namely, blockchain assisted decentralized federated learning (BLADE-FL).
In a round of the proposed BLADE-FL, each client broadcasts its trained model to other clients, competes to generate a block based on the received models, and then aggregates the models from the generated block before its local training of the next round.
We explore the impact of lazy clients on the learning performance of BLADE-FL, and characterize the relationship among the optimal K, the learning parameters, and the proportion of lazy clients.
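A toy sketch of one round and of the lazy-client behavior the analysis studies; block generation and the mining competition are abstracted away, and modeling laziness as copying another client's model is an assumption made purely for illustration.

```python
import numpy as np

def blade_fl_round(local_models, n_lazy=0):
    """One simplified BLADE-FL round: clients broadcast trained models,
    a block records them, and every client aggregates the block before
    the next round of local training.

    local_models : list of (d,) float arrays, one per client
    n_lazy       : number of lazy clients, modeled here as clients that
                   plagiarize the first honest model instead of training
    """
    honest = local_models[: len(local_models) - n_lazy]
    lazy = [honest[0].copy() for _ in range(n_lazy)]  # copied models
    block = honest + lazy             # models recorded in the new block
    return np.mean(block, axis=0)     # aggregation shared by all clients
```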
arXiv Detail & Related papers (2021-01-18T07:19:08Z) - Privacy Amplification by Decentralization [0.0]
We introduce a novel relaxation of local differential privacy (LDP) that naturally arises in fully decentralized protocols.
We study a decentralized model of computation where a token performs a walk on the network graph and is updated sequentially by the party who receives it.
We prove that the privacy-utility trade-offs of our algorithms significantly improve upon LDP, and in some cases even match what can be achieved with methods based on trusted/secure aggregation and shuffling.
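A minimal sketch of the token-walk computation model described above: the token visits one party at a time, each applying its local update before forwarding it; the uniform random choice of next neighbor is an assumption for illustration.

```python
import numpy as np

def token_walk(token, parties, neighbors, steps, rng=None):
    """Run a token for `steps` hops on the network graph.

    parties   : list of update functions, one per node; each receives the
                current token value and returns the updated value
    neighbors : adjacency list, neighbors[v] = list of nodes adjacent to v
    """
    rng = rng or np.random.default_rng(0)
    v = 0  # starting node
    for _ in range(steps):
        token = parties[v](token)            # sequential local update
        v = int(rng.choice(neighbors[v]))    # forward to a random neighbor
    return token
```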
arXiv Detail & Related papers (2020-12-09T21:33:33Z) - Quantized Decentralized Stochastic Learning over Directed Graphs [54.005946490293496]
We consider a decentralized learning problem where data points are distributed among computing nodes communicating over a directed graph. As the model size grows, decentralized learning faces a major bottleneck: the communication load of each node transmitting messages (model updates) to its neighbors. We propose a quantized decentralized learning algorithm over directed graphs based on the push-sum algorithm from decentralized consensus optimization.
arXiv Detail & Related papers (2020-02-23T18:25:39Z)
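A compact sketch of one quantized push-sum step over a directed graph, assuming a uniform quantizer on outgoing messages and a column-stochastic mixing matrix; the actual algorithm's quantization scheme and error handling will differ.

```python
import numpy as np

def quantize(v, levels=256):
    """Uniform quantizer applied to an outgoing message to cut the
    communication load (the number of levels is illustrative)."""
    lo, hi = v.min(), v.max()
    if hi == lo:
        return v.copy()
    scale = (hi - lo) / (levels - 1)
    return lo + np.round((v - lo) / scale) * scale

def quantized_push_sum_step(x, w, P):
    """One push-sum step over a directed graph.

    x : (n, d) node values; w : (n,) push-sum weights;
    P : (n, n) column-stochastic mixing matrix of the directed graph.
    Each node's running estimate is x / w, which corrects for the
    imbalance of directed communication.
    """
    xq = np.apply_along_axis(quantize, 1, x)  # quantize outgoing messages
    x_next = P @ xq
    w_next = P @ w
    return x_next, w_next, x_next / w_next[:, None]  # values, weights, estimates
```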