Fedivertex: a Graph Dataset based on Decentralized Social Networks for Trustworthy Machine Learning
- URL: http://arxiv.org/abs/2505.20882v1
- Date: Tue, 27 May 2025 08:26:50 GMT
- Title: Fedivertex: a Graph Dataset based on Decentralized Social Networks for Trustworthy Machine Learning
- Authors: Marc Damie, Edwige Cyffers
- Abstract summary: We introduce Fedivertex, a new dataset of 182 graphs, covering seven social networks from the Fediverse. We release the dataset along with a Python package to facilitate its use, and illustrate its utility on several tasks.
- Score: 2.8851756275902476
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decentralized machine learning - where each client keeps its own data locally and uses its own computational resources to collaboratively train a model by exchanging peer-to-peer messages - is increasingly popular, as it enables better scalability and control over the data. A major challenge in this setting is that learning dynamics depend on the topology of the communication graph, which motivates the use of real graph datasets for benchmarking decentralized algorithms. Unfortunately, existing graph datasets are largely limited to for-profit social networks crawled at a fixed point in time and often collected at the user scale, where links are heavily influenced by the platform and its recommendation algorithms. The Fediverse, which includes several free and open-source decentralized social media platforms such as Mastodon, Misskey, and Lemmy, offers an interesting real-world alternative. We introduce Fedivertex, a new dataset of 182 graphs, covering seven social networks from the Fediverse, crawled weekly over 14 weeks. We release the dataset along with a Python package to facilitate its use, and illustrate its utility on several tasks, including a new defederation task, which captures a process of link deletion observed on these networks.
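The defederation task mentioned in the abstract concerns links that disappear between crawls. The following is a minimal sketch of that idea, assuming each weekly snapshot is available as a set of directed (source, target) edges between servers. The function name, the edge representation, and the example server names are hypothetical and are not taken from the Fedivertex Python package.

```python
# Hypothetical illustration: each snapshot is a set of directed edges,
# written as (source_server, target_server) pairs. This is an assumed
# representation, not the format used by the Fedivertex package.

def removed_links(earlier, later):
    """Return the edges present in the earlier snapshot but absent from the later one."""
    return set(earlier) - set(later)


# Two made-up weekly snapshots of the same network.
week_1 = {
    ("a.example", "b.example"),
    ("a.example", "c.example"),
    ("b.example", "c.example"),
}
week_2 = {
    ("a.example", "b.example"),
    ("b.example", "c.example"),
}

deleted = removed_links(week_1, week_2)
print(sorted(deleted))
```

In a learning setting, edges such as those in `deleted` could serve as positive labels for predicting which links will be removed, though the paper's exact task definition may differ from this sketch.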
Related papers
- Hybrid FedGraph: An efficient hybrid federated learning algorithm using graph convolutional neural network [13.786989442742588]
Federated learning is an emerging paradigm for decentralized training of machine learning models on distributed clients.
We propose a graph convolutional neural network to capture feature-sharing information while learning features from a subset of clients.
We also develop a simple but effective clustering algorithm that aggregates features produced by the deep neural networks of each client while preserving data privacy.
arXiv Detail & Related papers (2024-04-15T04:02:39Z)
- GNN-LoFI: a Novel Graph Neural Network through Localized Feature-based Histogram Intersection [51.608147732998994]
Graph neural networks are increasingly becoming the framework of choice for graph-based machine learning.
We propose a new graph neural network architecture that substitutes classical message passing with an analysis of the local distribution of node features.
arXiv Detail & Related papers (2024-01-17T13:04:23Z)
- Redundancy-Free Self-Supervised Relational Learning for Graph Clustering [13.176413653235311]
We propose a novel self-supervised deep graph clustering method named Redundancy-Free Graph Clustering (R$^2$FGC).
It extracts the attribute- and structure-level relational information from both global and local views based on an autoencoder and a graph autoencoder.
Experiments on widely used benchmark datasets validate the superiority of R$^2$FGC over state-of-the-art baselines.
arXiv Detail & Related papers (2023-09-09T06:18:50Z)
- Distributed Learning over Networks with Graph-Attention-Based Personalization [49.90052709285814]
We propose a graph-based personalized algorithm (GATTA) for distributed deep learning.
In particular, the personalized model in each agent is composed of a global part and a node-specific part.
By treating each agent as one node in a graph and the node-specific parameters as its features, the benefits of the graph attention mechanism can be inherited.
arXiv Detail & Related papers (2023-05-22T13:48:30Z)
- Federated Graph-based Networks with Shared Embedding [1.323497585762675]
We propose Federated Graph-based Networks with Shared Embedding (Feras), which uses shared embedding data to train the network and avoids the direct sharing of original data.
Feras enables the training of current graph-based models in the federated learning framework while addressing privacy concerns.
arXiv Detail & Related papers (2022-10-03T12:51:15Z)
- Exploiting Shared Representations for Personalized Federated Learning [54.65133770989836]
We propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client.
Our algorithm harnesses the distributed computational power across clients to perform many local updates with respect to the low-dimensional local parameters for every update of the representation.
This result is of interest beyond federated learning to a broad class of problems in which we aim to learn a shared low-dimensional representation among data distributions.
arXiv Detail & Related papers (2021-02-14T05:36:25Z)
- BayGo: Joint Bayesian Learning and Information-Aware Graph Optimization [48.30183416069897]
BayGo is a novel fully decentralized joint Bayesian learning and graph optimization framework.
We show that our framework achieves faster convergence and higher accuracy compared to fully-connected and star topology graphs.
arXiv Detail & Related papers (2020-11-09T11:16:55Z)
- Exploiting Heterogeneous Graph Neural Networks with Latent Worker/Task Correlation Information for Label Aggregation in Crowdsourcing [72.34616482076572]
Crowdsourcing has attracted much attention because it makes it convenient to collect labels from non-expert workers instead of experts.
We propose a novel framework based on graph neural networks for aggregating crowd labels.
arXiv Detail & Related papers (2020-10-25T10:12:37Z)
- Community Detection Clustering via Gumbel Softmax [0.0]
We propose a community detection method that clusters the nodes of various graph datasets.
The role of deep learning in modeling the interactions between nodes in a network is reshaping the scientific fields concerned with graph network analysis.
arXiv Detail & Related papers (2020-05-05T17:55:31Z)
- Tensor Graph Convolutional Networks for Multi-relational and Robust Learning [74.05478502080658]
This paper introduces a tensor-graph convolutional network (TGCN) for scalable semi-supervised learning (SSL) from data associated with a collection of graphs that are represented by a tensor.
The proposed architecture achieves markedly improved performance relative to standard GCNs, copes with state-of-the-art adversarial attacks, and leads to remarkable SSL performance over protein-to-protein interaction networks.
arXiv Detail & Related papers (2020-03-15T02:33:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.