Synthetic generation of online social networks through homophily
- URL: http://arxiv.org/abs/2509.02762v1
- Date: Tue, 02 Sep 2025 19:09:08 GMT
- Title: Synthetic generation of online social networks through homophily
- Authors: Alejandro Buitrago López, Javier Pastor-Galindo, José A. Ruipérez-Valiente
- Abstract summary: This work proposes a homophily-based algorithm that produces synthetic microblogging social networks such as X. The framework is validated by generating synthetic OSNs at four scales (10^3-10^6 nodes) and benchmarking them against a real-world Bluesky network comprising 4 million users.
- Score: 44.05636851266385
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Online social networks (OSNs) have become increasingly relevant for studying social behavior and information diffusion. Nevertheless, they are limited by restricted access to real OSN data due to privacy, legal, and platform-related constraints. In response, synthetic social networks serve as a viable approach to support controlled experimentation, but current generators reproduce only topology and overlook attribute-driven homophily and semantic realism. This work proposes a homophily-based algorithm that produces synthetic microblogging social networks such as X. The model creates a social graph for a given number of users, integrating semantic affinity among user attributes, stochastic variation in link formation, triadic closure to foster clustering, and long-range connections to ensure global reachability. A systematic grid search is used to calibrate five hyperparameters (affinity strength, noise, closure probability, distant link probability, and candidate pool size) for reaching five structural values observed in real social networks (density, clustering coefficient, LCC proportion, normalized shortest path, and modularity). The framework is validated by generating synthetic OSNs at four scales (10^3-10^6 nodes), and benchmarking them against a real-world Bluesky network comprising 4 million users. Comparative results show that the framework reliably reproduces the structural properties of the real network. Overall, the framework outperforms leading importance-sampling techniques applied to the same baseline. The generated graphs capture topological realism and yield attribute-driven communities that align with sociological expectations, providing a realistic, scalable testbed that liberates social researchers from relying on live digital platforms.
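The generation steps named in the abstract (attribute affinity, stochastic variation, triadic closure, long-range links, and a candidate pool) can be illustrated with a minimal sketch. This is not the authors' implementation: the Jaccard affinity over hypothetical topic interests, the link probability `clamp(affinity + noise) ** alpha`, the default parameter values, and every function name below are assumptions made for illustration. Only two of the five structural targets (density and LCC proportion) are computed here.

```python
import random
from collections import deque


def affinity(a, b):
    """Jaccard overlap between two interest sets (stand-in for semantic affinity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def generate_network(n, alpha=2.0, noise=0.1, p_closure=0.3,
                     p_distant=0.02, pool_size=30, n_topics=12, seed=0):
    """Return (adjacency dict, interest sets) for an undirected toy graph."""
    rng = random.Random(seed)
    # Hypothetical attribute model: each user follows three random topics.
    interests = [frozenset(rng.sample(range(n_topics), 3)) for _ in range(n)]
    adj = {u: set() for u in range(n)}

    def link(u, v):
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    # 1) Homophily: score a random candidate pool by noisy affinity.
    for u in range(n):
        for v in rng.sample(range(n), min(pool_size, n)):
            score = affinity(interests[u], interests[v]) + rng.gauss(0.0, noise)
            prob = min(max(score, 0.0), 1.0) ** alpha
            if rng.random() < prob:
                link(u, v)

    # 2) Triadic closure: occasionally connect to a neighbour's neighbour.
    for u in range(n):
        if adj[u] and rng.random() < p_closure:
            v = rng.choice(sorted(adj[u]))
            link(u, rng.choice(sorted(adj[v])))

    # 3) Long-range links: rare uniform connections across the graph.
    for u in range(n):
        if rng.random() < p_distant:
            link(u, rng.randrange(n))

    return adj, interests


def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) / 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def lcc_proportion(adj):
    """Share of nodes in the largest connected component (BFS)."""
    if not adj:
        return 0.0
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best / len(adj)


if __name__ == "__main__":
    adj, interests = generate_network(300)
    print("density:", round(density(adj), 4))
    print("LCC proportion:", round(lcc_proportion(adj), 3))
```

A grid search in the spirit of the abstract would loop over combinations of `alpha`, `noise`, `p_closure`, `p_distant`, and `pool_size`, keeping the setting whose metrics fall closest to the target values. The clustering coefficient, normalized shortest path, and modularity would need additional code or a graph library.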
Related papers
- Agent-based simulation of online social networks and disinformation [35.38015952060615]
This paper presents a simulation framework that models synthetic social networks with agents endowed with demographic-based personality traits and finite-state behavioral automata. A generative module powered by a large language model (LLM) produces context-aware social media posts consistent with each agent's profile and memory. A red module implements DISARM-inspired disinformation campaigns executed by malicious agents targeting simulated audiences.
arXiv Detail & Related papers (2025-12-26T16:56:45Z)
- FedHERO: A Federated Learning Approach for Node Classification Task on Heterophilic Graphs [55.51300642911766]
Federated Graph Learning (FGL) empowers clients to collaboratively train graph neural networks (GNNs) in a distributed manner. FGL methods usually require that the graph data owned by all clients is homophilic to ensure similar neighbor distribution patterns of nodes. We propose FedHERO, an FGL framework designed to harness and share insights from heterophilic graphs effectively.
arXiv Detail & Related papers (2025-04-29T22:23:35Z)
- LLM-Based Multi-Agent Systems are Scalable Graph Generative Models [73.28294528654885]
GraphAgent-Generator (GAG) is a novel simulation-based framework for dynamic, text-attributed social graph generation. GAG simulates the temporal node and edge generation processes for zero-shot social graph generation. The resulting graphs exhibit adherence to seven key macroscopic network properties, achieving an 11% improvement in microscopic graph structure metrics.
arXiv Detail & Related papers (2024-10-13T12:57:08Z)
- BotDGT: Dynamicity-aware Social Bot Detection with Dynamic Graph Transformers [34.517897902575946]
BotDGT is a novel framework that not only considers the topological structure but also effectively incorporates the dynamic nature of social networks.
A structural module is employed to acquire topological information from each historical snapshot.
A temporal module is proposed to integrate historical context and model the evolving behavior patterns exhibited by social bots and legitimate users.
arXiv Detail & Related papers (2024-04-23T14:19:13Z)
- Learning Persistent Community Structures in Dynamic Networks via Topological Data Analysis [2.615648035076649]
We propose a novel deep graph clustering framework with temporal consistency regularization on inter-community structures.
MFC is a matrix factorization-based deep graph clustering algorithm that preserves node embedding.
TopoReg is introduced to ensure the preservation of topological similarity between inter-community structures over time intervals.
arXiv Detail & Related papers (2024-01-06T11:29:19Z)
- Adversarial Socialbots Modeling Based on Structural Information Principles [24.339397435628214]
Socialbots imitate human behavior to propagate misinformation, leading to an ongoing competition between socialbots and detectors.
We propose a mathematical Structural Information principles-based Adversarial Socialbots Modeling framework, namely SIASM, to enable more accurate and effective modeling of adversarial behaviors.
arXiv Detail & Related papers (2023-12-13T12:32:12Z)
- Self-supervised Hypergraph Representation Learning for Sociological Analysis [52.514283292498405]
We propose a fundamental methodology to support the further fusion of data mining techniques and sociological behavioral criteria.
First, we propose an effective hypergraph awareness and a fast line graph construction framework.
Second, we propose a novel hypergraph-based neural network to learn social influence flowing from users to users.
arXiv Detail & Related papers (2022-12-22T01:20:29Z)
- Realistic Synthetic Social Networks with Graph Neural Networks [1.8275108630751837]
We evaluate the potential of Graph Neural Network (GNN) models for network generation for synthetic social networks.
We include social network specific measurements which allow evaluation of how realistically synthetic networks behave.
We find that the Gated Recurrent Attention Network (GRAN) extends well to social networks and, compared with the popular rule-based Recursive-MATrix (R-MAT) benchmark generator, is better able to replicate realistic structural dynamics.
arXiv Detail & Related papers (2022-12-15T14:04:27Z)
- BScNets: Block Simplicial Complex Neural Networks [79.81654213581977]
Simplicial neural networks (SNNs) have recently emerged as the newest direction in graph learning.
We present the Block Simplicial Complex Neural Networks (BScNets) model for link prediction.
BScNets outperforms state-of-the-art models by a significant margin while maintaining low costs.
arXiv Detail & Related papers (2021-12-13T17:35:54Z)
- Amortized Probabilistic Detection of Communities in Graphs [39.56798207634738]
We propose a simple framework for amortized community detection.
We combine the expressive power of GNNs with recent methods for amortized clustering.
We evaluate several models from our framework on synthetic and real datasets.
arXiv Detail & Related papers (2020-10-29T16:18:48Z)
- Multi-level Graph Convolutional Networks for Cross-platform Anchor Link Prediction [47.047999403900775]
Cross-platform account matching plays a significant role in social network analytics.
We propose a novel framework that considers multi-level graph convolutions on both local network structure and hypergraph structure.
The proposed method overcomes the data insufficiency problem of existing work and does not necessarily rely on user demographic information.
arXiv Detail & Related papers (2020-06-02T22:01:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.