Synthesizing Diverse Network Flow Datasets with Scalable Dynamic Multigraph Generation
- URL: http://arxiv.org/abs/2505.07777v1
- Date: Mon, 12 May 2025 17:26:48 GMT
- Title: Synthesizing Diverse Network Flow Datasets with Scalable Dynamic Multigraph Generation
- Authors: Arya Grayeli, Vipin Swarup, Steven E. Noel
- Abstract summary: We introduce a novel machine learning model for generating high-fidelity synthetic network flow datasets. Our results demonstrate improvements in accuracy over previous large-scale graph generation methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Obtaining real-world network datasets is often challenging because of privacy, security, and computational constraints. In the absence of such datasets, graph generative models become essential tools for creating synthetic datasets. In this paper, we introduce a novel machine learning model for generating high-fidelity synthetic network flow datasets that are representative of real-world networks. Our approach involves the generation of dynamic multigraphs using a stochastic Kronecker graph generator for structure generation and a tabular generative adversarial network for feature generation. We further employ an XGBoost (eXtreme Gradient Boosting) model for graph alignment, ensuring accurate overlay of features onto the generated graph structure. We evaluate our model using new metrics that assess both the accuracy and diversity of the synthetic graphs. Our results demonstrate improvements in accuracy over previous large-scale graph generation methods while maintaining similar efficiency. We also explore the trade-off between accuracy and diversity in synthetic graph dataset creation, a topic not extensively covered in related works. Our contributions include the synthesis and evaluation of large real-world netflow datasets and the definition of new metrics for evaluating synthetic graph generative models.
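The structure-generation step named in the abstract, a stochastic Kronecker graph generator, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_kronecker_edges`, the 2x2 initiator values, and the edge count are assumptions chosen for illustration, and the sketch leaves out timestamps, the tabular GAN feature generation, and the XGBoost alignment step. Each edge is drawn by descending through the levels of the Kronecker product, picking one quadrant of the initiator per level; repeated edges are kept, so the result is a multigraph.

```python
import random
from collections import Counter


def sample_kronecker_edges(initiator, levels, num_edges, seed=0):
    """Sample directed edges (repeats allowed) from a 2x2 stochastic Kronecker model.

    The graph has 2**levels nodes. `initiator` is a 2x2 matrix of
    non-negative weights (hypothetical values, not taken from the paper).
    """
    rng = random.Random(seed)
    # The four quadrants of the initiator and their (unnormalized) weights.
    quadrants = [(0, 0), (0, 1), (1, 0), (1, 1)]
    weights = [initiator[r][c] for r, c in quadrants]
    edges = []
    for _ in range(num_edges):
        src = dst = 0
        # Each level picks one quadrant, contributing one bit to each endpoint.
        for _ in range(levels):
            r, c = rng.choices(quadrants, weights=weights, k=1)[0]
            src = (src << 1) | r
            dst = (dst << 1) | c
        edges.append((src, dst))
    return edges


# Hypothetical initiator: a dense "core" quadrant and a sparse "periphery".
initiator = [[0.9, 0.5], [0.5, 0.2]]
edges = sample_kronecker_edges(initiator, levels=4, num_edges=200, seed=42)

# Parallel edges are kept, so count how often each (src, dst) pair occurs.
multiplicity = Counter(edges)
print(len(multiplicity), "distinct edges out of", len(edges), "sampled")
```

Because every edge costs only `levels` random draws, this style of sampling scales linearly in the number of edges, which is one reason Kronecker-style generators are commonly used for large graphs.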
Related papers
- PROVCREATOR: Synthesizing Complex Heterogenous Graphs with Node and Edge Attributes [14.078355036170155]
We introduce ProvCreator, a synthetic graph framework for complex heterogeneous graphs. ProvCreator formulates graph synthesis as a sequence generation task. It features a versatile graph-to-sequence encoder-decoder that supports end-to-end, learnable graph generation.
arXiv Detail & Related papers (2025-07-28T16:22:50Z)
- Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements [54.006506479865344]
We propose a unified evaluation framework for graph-level Graph Neural Networks (GNNs). This framework provides a standardized setting to evaluate GNNs across diverse datasets. We also propose a novel GNN model with enhanced expressivity and generalization capabilities.
arXiv Detail & Related papers (2025-01-01T08:48:53Z)
- LLM-Based Multi-Agent Systems are Scalable Graph Generative Models [73.28294528654885]
GraphAgent-Generator (GAG) is a novel simulation-based framework for dynamic, text-attributed social graph generation. GAG simulates the temporal node and edge generation processes for zero-shot social graph generation. The resulting graphs exhibit adherence to seven key macroscopic network properties, achieving an 11% improvement in microscopic graph structure metrics.
arXiv Detail & Related papers (2024-10-13T12:57:08Z)
- Data Augmentation in Graph Neural Networks: The Role of Generated Synthetic Graphs [0.24999074238880487]
This study explores using generated graphs for data augmentation.
It compares the performance of combining generated graphs with real graphs and examines the effect of different quantities of generated graphs on graph classification tasks.
Our results introduce a new approach to graph data augmentation, ensuring consistent labels and enhancing classification performance.
arXiv Detail & Related papers (2024-07-20T06:05:26Z)
- SynHING: Synthetic Heterogeneous Information Network Generation for Graph Learning and Explanation [31.89877722246351]
We introduce SynHING, a novel framework for Synthetic Heterogeneous Information Network Generation.
SynHING systematically identifies major motifs in a target HIN and employs a bottom-up generation process with intra-cluster and inter-cluster merge modules.
It provides ground-truth motifs for evaluating GNN explainer models, setting a new standard for explainable, synthetic HIN generation.
arXiv Detail & Related papers (2024-01-07T04:43:36Z)
- GraphMaker: Can Diffusion Models Generate Large Attributed Graphs? [7.330479039715941]
Large-scale graphs with node attributes are increasingly common in various real-world applications.
Traditional graph generation methods are limited in their capacity to handle these complex structures.
This paper introduces a novel diffusion model, GraphMaker, specifically designed for generating large attributed graphs.
arXiv Detail & Related papers (2023-10-20T22:12:46Z)
- Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis [50.972595036856035]
We present code that successfully replicates results from six popular and recent graph recommendation models.
We compare these graph models with traditional collaborative filtering models that historically performed well in offline evaluations.
By investigating the information flow from users' neighborhoods, we aim to identify which models are influenced by intrinsic features in the dataset structure.
arXiv Detail & Related papers (2023-08-01T09:31:44Z)
- A Framework for Large Scale Synthetic Graph Dataset Generation [2.248608623448951]
This work proposes a scalable synthetic graph generation tool to scale the datasets to production-size graphs.
The tool learns a series of parametric models from proprietary datasets that can be released to researchers.
We demonstrate the generalizability of the framework across a series of datasets.
arXiv Detail & Related papers (2022-10-04T22:41:33Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- Graph Auto-Encoders for Network Completion [6.1074304332419675]
We propose a model to use the learned pattern of connections from the observed part of the network to complete the whole graph.
Our proposed model achieved competitive performance with less information needed.
arXiv Detail & Related papers (2022-04-25T05:24:45Z)
- Auto-decoding Graphs [91.3755431537592]
The generative model is an auto-decoder that learns to synthesize graphs from latent codes.
Graphs are synthesized using self-attention modules that are trained to identify likely connectivity patterns.
arXiv Detail & Related papers (2020-06-04T14:23:01Z)
- Adaptive Graph Auto-Encoder for General Data Clustering [90.8576971748142]
Graph-based clustering plays an important role in the clustering area.
Recent studies on graph convolutional neural networks have achieved impressive success on graph-type data.
We propose a graph auto-encoder for general data clustering, which constructs the graph adaptively according to the generative perspective of graphs.
arXiv Detail & Related papers (2020-02-20T10:11:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.