Quantifying the Privacy Implications of High-Fidelity Synthetic Network Traffic
- URL: http://arxiv.org/abs/2511.20497v1
- Date: Tue, 25 Nov 2025 17:04:02 GMT
- Title: Quantifying the Privacy Implications of High-Fidelity Synthetic Network Traffic
- Authors: Van Tran, Shinan Liu, Tian Li, Nick Feamster
- Abstract summary: We introduce a comprehensive set of privacy metrics for synthetic network traffic. We evaluate the vulnerability of different representative generative models and examine the factors that influence attack success. Our results reveal substantial variability in privacy risks across models and datasets.
- Score: 12.114570800461593
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To address the scarcity and privacy concerns of network traffic data, various generative models have been developed to produce synthetic traffic. However, synthetic traffic is not inherently privacy-preserving, and the extent to which it leaks sensitive information, and how to measure such leakage, remain largely unexplored. This challenge is further compounded by the diversity of model architectures, which shape how traffic is represented and synthesized. We introduce a comprehensive set of privacy metrics for synthetic network traffic, combining standard approaches like membership inference attacks (MIA) and data extraction attacks with network-specific identifiers and attributes. Using these metrics, we systematically evaluate the vulnerability of different representative generative models and examine the factors that influence attack success. Our results reveal substantial variability in privacy risks across models and datasets. MIA success ranges from 0% to 88%, and up to 100% of network identifiers can be recovered from generated traffic, highlighting serious privacy vulnerabilities. We further identify key factors that significantly affect attack outcomes, including training data diversity and how well the generative model fits the training data. These findings provide actionable guidance for designing and deploying generative models that minimize privacy leakage, establishing a foundation for safer synthetic network traffic generation.
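As a minimal illustration of one standard ingredient of such metrics, the sketch below scores membership by distance to the nearest synthetic record; the flow features, the deliberately overfit toy generator, and the scoring rule are illustrative assumptions, not the paper's metric suite.

```python
# Hedged sketch: a nearest-neighbour distance baseline for membership inference
# against synthetic traffic. Flow features and the scoring rule are illustrative
# assumptions, not the paper's actual metrics.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

def mia_scores(candidates, synthetic):
    """Return a membership score per candidate flow: higher = more likely a
    training member, based on proximity to the nearest synthetic flow."""
    scaler = StandardScaler().fit(synthetic)
    nn = NearestNeighbors(n_neighbors=1).fit(scaler.transform(synthetic))
    dists, _ = nn.kneighbors(scaler.transform(candidates))
    return -dists.ravel()  # smaller distance -> higher membership score

# Toy flows featurised as (duration, bytes, packets, mean inter-arrival time)
rng = np.random.default_rng(0)
train = rng.normal(size=(500, 4))
synthetic = train + rng.normal(scale=0.05, size=train.shape)  # overfit toy generator
non_members = rng.normal(size=(500, 4))
print("mean score (members)    :", mia_scores(train, synthetic).mean())
print("mean score (non-members):", mia_scores(non_members, synthetic).mean())
```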
Related papers
- Quality Degradation Attack in Synthetic Data [5.461072909384133]
This study investigates quality attacks initiated by adversaries who possess access to the real dataset or control over the generation process. We formalize a corresponding threat model and empirically evaluate the effectiveness of targeted manipulations of real data.
arXiv Detail & Related papers (2026-01-06T11:43:31Z) - Comparative Evaluation of VAE, GAN, and SMOTE for Tor Detection in Encrypted Network Traffic [0.0]
Encrypted network traffic poses significant challenges for intrusion detection. Traditional data augmentation methods struggle to preserve the complex temporal and statistical characteristics of real network traffic. This work explores the use of Generative AI (GAI) models to synthesize realistic and diverse encrypted traffic traces.
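As a minimal illustration of the kind of traditional augmentation such generative approaches are compared against, the sketch below rebalances a toy Tor/non-Tor flow dataset with SMOTE; the feature layout and class labels are assumptions.

```python
# Hedged sketch: SMOTE as a classical augmentation baseline for imbalanced
# traffic classes. Features and labels are illustrative assumptions.
import numpy as np
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)
X_majority = rng.normal(0.0, 1.0, size=(900, 10))  # non-Tor flows
X_minority = rng.normal(2.0, 1.0, size=(100, 10))  # Tor flows
X = np.vstack([X_majority, X_minority])
y = np.array([0] * 900 + [1] * 100)

X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("before:", np.bincount(y), "after:", np.bincount(y_res))
```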
arXiv Detail & Related papers (2026-01-03T13:31:53Z) - PHANTOM: Progressive High-fidelity Adversarial Network for Threat Object Modeling [0.0]
PHANTOM is a novel adversarial variational framework for generating high-fidelity synthetic attack data. Its innovations include progressive training, a dual-path VAE-GAN architecture, and domain-specific feature matching to preserve the semantics of attacks. Models trained on PHANTOM data achieve 98% weighted accuracy on real attacks.
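As a minimal illustration of the feature-matching idea in its generic GAN form (not PHANTOM's dual-path architecture), the sketch below matches the mean discriminator activations of real and generated records; the tensor shapes are assumptions.

```python
# Hedged sketch: a generic feature-matching loss added to a generator objective;
# the discriminator layer and shapes are illustrative assumptions.
import torch

def feature_matching_loss(disc_features_real, disc_features_fake):
    """Match mean discriminator features of real and generated attack records."""
    return torch.mean((disc_features_real.mean(dim=0) -
                       disc_features_fake.mean(dim=0)) ** 2)

real_feats = torch.randn(64, 128)  # discriminator activations for real records
fake_feats = torch.randn(64, 128)  # discriminator activations for generated records
print(feature_matching_loss(real_feats, fake_feats).item())
```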
arXiv Detail & Related papers (2025-12-12T18:14:19Z) - On the MIA Vulnerability Gap Between Private GANs and Diffusion Models [51.53790101362898]
Generative Adversarial Networks (GANs) and diffusion models have emerged as leading approaches for high-quality image synthesis. We present the first unified theoretical and empirical analysis of the privacy risks faced by differentially private generative models.
arXiv Detail & Related papers (2025-09-03T14:18:22Z) - Privacy Auditing Synthetic Data Release through Local Likelihood Attacks [7.780592134085148]
This work introduces Gen-LRA, a local likelihood ratio attack for auditing synthetic data releases. Gen-LRA formulates its attack by evaluating the influence a test observation has on a surrogate model's estimation of a local likelihood ratio over the synthetic data. Results underscore Gen-LRA's effectiveness as a privacy auditing tool for the release of synthetic data.
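As a rough illustration of scoring membership with a local likelihood ratio over synthetic data (a much-simplified stand-in, not the Gen-LRA procedure), the sketch below compares surrogate KDE densities fit on synthetic and reference data; the KDE surrogate, bandwidth, and toy data are assumptions.

```python
# Hedged sketch: a simplified density-ratio membership score in the spirit of a
# local likelihood ratio over synthetic data; not the Gen-LRA algorithm.
import numpy as np
from sklearn.neighbors import KernelDensity

def llr_score(x, synthetic, reference, bandwidth=0.5):
    """log p_synthetic(x) - log p_reference(x): higher suggests x shaped generation."""
    kde_syn = KernelDensity(bandwidth=bandwidth).fit(synthetic)
    kde_ref = KernelDensity(bandwidth=bandwidth).fit(reference)
    x = np.atleast_2d(x)
    return kde_syn.score_samples(x) - kde_ref.score_samples(x)

rng = np.random.default_rng(1)
reference = rng.normal(size=(1000, 3))            # population data held by auditor
member = rng.normal(size=(1, 3))
synthetic = np.vstack([rng.normal(size=(999, 3)), member + 0.01])  # leaks the member
print("member score:", llr_score(member, synthetic, reference))
print("random score:", llr_score(rng.normal(size=(1, 3)), synthetic, reference))
```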
arXiv Detail & Related papers (2025-08-28T18:27:40Z) - RL-Finetuned LLMs for Privacy-Preserving Synthetic Rewriting [17.294176570269]
We propose a reinforcement learning framework that fine-tunes a large language model (LLM) using a composite reward function. The privacy reward combines semantic cues with structural patterns derived from a minimum spanning tree (MST) over latent representations. Empirical results show that the proposed method significantly enhances author obfuscation and privacy metrics without degrading semantic quality.
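As a minimal illustration of the MST-over-latent-representations ingredient (the embeddings are synthetic placeholders and how the tree statistics feed the actual reward is not shown), the sketch below builds an MST over pairwise distances between latent vectors.

```python
# Hedged sketch: minimum spanning tree over latent embeddings as a structural
# signal; the embedding source and the reward mapping are illustrative assumptions.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def mst_edge_lengths(embeddings):
    """Return the edge weights of the MST over pairwise Euclidean distances."""
    dist = squareform(pdist(embeddings))   # dense (n x n) distance matrix
    mst = minimum_spanning_tree(dist)      # sparse MST of the distance graph
    return mst.data                        # nonzero edge weights

rng = np.random.default_rng(0)
latents = rng.normal(size=(32, 16))        # placeholder sentence embeddings
edges = mst_edge_lengths(latents)
print("MST edges:", len(edges), "mean length:", edges.mean())
```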
arXiv Detail & Related papers (2025-08-25T04:38:19Z) - Deep Learning Models for Robust Facial Liveness Detection [56.08694048252482]
This study introduces a robust solution through novel deep learning models addressing the deficiencies in contemporary anti-spoofing techniques. By innovatively integrating texture analysis and reflective properties associated with genuine human traits, our models distinguish authentic presence from replicas with remarkable precision.
arXiv Detail & Related papers (2025-08-12T17:19:20Z) - Integrating Explainable AI for Effective Malware Detection in Encrypted Network Traffic [0.0]
Encrypted network communication ensures confidentiality, integrity, and privacy between endpoints. In this study, we investigate the integration of explainable artificial intelligence (XAI) techniques to detect malicious network traffic. We employ ensemble learning models to identify malicious activity using multi-view features extracted from various aspects of encrypted communication.
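As a minimal illustration of attaching explanations to an ensemble detector (the flow features and model choice are assumptions, not the study's multi-view pipeline), the sketch below trains a random forest on toy features and computes SHAP attributions.

```python
# Hedged sketch: SHAP attributions for an ensemble traffic classifier; features,
# labels, and the model are illustrative assumptions.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))                  # e.g. TLS record sizes, timings
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)  # 1 = malicious, 0 = benign

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])     # per-feature attributions
print(np.asarray(shap_values).shape)
```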
arXiv Detail & Related papers (2025-01-09T17:21:00Z) - KiNETGAN: Enabling Distributed Network Intrusion Detection through Knowledge-Infused Synthetic Data Generation [0.0]
We propose a knowledge-infused Generative Adversarial Network for generating synthetic network activity data (KiNETGAN).
Our approach enhances the resilience of distributed intrusion detection while addressing privacy concerns.
arXiv Detail & Related papers (2024-05-26T08:02:02Z) - PhishGuard: A Convolutional Neural Network Based Model for Detecting Phishing URLs with Explainability Analysis [1.102674168371806]
Identifying phishing URLs is an effective way to address the problem.
Various machine learning and deep learning methods have been proposed to automate the detection of phishing URLs.
We propose a 1D Convolutional Neural Network (CNN) trained on extensive features and a substantial amount of data.
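As a minimal illustration of a character-level 1D CNN for URL classification (the vocabulary size, sequence length, and layer sizes are assumptions, not PhishGuard's published architecture):

```python
# Hedged sketch: a small character-level 1D CNN for phishing-vs-benign URL
# classification; all hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class UrlCNN(nn.Module):
    def __init__(self, vocab_size=128, embed_dim=32, max_len=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, 64, kernel_size=5, padding=2)
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.fc = nn.Linear(64, 2)             # phishing vs benign

    def forward(self, x):                      # x: (batch, max_len) character ids
        h = self.embed(x).transpose(1, 2)      # (batch, embed_dim, max_len)
        h = torch.relu(self.conv(h))
        return self.fc(self.pool(h).squeeze(-1))

urls = torch.randint(0, 128, (4, 200))         # ASCII-encoded, padded URLs
print(UrlCNN()(urls).shape)                    # torch.Size([4, 2])
```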
arXiv Detail & Related papers (2024-04-27T17:13:49Z) - Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z) - Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
GNNs are vulnerable to model stealing attacks, which aim to duplicate the target model through query access.
We introduce three model stealing attacks to adapt to different actual scenarios.
arXiv Detail & Related papers (2023-12-18T05:42:31Z) - Robustness Threats of Differential Privacy [70.818129585404]
We experimentally demonstrate that networks trained with differential privacy can, in some settings, be even more vulnerable than their non-private counterparts.
We study how the main ingredients of differentially private neural network training, such as gradient clipping and noise addition, affect the robustness of the model.
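As a minimal illustration of those two ingredients, the sketch below applies per-example clipping and Gaussian noise to a batch of gradients; the clip norm and noise multiplier are illustrative values, not a recommended configuration.

```python
# Hedged sketch: the clipping and noise-addition step of DP-SGD-style training;
# hyperparameter values are illustrative assumptions.
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, lr=0.1, rng=None):
    """Clip each example's gradient, average, add Gaussian noise, return the update."""
    rng = rng or np.random.default_rng()
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / (norms + 1e-12))
    mean_grad = clipped.mean(axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(per_example_grads),
                       size=mean_grad.shape)
    return -lr * (mean_grad + noise)

grads = np.random.default_rng(0).normal(size=(64, 10))  # one gradient per example
print(dp_sgd_step(grads))
```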
arXiv Detail & Related papers (2020-12-14T18:59:24Z) - Privacy-preserving Traffic Flow Prediction: A Federated Learning Approach [61.64006416975458]
We propose a privacy-preserving machine learning technique named Federated Learning-based Gated Recurrent Unit neural network algorithm (FedGRU) for traffic flow prediction.
FedGRU differs from current centralized learning methods and updates universal learning models through a secure parameter aggregation mechanism.
It is shown that FedGRU's prediction accuracy is 90.96% higher than the advanced deep learning models.
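As a minimal illustration of the aggregation idea behind FedGRU (plain federated averaging; the secure parameter aggregation protocol itself is not shown, and the weight names and shapes are assumptions):

```python
# Hedged sketch: weighted federated averaging of client model parameters;
# parameter names and shapes are illustrative placeholders.
import numpy as np

def federated_average(client_weights, client_sizes):
    """Average each client's parameters, weighted by its local dataset size."""
    total = float(sum(client_sizes))
    keys = client_weights[0].keys()
    return {k: sum(w[k] * (n / total) for w, n in zip(client_weights, client_sizes))
            for k in keys}

rng = np.random.default_rng(0)
clients = [{"gru_kernel": rng.normal(size=(8, 8)), "dense": rng.normal(size=(8, 1))}
           for _ in range(3)]
global_weights = federated_average(clients, client_sizes=[120, 300, 80])
print({k: v.shape for k, v in global_weights.items()})
```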
arXiv Detail & Related papers (2020-03-19T13:07:49Z)