Comparative Evaluation of VAE, GAN, and SMOTE for Tor Detection in Encrypted Network Traffic
- URL: http://arxiv.org/abs/2601.01183v1
- Date: Sat, 03 Jan 2026 13:31:53 GMT
- Title: Comparative Evaluation of VAE, GAN, and SMOTE for Tor Detection in Encrypted Network Traffic
- Authors: Saravanan A, Aswani Kumar Cherukuri
- Abstract summary: Encrypted network traffic poses significant challenges for intrusion detection. Traditional data augmentation methods struggle to preserve the complex temporal and statistical characteristics of real network traffic. This work explores the use of Generative AI (GAI) models to synthesize realistic and diverse encrypted traffic traces.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Encrypted network traffic poses significant challenges for intrusion detection due to the lack of payload visibility, limited labeled datasets, and high class imbalance between benign and malicious activities. Traditional data augmentation methods struggle to preserve the complex temporal and statistical characteristics of real network traffic. To address these issues, this work explores the use of Generative AI (GAI) models to synthesize realistic and diverse encrypted traffic traces. We evaluate three approaches: Variational Autoencoders (VAE), Generative Adversarial Networks (GAN), and SMOTE (Synthetic Minority Over-sampling Technique), each integrated with a preprocessing pipeline that includes feature selection and class balancing. The UNSW NB-15 dataset is used as the primary benchmark, focusing on Tor traffic as anomalies. We analyze statistical similarity between real and synthetic data, and assess classifier performance using metrics such as Accuracy, F1-score, and AUC-ROC. Results show that VAE-generated data provides the best balance between privacy and performance, while GANs offer higher fidelity but risk overfitting. SMOTE, though simple, enhances recall but may lack diversity. The findings demonstrate that GAI methods can significantly improve encrypted traffic detection when trained with privacy-preserving synthetic data.
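The abstract's remark that SMOTE "may lack diversity" is easier to see from how the technique builds samples: each synthetic point lies on the line segment between a minority-class record and one of its nearest minority-class neighbours, so nothing is generated outside the region the existing minority data already spans. The following is a minimal pure-Python sketch of that interpolation step only, not the paper's pipeline; the function name `smote_sketch`, its parameters, and the toy 2-feature records are hypothetical and chosen for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbours.

    Illustrative only: names and defaults are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base point, excluding the point itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate, in [0, 1)
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Toy minority class: four hypothetical 2-feature flow records.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority, n_new=5)
```

Because every output is a convex combination of two existing records, all five points here stay inside the unit square spanned by the toy data. A VAE or GAN instead samples from a learned distribution, which is one way to read the abstract's contrast between the three approaches.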
Related papers
- Data-Driven Deep MIMO Detection: Network Architectures and Generalization Analysis [50.20709408241935]
This paper proposes inspecting the fully data-driven DeepSIC detection within a Network-of-MLPs architecture. Within such an architecture, DeepSIC can be upgraded as a graph-based message-passing process using Graph Neural Networks (GNNs). GNNSIC achieves excellent expressivity comparable to DeepSIC with substantially fewer trainable parameters.
arXiv Detail & Related papers (2026-02-13T04:38:51Z)
- ReGAIN: Retrieval-Grounded AI Framework for Network Traffic Analysis [5.887997322139195]
ReGAIN is a framework that combines traffic summarization, retrieval-augmented generation (RAG), and Large Language Model (LLM) reasoning for transparent and accurate network traffic analysis. It is evaluated on ICMP ping flood and TCP SYN flood traces from a real-world traffic dataset.
arXiv Detail & Related papers (2025-12-23T00:16:14Z)
- Quantifying the Privacy Implications of High-Fidelity Synthetic Network Traffic [12.114570800461593]
We introduce a comprehensive set of privacy metrics for synthetic network traffic. We evaluate the vulnerability of different representative generative models and examine the factors that influence attack success. Our results reveal substantial variability in privacy risks across models and datasets.
arXiv Detail & Related papers (2025-11-25T17:04:02Z)
- FlowXpert: Context-Aware Flow Embedding for Enhanced Traffic Detection in IoT Network [7.30584204219718]
In the Internet of Things (IoT) environment, continuous interaction among a large number of devices generates complex and dynamic network traffic. Machine learning (ML)-based traffic detection technology serves as a critical component in ensuring network security.
arXiv Detail & Related papers (2025-09-25T07:52:58Z)
- Self-Supervised Transformer-based Contrastive Learning for Intrusion Detection Systems [1.1265248232450553]
This paper proposes a self-supervised contrastive learning approach for generalizable intrusion detection on raw packet sequences. Our framework exhibits better performance in comparison to existing NetFlow self-supervised methods. Our model provides a strong baseline for supervised intrusion detection with limited labeled data.
arXiv Detail & Related papers (2025-05-12T13:42:00Z)
- SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds.
With the development of Transformer, the scale of SIRST models is constantly increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
- Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
GNNs are vulnerable to the model stealing attack, a nefarious endeavor geared towards duplicating the target model via query permissions.
We introduce three model stealing attacks to adapt to different actual scenarios.
arXiv Detail & Related papers (2023-12-18T05:42:31Z)
- Robust Semi-supervised Federated Learning for Images Automatic Recognition in Internet of Drones [57.468730437381076]
We present a Semi-supervised Federated Learning (SSFL) framework for privacy-preserving UAV image recognition.
There are significant differences in the number, features, and distribution of local data collected by UAVs using different camera modules.
We propose an aggregation rule based on the frequency of the client's participation in training, namely the FedFreq aggregation rule.
arXiv Detail & Related papers (2022-01-03T16:49:33Z)
- Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
- Synthetic flow-based cryptomining attack generation through Generative Adversarial Networks [1.2575897140677708]
Flow-based data sets are crucial to increase the performance of Machine Learning components.
Data privacy is appearing more and more as a strong requirement when processing such network data.
We propose a novel deterministic way to measure the quality of the synthetic data produced by a GAN.
arXiv Detail & Related papers (2021-07-30T17:27:55Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)