Convolutions are Competitive with Transformers for Encrypted Traffic Classification with Pre-training
- URL: http://arxiv.org/abs/2508.02001v1
- Date: Mon, 04 Aug 2025 02:42:44 GMT
- Title: Convolutions are Competitive with Transformers for Encrypted Traffic Classification with Pre-training
- Authors: Chungang Lin, Weiyao Zhang, Tianyu Zuo, Chao Zha, Yilong Jiang, Ruiqi Meng, Haitong Luo, Xuying Meng, Yujun Zhang
- Abstract summary: We investigate whether convolutions, with linear complexity and implicit positional encoding, are competitive with Transformers in encrypted traffic classification with pre-training. We propose NetConv, a novel pre-trained convolution model for encrypted traffic classification. We show that NetConv improves average classification performance by 6.88% and model throughput by 7.41X over existing pre-trained models.
- Score: 12.150588010496165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Encrypted traffic classification is vital for modern network management and security. To reduce reliance on handcrafted features and labeled data, recent methods focus on learning generic representations through pre-training on large-scale unlabeled data. However, current pre-trained models face two limitations originating from the adopted Transformer architecture: (1) Limited model efficiency due to the self-attention mechanism with quadratic complexity; (2) Unstable traffic scalability to longer byte sequences, as the explicit positional encodings fail to generalize to input lengths not seen during pre-training. In this paper, we investigate whether convolutions, with linear complexity and implicit positional encoding, are competitive with Transformers in encrypted traffic classification with pre-training. We first conduct a systematic comparison, and observe that convolutions achieve higher efficiency and scalability, with lower classification performance. To address this trade-off, we propose NetConv, a novel pre-trained convolution model for encrypted traffic classification. NetConv employs stacked traffic convolution layers, which enhance the ability to capture localized byte-sequence patterns through window-wise byte scoring and sequence-wise byte gating. We design a continuous byte masking pre-training task to help NetConv learn protocol-specific patterns. Experimental results on four tasks demonstrate that NetConv improves average classification performance by 6.88% and model throughput by 7.41X over existing pre-trained models.
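The abstract names two concrete mechanisms: traffic convolution layers built on window-wise byte scoring and sequence-wise byte gating, and a continuous byte masking pre-training task. Below is a minimal PyTorch sketch of one plausible reading of both; the layer shapes, gating placement, and masking hyperparameters are illustrative assumptions, not the paper's exact design.

```python
# Sketch of a "traffic convolution" block in the spirit of NetConv: a
# depthwise 1-D convolution scores each byte from its local window, and a
# learned per-position sigmoid gate modulates the result. Names, sizes,
# and the gating placement are assumptions, not the published architecture.
import torch
import torch.nn as nn

class TrafficConvBlock(nn.Module):
    def __init__(self, dim: int = 256, window: int = 7):
        super().__init__()
        # Window-wise byte scoring: each position scored from a local window.
        self.score = nn.Conv1d(dim, dim, kernel_size=window,
                               padding=window // 2, groups=dim)
        # Sequence-wise byte gating: a per-position gate in [0, 1].
        self.gate = nn.Sequential(nn.Conv1d(dim, dim, 1), nn.Sigmoid())
        self.proj = nn.Conv1d(dim, dim, 1)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim) embeddings of raw traffic bytes
        h = x.transpose(1, 2)                        # (batch, dim, seq_len)
        h = self.proj(self.score(h) * self.gate(h))
        return self.norm(x + h.transpose(1, 2))      # residual connection

def continuous_byte_mask(tokens, span: int = 8, mask_id: int = 256,
                         ratio: float = 0.15):
    """Mask contiguous spans of byte tokens (assumed form of the paper's
    continuous byte masking task); tokens: (batch, seq_len) byte ids."""
    masked = tokens.clone()
    n_spans = max(1, int(tokens.size(1) * ratio) // span)
    for row in masked:
        for s in torch.randint(0, tokens.size(1) - span, (n_spans,)).tolist():
            row[s:s + span] = mask_id
    return masked
```

The depthwise convolution is what gives the linear complexity and implicit positional encoding that the abstract attributes to convolutions.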
Related papers
- Adversarial Pre-Padding: Generating Evasive Network Traffic Against Transformer-Based Classifiers [7.283896930735145]
We propose a novel and effective adversarial traffic-generating approach. Our approach has two key innovations: (i) a pre-padding strategy is proposed to modify packets, which effectively overcomes the limitations of existing research against transformer-based models for network traffic classification; and (ii) a reinforcement learning model is employed to optimize network traffic perturbations.
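To make the pre-padding idea concrete: bytes are inserted before a packet's payload so that the byte sequence a transformer-based classifier sees is shifted, while the payload itself survives intact. The sketch below stubs the reinforcement-learning policy with a random choice; all names are hypothetical, not the paper's API.

```python
import random

def pre_pad_payload(payload: bytes, max_pad: int = 64) -> bytes:
    # Stand-in for the learned policy: choose padding length and content
    # at random. In the paper, this choice is optimized with RL.
    pad_len = random.randint(1, max_pad)
    padding = bytes(random.randrange(256) for _ in range(pad_len))
    return padding + payload   # padding precedes the original payload

original = b"\x16\x03\x01"     # e.g. the start of a TLS handshake record
evasive = pre_pad_payload(original)
assert evasive.endswith(original)  # payload itself is unmodified
```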
arXiv Detail & Related papers (2025-10-29T09:45:27Z)
- Language of Network: A Generative Pre-trained Model for Encrypted Traffic Comprehension [16.795038178588324]
Deep learning is currently the predominant approach for encrypted traffic classification through feature analysis. We present GBC, a generative model based on pre-training for encrypted traffic comprehension. It achieves superior results in both traffic classification and generation tasks, resulting in a 5% improvement in F1 score compared to state-of-the-art methods for classification tasks.
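A hedged sketch of what generative pre-training over encrypted traffic can look like: treat each byte as a token and train a causal model on next-byte prediction, the standard language-modeling objective. Whether GBC uses exactly this tokenization and objective is an assumption drawn from the summary above.

```python
import torch
import torch.nn as nn

class ByteLM(nn.Module):
    def __init__(self, vocab: int = 257, dim: int = 128):  # 256 bytes + BOS
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.body = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, vocab)

    def forward(self, tokens):
        # Causal mask: each byte attends only to earlier bytes.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        return self.head(self.body(self.emb(tokens), mask=mask))

tokens = torch.randint(0, 257, (4, 64))   # a batch of byte sequences
logits = ByteLM()(tokens[:, :-1])         # predict the next byte
loss = nn.functional.cross_entropy(
    logits.reshape(-1, 257), tokens[:, 1:].reshape(-1))
```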
arXiv Detail & Related papers (2025-05-26T04:04:29Z)
- A Pipeline of Augmentation and Sequence Embedding for Classification of Imbalanced Network Traffic [0.0]
We propose a pipeline to balance the dataset and classify it using a robust and accurate embedding technique. We demonstrate that the proposed augmentation pipeline, combined with FS-Embedding, increases convergence speed and leads to a significant reduction in the number of model parameters.
arXiv Detail & Related papers (2025-02-26T07:55:24Z)
- NetFlowGen: Leveraging Generative Pre-training for Network Traffic Dynamics [72.95483148058378]
We propose to pre-train a general-purpose machine learning model to capture traffic dynamics using only traffic data from NetFlow records. We address challenges such as unifying network feature representations, learning from large volumes of unlabeled traffic data, and testing on real downstream tasks in DDoS attack detection.
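One plausible reading of "unifying network feature representations" is mapping heterogeneous NetFlow fields (heavy-tailed counters, categorical protocol codes, ports) into a fixed-size numeric vector per record before pre-training. The field names and transforms below are illustrative assumptions, not NetFlowGen's specification.

```python
import math

def unify_netflow(record: dict) -> list:
    numeric = [
        math.log1p(record["bytes"]),       # heavy-tailed counters -> log scale
        math.log1p(record["packets"]),
        record["duration_ms"] / 1000.0,
    ]
    proto = [float(record["proto"] == p) for p in ("TCP", "UDP", "ICMP")]
    ports = [record["src_port"] / 65535.0, record["dst_port"] / 65535.0]
    return numeric + proto + ports         # one fixed-size vector per flow

vec = unify_netflow({"bytes": 123456, "packets": 87, "duration_ms": 420,
                     "proto": "TCP", "src_port": 443, "dst_port": 51234})
```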
arXiv Detail & Related papers (2024-12-30T00:47:49Z)
- Dual-Path Adversarial Lifting for Domain Shift Correction in Online Test-time Adaptation [59.18151483767509]
We introduce dual-path token lifting for domain shift correction in test-time adaptation.
We then perform dual-path lifting with interleaved token prediction and update between the path of domain shift tokens and the path of class tokens.
Experimental results on the benchmark datasets demonstrate that our proposed method significantly improves the online fully test-time domain adaptation performance.
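A rough schematic of the dual-path idea as summarized: a dedicated domain-shift token and the class token are updated in interleaved fashion, with the patch tokens corrected by the estimated shift before the class token attends to them. This is a loose reconstruction from the summary above, not the paper's exact method.

```python
import torch
import torch.nn as nn

class DualPathBlock(nn.Module):
    def __init__(self, dim: int = 192):
        super().__init__()
        self.shift_attn = nn.MultiheadAttention(dim, 4, batch_first=True)
        self.cls_attn = nn.MultiheadAttention(dim, 4, batch_first=True)

    def forward(self, cls_tok, shift_tok, patches):
        # Path 1: the shift token attends to patches to absorb domain shift.
        shift_tok, _ = self.shift_attn(shift_tok, patches, patches)
        corrected = patches - shift_tok              # remove estimated shift
        # Path 2: the class token attends to shift-corrected patch tokens.
        cls_tok, _ = self.cls_attn(cls_tok, corrected, corrected)
        return cls_tok, shift_tok, corrected

patches = torch.randn(2, 196, 192)                   # ViT-style patch tokens
cls, shift = torch.zeros(2, 1, 192), torch.zeros(2, 1, 192)
cls, shift, patches = DualPathBlock()(cls, shift, patches)
```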
arXiv Detail & Related papers (2024-08-26T02:33:47Z)
- One Train for Two Tasks: An Encrypted Traffic Classification Framework Using Supervised Contrastive Learning [18.63871240173137]
We propose an effective model named Contrastive Learning Enhanced Temporal Fusion (CLE-TFE).
In particular, we utilize supervised contrastive learning to enhance the packet-level and flow-level representations.
We also propose cross-level multi-task learning, which simultaneously accomplishes the packet-level and flow-level classification tasks in the same model with a single training run.
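A hedged sketch of the two ingredients named above: a supervised contrastive loss that pulls same-class representations together, and a joint objective that trains packet-level and flow-level heads in a single run. The loss weighting and head shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def sup_con_loss(z, y, tau: float = 0.07):
    """Supervised contrastive loss over a batch of embeddings z with labels y."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau                               # pairwise similarities
    pos = (y[:, None] == y[None, :]).float()
    pos.fill_diagonal_(0)                               # exclude self-pairs
    sim = sim - sim.max(dim=1, keepdim=True).values.detach()  # stability
    denom = (sim.exp() * (1 - torch.eye(len(y)))).sum(1, keepdim=True)
    log_prob = sim - denom.log()
    return -(pos * log_prob).sum(1).div(pos.sum(1).clamp(min=1)).mean()

# Cross-level multi-task objective: one model, two heads, one training run.
packet_emb, flow_emb = torch.randn(32, 64), torch.randn(32, 64)
packet_y, flow_y = torch.randint(0, 10, (32,)), torch.randint(0, 5, (32,))
packet_logits, flow_logits = torch.randn(32, 10), torch.randn(32, 5)
loss = (F.cross_entropy(packet_logits, packet_y)
        + F.cross_entropy(flow_logits, flow_y)
        + sup_con_loss(packet_emb, packet_y)
        + sup_con_loss(flow_emb, flow_y))
```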
arXiv Detail & Related papers (2024-02-12T09:10:09Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
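A sketch of one common form of "two-wing" normalization in the spirit of TWINS: one wing normalizes with frozen statistics carried over from the adversarially pre-trained model, the other adapts during fine-tuning, and the two outputs are mixed. The exact coupling in the paper may differ; treat this as an assumption-laden illustration.

```python
import torch
import torch.nn as nn

class TwoWingNorm(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.frozen = nn.BatchNorm2d(channels)    # load pre-trained stats here
        self.frozen.eval()                        # keep running stats fixed
        for p in self.frozen.parameters():        # and its affine params frozen
            p.requires_grad_(False)
        self.adaptive = nn.BatchNorm2d(channels)  # updated during fine-tuning
        self.mix = nn.Parameter(torch.tensor(0.5))

    def forward(self, x):
        # A full implementation would override train() so the frozen wing
        # stays in eval mode; this sketch never switches modes.
        return self.mix * self.adaptive(x) + (1 - self.mix) * self.frozen(x)

out = TwoWingNorm(64)(torch.randn(8, 64, 16, 16))
```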
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Transformers for End-to-End InfoSec Tasks: A Feasibility Study [6.847381178288385]
We implement transformer models for two distinct InfoSec data formats, specifically URLs and PE files.
We show that our URL transformer model requires a different training approach to reach high performance levels.
We demonstrate that this approach performs comparably to well-established malware detection models on benchmark PE file datasets.
arXiv Detail & Related papers (2022-12-05T23:50:46Z)
- Position Prediction as an Effective Pretraining Strategy [20.925906203643883]
We propose a novel but surprisingly simple alternative to content reconstruction: predicting locations from content, without providing positional information for it.
Our approach brings improvements over strong supervised training baselines and is comparable to modern unsupervised/self-supervised pretraining methods.
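A minimal sketch of position prediction as a pretraining task: shuffle the patch tokens of an input, encode them without positional embeddings, and train a head to classify each token's original position. Model sizes and names are illustrative.

```python
import torch
import torch.nn as nn

num_patches, dim = 49, 128
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2)
pos_head = nn.Linear(dim, num_patches)       # predict each token's position

patches = torch.randn(8, num_patches, dim)   # patch embeddings, no pos-enc
perm = torch.stack([torch.randperm(num_patches) for _ in range(8)])
shuffled = torch.gather(patches, 1, perm[..., None].expand(-1, -1, dim))

logits = pos_head(encoder(shuffled))          # (8, num_patches, num_patches)
loss = nn.functional.cross_entropy(           # target: the original index
    logits.reshape(-1, num_patches), perm.reshape(-1))
```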
arXiv Detail & Related papers (2022-07-15T17:10:48Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
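The core step being scaled is standard adversarial training: an inner PGD maximization that crafts a perturbation, followed by an outer gradient step on the adversarial batch. Below is a minimal single-worker sketch; in the distributed large-batch setting, each worker would run this on its data shard with gradients averaged across machines (e.g. via DistributedDataParallel), which is omitted here.

```python
import torch
import torch.nn as nn

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = nn.functional.cross_entropy(model(x + delta), y)
        loss.backward()
        # Ascend the loss, keep the perturbation within the eps ball.
        delta.data = (delta + alpha * delta.grad.sign()).clamp(-eps, eps)
        delta.grad.zero_()
    return (x + delta).detach()   # (a full version would also clamp to [0, 1])

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.rand(16, 3, 32, 32), torch.randint(0, 10, (16,))

x_adv = pgd_attack(model, x, y)                          # inner maximization
opt.zero_grad()
nn.functional.cross_entropy(model(x_adv), y).backward()  # outer minimization
opt.step()
```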
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets.
In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset.
We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z)
- Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks across four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)