Self-Supervised Transformer-based Contrastive Learning for Intrusion Detection Systems
- URL: http://arxiv.org/abs/2505.08816v1
- Date: Mon, 12 May 2025 13:42:00 GMT
- Title: Self-Supervised Transformer-based Contrastive Learning for Intrusion Detection Systems
- Authors: Ippokratis Koukoulis, Ilias Syrigos, Thanasis Korakis,
- Abstract summary: This paper proposes a self-supervised contrastive learning approach for generalizable intrusion detection on raw packet sequences.<n>Our framework exhibits better performance in comparison to existing NetFlow self-supervised methods.<n>Our model provides a strong baseline for supervised intrusion detection with limited labeled data.
- Score: 1.1265248232450553
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As the digital landscape becomes more interconnected, the frequency and severity of zero-day attacks, have significantly increased, leading to an urgent need for innovative Intrusion Detection Systems (IDS). Machine Learning-based IDS that learn from the network traffic characteristics and can discern attack patterns from benign traffic offer an advanced solution to traditional signature-based IDS. However, they heavily rely on labeled datasets, and their ability to generalize when encountering unseen traffic patterns remains a challenge. This paper proposes a novel self-supervised contrastive learning approach based on transformer encoders, specifically tailored for generalizable intrusion detection on raw packet sequences. Our proposed learning scheme employs a packet-level data augmentation strategy combined with a transformer-based architecture to extract and generate meaningful representations of traffic flows. Unlike traditional methods reliant on handcrafted statistical features (NetFlow), our approach automatically learns comprehensive packet sequence representations, significantly enhancing performance in anomaly identification tasks and supervised learning for intrusion detection. Our transformer-based framework exhibits better performance in comparison to existing NetFlow self-supervised methods. Specifically, we achieve up to a 3% higher AUC in anomaly detection for intra-dataset evaluation and up to 20% higher AUC scores in inter-dataset evaluation. Moreover, our model provides a strong baseline for supervised intrusion detection with limited labeled data, exhibiting an improvement over self-supervised NetFlow models of up to 1.5% AUC when pretrained and evaluated on the same dataset. Additionally, we show the adaptability of our pretrained model when fine-tuned across different datasets, demonstrating strong performance even when lacking benign data from the target domain.
Related papers
- Dynamic Temporal Positional Encodings for Early Intrusion Detection in IoT [3.6686692131754834]
The rapid expansion of the Internet of Things (IoT) has introduced significant security challenges.<n>Traditional Intrusion Detection Systems (IDS) often overlook the temporal characteristics of network traffic.<n>We propose a Transformer-based Early Intrusion Detection System (EIDS) that incorporates dynamic temporal positional encodings.
arXiv Detail & Related papers (2025-06-22T17:56:19Z) - Graph Attention Neural Network for Botnet Detection: Evaluating Autoencoder, VAE and PCA-Based Dimension Reduction [0.0]
Graph Neural Networks (GNNs) address this limitation by learning an embedding space via iterative message passing.<n>This paper proposes a framework that first reduces dimensionality of the NetFlow-based IoT attack dataset before transforming it into a graph dataset.<n>We evaluate three dimension reduction techniques--Variational Autoencoder (VAE-encoder), classical autoencoder (AE-encoder), and Principal Component Analysis (PCA)
arXiv Detail & Related papers (2025-05-23T00:22:14Z) - Research on Cloud Platform Network Traffic Monitoring and Anomaly Detection System based on Large Language Models [5.524069089627854]
This paper introduces a large language model (LLM)-based network traffic monitoring and anomaly detection system.<n>A pre-trained large language model analyzes and predicts the probable network traffic, and an anomaly detection layer considers temporality and context.<n>Results show that the designed model outperforms traditional methods in detection accuracy and computational efficiency.
arXiv Detail & Related papers (2025-04-22T07:42:07Z) - NetFlowGen: Leveraging Generative Pre-training for Network Traffic Dynamics [72.95483148058378]
We propose to pre-train a general-purpose machine learning model to capture traffic dynamics with only traffic data from NetFlow records.<n>We address challenges such as unifying network feature representations, learning from large unlabeled traffic data volume, and testing on real downstream tasks in DDoS attack detection.
arXiv Detail & Related papers (2024-12-30T00:47:49Z) - FedMSE: Semi-supervised federated learning approach for IoT network intrusion detection [0.0]
The rise of IoT has expanded the cyber attack surface, making traditional centralized machine learning methods insufficient due to concerns about data availability, computational resources, transfer costs, and especially privacy preservation.<n>A semi-supervised federated learning model was developed to overcome these issues, combining the Shrink Autoencoder and Centroid one-class classifier (SAE-CEN)<n>This approach enhances the performance of intrusion detection by effectively representing normal network data and accurately identifying anomalies in the decentralized strategy.
arXiv Detail & Related papers (2024-10-18T02:23:57Z) - IoTGeM: Generalizable Models for Behaviour-Based IoT Attack Detection [3.3772986620114387]
We present an approach for modelling IoT network attacks that focuses on generalizability, yet also leads to better detection and performance.
First, we present an improved rolling window approach for feature extraction, and introduce a multi-step feature selection process that reduces overfitting.
Second, we build and test models using isolated train and test datasets, thereby avoiding common data leaks.
Third, we rigorously evaluate our methodology using a diverse portfolio of machine learning models, evaluation metrics and datasets.
arXiv Detail & Related papers (2023-10-17T21:46:43Z) - Cluster-level pseudo-labelling for source-free cross-domain facial
expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER)
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z) - Robust Semi-supervised Federated Learning for Images Automatic
Recognition in Internet of Drones [57.468730437381076]
We present a Semi-supervised Federated Learning (SSFL) framework for privacy-preserving UAV image recognition.
There are significant differences in the number, features, and distribution of local data collected by UAVs using different camera modules.
We propose an aggregation rule based on the frequency of the client's participation in training, namely the FedFreq aggregation rule.
arXiv Detail & Related papers (2022-01-03T16:49:33Z) - Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D
Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z) - Learning to Detect: A Data-driven Approach for Network Intrusion
Detection [17.288512506016612]
We perform a comprehensive study on NSL-KDD, a network traffic dataset, by visualizing patterns and employing different learning-based models to detect cyber attacks.
Unlike previous shallow learning and deep learning models that use the single learning model approach for intrusion detection, we adopt a hierarchy strategy.
We demonstrate the advantage of the unsupervised representation learning model in binary intrusion detection tasks.
arXiv Detail & Related papers (2021-08-18T21:19:26Z) - Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Anomaly Detection on Attributed Networks via Contrastive Self-Supervised
Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.