Related papers: Machine Learning for Encrypted Malicious Traffic Detection: Approaches, Datasets and Comparative Study

URL: http://arxiv.org/abs/2203.09332v1
Date: Thu, 17 Mar 2022 14:00:55 GMT
Title: Machine Learning for Encrypted Malicious Traffic Detection: Approaches, Datasets and Comparative Study
Authors: Zihao Wang, Kar-Wai Fok, Vrizlynn L. L. Thing
Abstract summary: In the post-COVID-19 environment, malicious traffic encryption is growing rapidly. We formulate a universal framework of machine learning based encrypted malicious traffic detection techniques. We implement and compare 10 encrypted malicious traffic detection algorithms.
Score: 6.267890584151111
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: As people's demand for personal privacy and data security becomes a priority, encrypted traffic has become mainstream in the cyber world. However, traffic encryption also shields malicious and illegal traffic introduced by adversaries from being detected. This is especially so in the post-COVID-19 environment, where malicious traffic encryption is growing rapidly. Common security solutions that rely on plain payload content analysis, such as deep packet inspection, are rendered useless. Thus, machine learning based approaches have become an important direction for encrypted malicious traffic detection. In this paper, we formulate a universal framework of machine learning based encrypted malicious traffic detection techniques and provide a systematic review. Furthermore, current research adopts different datasets to train models due to the lack of well-recognized datasets and feature sets. As a result, model performance cannot be compared and analyzed reliably. Therefore, in this paper, we analyse, process and combine datasets from 5 different sources to generate a comprehensive and fair dataset to aid future research in this field. On this basis, we also implement and compare 10 encrypted malicious traffic detection algorithms. We then discuss challenges and propose future directions of research.

Related papers

DATABench: Evaluating Dataset Auditing in Deep Learning from an Adversarial Perspective [59.66984417026933]
We introduce a novel taxonomy, classifying existing methods based on their reliance on internal features (IF) (inherent to the data) versus external features (EF) (artificially introduced for auditing). We formulate two primary attack types: evasion attacks, designed to conceal the use of a dataset, and forgery attacks, intending to falsely implicate an unused dataset. Building on the understanding of existing methods and attack objectives, we further propose systematic attack strategies: decoupling, removal, and detection for evasion; adversarial example-based methods for forgery. Our benchmark, DATABench, comprises 17 evasion attacks, 5 forgery attacks, and 9
arXiv Detail & Related papers (2025-07-08T03:07:15Z)
Language of Network: A Generative Pre-trained Model for Encrypted Traffic Comprehension [16.795038178588324]
Deep learning is currently the predominant approach for encrypted traffic classification through feature analysis. We present GBC, a generative model based on pre-training for encrypted traffic comprehension. It achieves superior results in both traffic classification and generation tasks, resulting in a 5% improvement in F1 score compared to state-of-the-art methods for classification tasks.
arXiv Detail & Related papers (2025-05-26T04:04:29Z)
Cryptanalysis via Machine Learning Based Information Theoretic Metrics [58.96805474751668]
We propose two novel applications of machine learning (ML) algorithms to perform cryptanalysis on any cryptosystem. These algorithms can be readily applied in an audit setting to evaluate the robustness of a cryptosystem. We show that our classification model correctly identifies the encryption schemes that are not IND-CPA secure, such as DES, RSA, and AES ECB, with high accuracy.
arXiv Detail & Related papers (2025-01-25T04:53:36Z)
Application of Machine Learning Techniques for Secure Traffic in NoC-based Manycores [44.99833362998488]
This document explores an IDS technique using machine learning and temporal series for detecting DoS attacks in NoC-based manycore systems. It is necessary to extract traffic data from a manycore NoC and execute the learning techniques in the extracted data. The developed platform will have its data validated with a low-level platform.
arXiv Detail & Related papers (2025-01-21T10:58:09Z)
Integrating Explainable AI for Effective Malware Detection in Encrypted Network Traffic [0.0]
Encrypted network communication ensures confidentiality, integrity, and privacy between endpoints. In this study, we investigate the integration of explainable artificial intelligence (XAI) techniques to detect malicious network traffic. We employ ensemble learning models to identify malicious activity using multi-view features extracted from various aspects of encrypted communication.
arXiv Detail & Related papers (2025-01-09T17:21:00Z)
Multi-Source Urban Traffic Flow Forecasting with Drone and Loop Detector Data [61.9426776237409]
Drone-captured data can create an accurate multi-sensor mobility observatory for large-scale urban networks. A simple yet effective graph-based model, HiMSNet, is proposed to integrate multiple data modalities and learn spatio-temporal correlations.
arXiv Detail & Related papers (2025-01-07T03:23:28Z)
Preliminary study on artificial intelligence methods for cybersecurity threat detection in computer networks based on raw data packets [34.82692226532414]
In this paper, we investigate deep learning methodologies capable of detecting attacks in real-time directly from raw packet data within network traffic. We propose a novel approach where packets are stacked into windows and separately recognised, with a 2D image representation suitable for processing with computer vision models.
arXiv Detail & Related papers (2024-07-24T15:04:00Z)
Lens: A Foundation Model for Network Traffic [19.3652490585798]
Lens is a foundation model for network traffic that leverages the T5 architecture to learn pre-trained representations from large-scale unlabeled data. We design a novel loss that combines three distinct tasks: Masked Span Prediction (MSP), Packet Order Prediction (POP), and Homologous Traffic Prediction (HTP).
arXiv Detail & Related papers (2024-02-06T02:45:13Z)
Feature Analysis of Encrypted Malicious Traffic [3.3148826359547514]
In recent years there has been a dramatic increase in the number of malware attacks that use encrypted HTTP traffic for self-propagation or communication. Antivirus software and firewalls typically will not have access to encryption keys, and therefore direct detection of encrypted data is unlikely to succeed. Previous work has shown that traffic analysis can provide indications of malicious intent, even in cases where the underlying data remains encrypted.
arXiv Detail & Related papers (2023-12-06T12:04:28Z)
CRYPTO-MINE: Cryptanalysis via Mutual Information Neural Estimation [42.481750913003204]
Mutual Information (MI) is a measure to evaluate the efficiency of cryptosystems. Recent advances in machine learning have enabled progress in estimating MI using neural networks. This work presents a novel application of MI estimation in the field of cryptography.
arXiv Detail & Related papers (2023-09-14T20:30:04Z)
Feature Mining for Encrypted Malicious Traffic Detection with Deep Learning and Other Machine Learning Algorithms [7.404682407709988]
The popularity of encryption mechanisms poses a great challenge to malicious traffic detection. Traditional detection techniques cannot work without the decryption of encrypted traffic. In this paper, we provide an in-depth analysis of traffic features and compare different state-of-the-art traffic feature creation approaches. We propose a novel concept for encrypted traffic features that is specifically designed for encrypted malicious traffic analysis.
arXiv Detail & Related papers (2023-04-07T15:25:36Z)
Graph Mining for Cybersecurity: A Survey [61.505995908021525]
The explosive growth of cyber attacks, such as malware, spam, and intrusions, has had severe consequences for society. Traditional Machine Learning (ML) based methods are extensively used in detecting cyber threats, but they hardly model the correlations between real-world cyber entities. With the proliferation of graph mining techniques, many researchers have investigated these techniques for capturing correlations between cyber entities and achieving high performance.
arXiv Detail & Related papers (2023-04-02T08:43:03Z)
Decoder Fusion RNN: Context and Interaction Aware Decoders for Trajectory Prediction [53.473846742702854]
We propose a recurrent, attention-based approach for motion forecasting. Decoder Fusion RNN (DF-RNN) is composed of a recurrent behavior encoder, an inter-agent multi-headed attention module, and a context-aware decoder. We demonstrate the efficacy of our method by testing it on the Argoverse motion forecasting dataset and show its state-of-the-art performance on the public benchmark.
arXiv Detail & Related papers (2021-08-12T15:53:37Z)
Malware Traffic Classification: Evaluation of Algorithms and an Automated Ground-truth Generation Pipeline [8.779666771357029]
We propose an automated packet data-labeling pipeline to generate ground-truth data. We explore and test different kinds of clustering approaches that make use of a unique and diverse set of features extracted from this observable meta-data.
arXiv Detail & Related papers (2020-10-22T11:48:51Z)
Federated Learning in Vehicular Networks [41.89469856322786]
The federated learning (FL) framework has been introduced as an efficient tool with the goal of reducing transmission overhead. In this paper, we investigate the usage of FL over centralized learning (CL) in vehicular network applications to develop intelligent transportation systems. We identify the major challenges from both the learning perspective, i.e., data labeling and model training, and the communications point of view, i.e., data rate, reliability, transmission overhead, privacy and resource management.
arXiv Detail & Related papers (2020-06-02T06:32:59Z)
Privacy-preserving Traffic Flow Prediction: A Federated Learning Approach [61.64006416975458]
We propose a privacy-preserving machine learning technique named Federated Learning-based Gated Recurrent Unit neural network algorithm (FedGRU) for traffic flow prediction. FedGRU differs from current centralized learning methods and updates universal learning models through a secure parameter aggregation mechanism. It is shown that FedGRU's prediction accuracy is 90.96% higher than the advanced deep learning models.
arXiv Detail & Related papers (2020-03-19T13:07:49Z)
Survey of Network Intrusion Detection Methods from the Perspective of the Knowledge Discovery in Databases Process [63.75363908696257]
We review the methods that have been applied to network data with the purpose of developing an intrusion detector. We discuss the techniques used for the capture, preparation and transformation of the data, as well as the data mining and evaluation methods. As a result of this literature review, we investigate some open issues which will need to be considered for further research in the area of network security.
arXiv Detail & Related papers (2020-01-27T11:21:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.