Related papers: When a RF Beats a CNN and GRU, Together -- A Comparison of Deep Learning and Classical Machine Learning Approaches for Encrypted Malware Traffic Classification

URL: http://arxiv.org/abs/2206.08004v1
Date: Thu, 16 Jun 2022 08:59:53 GMT
Title: When a RF Beats a CNN and GRU, Together -- A Comparison of Deep Learning and Classical Machine Learning Approaches for Encrypted Malware Traffic Classification
Authors: Adi Lichy, Ofek Bader, Ran Dubin, Amit Dvir, Chen Hajaj
Abstract summary: We show that in the case of malicious traffic classification, state-of-the-art DL-based solutions do not necessarily outperform the classical ML-based ones. We exemplify this finding using two well-known datasets for a varied set of tasks, such as: malware detection, malware family classification, detection of zero-day attacks, and classification of an iteratively growing dataset.
Score: 4.495583520377878
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Internet traffic classification is widely used to facilitate network management. It plays a crucial role in Quality of Service (QoS), Quality of Experience (QoE), network visibility, intrusion detection, and traffic trend analyses. While there is no theoretical guarantee that deep learning (DL)-based solutions perform better than classic machine learning (ML)-based ones, DL-based models have become the common default. This paper compares well-known DL-based and ML-based models and shows that in the case of malicious traffic classification, state-of-the-art DL-based solutions do not necessarily outperform the classical ML-based ones. We exemplify this finding using two well-known datasets for a varied set of tasks, such as: malware detection, malware family classification, detection of zero-day attacks, and classification of an iteratively growing dataset. Note that it is not feasible to evaluate all possible models to make a concrete statement; thus, the above finding is not a recommendation to avoid DL-based models, but rather empirical proof that in some cases there are simpler solutions that may perform even better.

Related papers

CREDIT: Certified Ownership Verification of Deep Neural Networks Against Model Extraction Attacks [54.04030169323115]
We introduce CREDIT, a certified ownership verification against Model Extraction Attacks (MEAs). We quantify the similarity between DNN models, propose a practical verification threshold, and provide rigorous theoretical guarantees for ownership verification based on this threshold. We extensively evaluate our approach on several mainstream datasets across different domains and tasks, achieving state-of-the-art performance.
arXiv Detail & Related papers (2026-02-23T23:36:25Z)
How Robust Are Router-LLMs? Analysis of the Fragility of LLM Routing Capabilities [62.474732677086855]
Large language model (LLM) routing has emerged as a crucial strategy for balancing computational costs with performance. We propose the DSC benchmark: Diverse, Simple, and Categorized, an evaluation framework that categorizes router performance across a broad spectrum of query types.
arXiv Detail & Related papers (2025-03-20T19:52:30Z)
Benchmarking Android Malware Detection: Rethinking the Role of Traditional and Deep Learning Models [6.9053043489744015]
Android malware detection has been extensively studied using both traditional machine learning (ML) and deep learning (DL) approaches. While many state-of-the-art detection models claim superior performance, they often rely on limited comparisons. This raises concerns about the robustness of DL-based approaches' performance and the potential oversight of simpler, more efficient ML models.
arXiv Detail & Related papers (2025-02-20T20:56:05Z)
A Survey of Malware Detection Using Deep Learning [6.349503549199403]
This paper investigates advances in malware detection on Windows, iOS, Android, and Linux using deep learning (DL). We discuss the issues and the challenges in malware detection using DL classifiers. We examine eight popular DL approaches on various datasets.
arXiv Detail & Related papers (2024-07-27T02:49:55Z)
Many or Few Samples? Comparing Transfer, Contrastive and Meta-Learning in Encrypted Traffic Classification [68.19713459228369]
We compare transfer learning, meta-learning and contrastive learning against reference Machine Learning (ML) tree-based and monolithic DL models. We show that (i) using large datasets we can obtain more general representations, (ii) contrastive learning is the best methodology. While tree-based ML models cannot handle large tasks but fit small tasks well, DL methods, by reusing learned representations, are reaching the performance of tree-based models on small tasks as well.
arXiv Detail & Related papers (2023-05-21T11:20:49Z)
DOC-NAD: A Hybrid Deep One-class Classifier for Network Anomaly Detection [0.0]
Machine Learning approaches have been used to enhance the detection capabilities of Network Intrusion Detection Systems (NIDSs). Recent work has achieved near-perfect performance by following binary- and multi-class network anomaly detection tasks. This paper proposes a Deep One-Class (DOC) classifier for network intrusion detection by only training on benign network data samples.
arXiv Detail & Related papers (2022-12-15T00:08:05Z)
Open-Source Framework for Encrypted Internet and Malicious Traffic Classification [4.495583520377878]
Internet traffic classification plays a key role in network visibility, Quality of Service (QoS), intrusion detection, Quality of Experience (QoE) and traffic-trend analyses. In this paper, we propose an open-source framework, named OSF-EIMTC, which can provide the full pipeline of the learning process.
arXiv Detail & Related papers (2022-06-21T07:01:57Z)
Semantic Representation and Dependency Learning for Multi-Label Image Recognition [76.52120002993728]
We propose a novel and effective semantic representation and dependency learning (SRDL) framework to learn category-specific semantic representation for each category. Specifically, we design a category-specific attentional regions (CAR) module to generate channel/spatial-wise attention matrices to guide the model. We also design an object erasing (OE) module to implicitly learn semantic dependency among categories by erasing semantic-aware regions.
arXiv Detail & Related papers (2022-04-08T00:55:15Z)
Enhancing the Generalization for Intent Classification and Out-of-Domain Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU). Recent works have shown that using extra data and labels can improve the OOD detection performance. This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z)
Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks. This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
Deep Learning and Traffic Classification: Lessons learned from a commercial-grade dataset with hundreds of encrypted and zero-day applications [72.02908263225919]
We share our experience on a commercial-grade DL traffic classification engine. We identify known applications from encrypted traffic, as well as unknown zero-day applications. We propose a novel technique, tailored for DL models, that is significantly more accurate and light-weight than the state of the art.
arXiv Detail & Related papers (2021-04-07T15:21:22Z)
Learning Adaptive Embedding Considering Incremental Class [55.21855842960139]
Class-Incremental Learning (CIL) aims to train a reliable model with streaming data in which unknown classes emerge sequentially. Different from traditional closed-set learning, CIL has two main challenges: 1) novel class detection; 2) after the novel classes are detected, the model needs to be updated without re-training on the entire previous data.
arXiv Detail & Related papers (2020-08-31T04:11:24Z)
DeepMAL -- Deep Learning Models for Malware Traffic Detection and Classification [4.187494796512101]
We introduce DeepMAL, a DL model which is able to capture the underlying statistics of malicious traffic. We show that DeepMAL can detect and classify malware flows with high accuracy, outperforming traditional, shallow-like models.
arXiv Detail & Related papers (2020-03-03T16:54:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.