When a RF Beats a CNN and GRU, Together -- A Comparison of Deep Learning
and Classical Machine Learning Approaches for Encrypted Malware Traffic
Classification
- URL: http://arxiv.org/abs/2206.08004v1
- Date: Thu, 16 Jun 2022 08:59:53 GMT
- Title: When a RF Beats a CNN and GRU, Together -- A Comparison of Deep Learning
and Classical Machine Learning Approaches for Encrypted Malware Traffic
Classification
- Authors: Adi Lichy, Ofek Bader, Ran Dubin, Amit Dvir, Chen Hajaj
- Abstract summary: We show that in the case of malicious traffic classification, state-of-the-art DL-based solutions do not necessarily outperform the classical ML-based ones.
We exemplify this finding using two well-known datasets for a varied set of tasks, such as: malware detection, malware family classification, detection of zero-day attacks, and classification of an iteratively growing dataset.
- Score: 4.495583520377878
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Internet traffic classification is widely used to facilitate network
management. It plays a crucial role in Quality of Services (QoS), Quality of
Experience (QoE), network visibility, intrusion detection, and traffic trend
analyses. While there is no theoretical guarantee that deep learning (DL)-based
solutions perform better than classic machine learning (ML)-based ones,
DL-based models have become the common default. This paper compares well-known
DL-based and ML-based models and shows that in the case of malicious traffic
classification, state-of-the-art DL-based solutions do not necessarily
outperform the classical ML-based ones. We exemplify this finding using two
well-known datasets for a varied set of tasks, such as: malware detection,
malware family classification, detection of zero-day attacks, and
classification of an iteratively growing dataset. Note that, it is not feasible
to evaluate all possible models to make a concrete statement, thus, the above
finding is not a recommendation to avoid DL-based models, but rather empirical
proof that in some cases, there are more simplistic solutions, that may perform
even better.
Related papers
- A Survey of Malware Detection Using Deep Learning [6.349503549199403]
This paper investigates advances in malware detection on Windows, iOS, Android, and Linux using deep learning (DL)
We discuss the issues and the challenges in malware detection using DL classifiers.
We examine eight popular DL approaches on various datasets.
arXiv Detail & Related papers (2024-07-27T02:49:55Z) - Many or Few Samples? Comparing Transfer, Contrastive and Meta-Learning
in Encrypted Traffic Classification [68.19713459228369]
We compare transfer learning, meta-learning and contrastive learning against reference Machine Learning (ML) tree-based and monolithic DL models.
We show that (i) using large datasets we can obtain more general representations, (ii) contrastive learning is the best methodology.
While ML tree-based cannot handle large tasks but fits well small tasks, by means of reusing learned representations, DL methods are reaching tree-based models performance also for small tasks.
arXiv Detail & Related papers (2023-05-21T11:20:49Z) - DOC-NAD: A Hybrid Deep One-class Classifier for Network Anomaly
Detection [0.0]
Machine Learning approaches have been used to enhance the detection capabilities of Network Intrusion Detection Systems (NIDSs)
Recent work has achieved near-perfect performance by following binary- and multi-class network anomaly detection tasks.
This paper proposes a Deep One-Class (DOC) classifier for network intrusion detection by only training on benign network data samples.
arXiv Detail & Related papers (2022-12-15T00:08:05Z) - Open-Source Framework for Encrypted Internet and Malicious Traffic
Classification [4.495583520377878]
Internet traffic classification plays a key role in network visibility, Quality of Services (QoS), intrusion detection, Quality of Experience (QoE) and traffic-trend analyses.
In this paper, we propose an open-source framework, named OSF-EIMTC, which can provide the full pipeline of the learning process.
arXiv Detail & Related papers (2022-06-21T07:01:57Z) - Semantic Representation and Dependency Learning for Multi-Label Image
Recognition [76.52120002993728]
We propose a novel and effective semantic representation and dependency learning (SRDL) framework to learn category-specific semantic representation for each category.
Specifically, we design a category-specific attentional regions (CAR) module to generate channel/spatial-wise attention matrices to guide model.
We also design an object erasing (OE) module to implicitly learn semantic dependency among categories by erasing semantic-aware regions.
arXiv Detail & Related papers (2022-04-08T00:55:15Z) - Enhancing the Generalization for Intent Classification and Out-of-Domain
Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU)
Recent works have shown that using extra data and labels can improve the OOD detection performance.
This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical representation of tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Deep Learning and Traffic Classification: Lessons learned from a
commercial-grade dataset with hundreds of encrypted and zero-day applications [72.02908263225919]
We share our experience on a commercial-grade DL traffic classification engine.
We identify known applications from encrypted traffic, as well as unknown zero-day applications.
We propose a novel technique, tailored for DL models, that is significantly more accurate and light-weight than the state of the art.
arXiv Detail & Related papers (2021-04-07T15:21:22Z) - Learning Adaptive Embedding Considering Incremental Class [55.21855842960139]
Class-Incremental Learning (CIL) aims to train a reliable model with the streaming data, which emerges unknown classes sequentially.
Different from traditional closed set learning, CIL has two main challenges: 1) Novel class detection.
After the novel classes are detected, the model needs to be updated without re-training using entire previous data.
arXiv Detail & Related papers (2020-08-31T04:11:24Z) - DeepMAL -- Deep Learning Models for Malware Traffic Detection and
Classification [4.187494796512101]
We introduce DeepMAL, a DL model which is able to capture the underlying statistics of malicious traffic.
We show that DeepMAL can detect and classify malware flows with high accuracy, outperforming traditional, shallow-like models.
arXiv Detail & Related papers (2020-03-03T16:54:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.