Deep Learning and Traffic Classification: Lessons learned from a
commercial-grade dataset with hundreds of encrypted and zero-day applications
- URL: http://arxiv.org/abs/2104.03182v1
- Date: Wed, 7 Apr 2021 15:21:22 GMT
- Authors: Lixuan Yang, Alessandro Finamore, Feng Jun, Dario Rossi
- Abstract summary: We share our experience on a commercial-grade DL traffic classification engine.
We identify known applications from encrypted traffic, as well as unknown zero-day applications.
We propose a novel technique, tailored for DL models, that is significantly more accurate and light-weight than the state of the art.
- Score: 72.02908263225919
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increasing success of Machine Learning (ML) and Deep Learning (DL) has
recently re-sparked interest towards traffic classification. While
classification of known traffic is a well-investigated subject, for which supervised
classification tools (such as ML and DL models) are known to provide
satisfactory performance, detection of unknown (or zero-day) traffic is more
challenging and typically handled by unsupervised techniques (such as
clustering algorithms).
In this paper, we share our experience on a commercial-grade DL traffic
classification engine that is able to (i) identify known applications from
encrypted traffic, as well as (ii) handle unknown zero-day applications. In
particular, our contribution for (i) is to perform a thorough assessment of
state-of-the-art traffic classifiers in commercial-grade settings comprising
a few thousand very fine-grained application labels, as opposed to the few
tens of classes generally targeted in academic evaluations. Additionally, we
contribute to the problem of (ii) detection of zero-day applications by
proposing a novel technique, tailored for DL models, that is significantly more
accurate and light-weight than the state of the art.
Summarizing our main findings, we gather that (i) ML and DL models are
equally able to provide a satisfactory solution for the classification of known
traffic, whereas (ii) the non-linear feature extraction process of the DL
backbone provides sizeable advantages for the detection of unknown classes.
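To make the known versus zero-day distinction in the abstract concrete, here is a minimal sketch of one generic open-set baseline: reject a flow as "unknown" when the classifier's top softmax probability falls below a threshold. This is not the technique proposed in the paper, which the abstract does not describe. The function names, the example labels, and the 0.9 threshold are all hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_or_reject(logits, labels, threshold=0.9):
    """Return the predicted known label, or "unknown" if confidence is low.

    Assumption: `logits` are the outputs of some already-trained classifier
    over the known applications; `threshold` is an illustrative value that
    would need tuning on real data.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown"  # candidate zero-day application
    return labels[best]


# Hypothetical application labels, for illustration only.
known_apps = ["video", "chat", "voip"]
confident = classify_or_reject([10.0, 0.0, 0.0], known_apps)
ambiguous = classify_or_reject([1.0, 1.0, 1.0], known_apps)
```

In this sketch a strongly peaked score vector is mapped to a known label, while a flat one is rejected as a possible zero-day application.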
Related papers
- Enhancing Visual Continual Learning with Language-Guided Supervision [76.38481740848434]
Continual learning aims to empower models to learn new tasks without forgetting previously acquired knowledge.
We argue that the scarce semantic information conveyed by the one-hot labels hampers the effective knowledge transfer across tasks.
Specifically, we use PLMs to generate semantic targets for each class, which are frozen and serve as supervision signals.
arXiv Detail & Related papers (2024-03-24T12:41:58Z)
- Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph
Propagation [68.13453771001522]
We propose a multimodal intensive ZSL framework that matches regions of images with corresponding semantic embeddings.
We conduct extensive experiments and evaluate our model on large-scale real-world data.
arXiv Detail & Related papers (2023-06-14T13:07:48Z)
- Many or Few Samples? Comparing Transfer, Contrastive and Meta-Learning
in Encrypted Traffic Classification [68.19713459228369]
We compare transfer learning, meta-learning and contrastive learning against reference Machine Learning (ML) tree-based and monolithic DL models.
We show that (i) using large datasets we can obtain more general representations, (ii) contrastive learning is the best methodology.
While tree-based ML models cannot handle large tasks but fit small tasks well, DL methods, by reusing learned representations, reach the performance of tree-based models on small tasks too.
arXiv Detail & Related papers (2023-05-21T11:20:49Z)
- Open-Source Framework for Encrypted Internet and Malicious Traffic
Classification [4.495583520377878]
Internet traffic classification plays a key role in network visibility, Quality of Service (QoS), intrusion detection, Quality of Experience (QoE) and traffic-trend analyses.
In this paper, we propose an open-source framework, named OSF-EIMTC, which can provide the full pipeline of the learning process.
arXiv Detail & Related papers (2022-06-21T07:01:57Z)
- When a RF Beats a CNN and GRU, Together -- A Comparison of Deep Learning
and Classical Machine Learning Approaches for Encrypted Malware Traffic
Classification [4.495583520377878]
We show that in the case of malicious traffic classification, state-of-the-art DL-based solutions do not necessarily outperform the classical ML-based ones.
We exemplify this finding using two well-known datasets for a varied set of tasks, such as: malware detection, malware family classification, detection of zero-day attacks, and classification of an iteratively growing dataset.
arXiv Detail & Related papers (2022-06-16T08:59:53Z)
- Knowledge-driven Active Learning [70.37119719069499]
Active learning strategies aim at minimizing the amount of labelled data required to train a Deep Learning model.
Most active strategies are based on uncertain sample selection, and are often even restricted to samples lying close to the decision boundary.
Here we propose to take into consideration common domain-knowledge and enable non-expert users to train a model with fewer samples.
arXiv Detail & Related papers (2021-10-15T06:11:53Z)
- A First Look at Class Incremental Learning in Deep Learning Mobile
Traffic Classification [68.11005070665364]
We explore Incremental Learning (IL) techniques to add new classes to models without a full retraining, hence speeding up the model update cycle.
We consider iCaRL, a state-of-the-art IL method, and MIRAGE-2019, a public dataset with traffic from 40 Android apps.
Although our analysis reveals their infancy, IL techniques are a promising research area on the roadmap towards automated DL-based traffic analysis systems.
arXiv Detail & Related papers (2021-07-09T14:28:16Z)
- Transfer Learning for Aided Target Recognition: Comparing Deep Learning
to other Machine Learning Approaches [0.0]
Aided target recognition (AiTR) is an important problem with applications across industry and defense.
Deep learning (DL) provides exceptional modeling flexibility and accuracy on recent real world problems.
Our goal is to address this shortcoming by comparing transfer learning within a DL framework to other ML approaches across transfer tasks and datasets.
arXiv Detail & Related papers (2020-11-25T14:25:49Z)
- DeepMAL -- Deep Learning Models for Malware Traffic Detection and
Classification [4.187494796512101]
We introduce DeepMAL, a DL model which is able to capture the underlying statistics of malicious traffic.
We show that DeepMAL can detect and classify malware flows with high accuracy, outperforming traditional, shallow-like models.
arXiv Detail & Related papers (2020-03-03T16:54:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.