Related papers: Deep Learning and Traffic Classification: Lessons learned from a commercial-grade dataset with hundreds of encrypted and zero-day applications

URL: http://arxiv.org/abs/2104.03182v1
Date: Wed, 7 Apr 2021 15:21:22 GMT
Title: Deep Learning and Traffic Classification: Lessons learned from a commercial-grade dataset with hundreds of encrypted and zero-day applications
Authors: Lixuan Yang, Alessandro Finamore, Feng Jun, Dario Rossi
Abstract summary: We share our experience on a commercial-grade DL traffic classification engine. We identify known applications from encrypted traffic, as well as unknown zero-day applications. We propose a novel technique, tailored for DL models, that is significantly more accurate and light-weight than the state of the art.
Score: 72.02908263225919
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The increasing success of Machine Learning (ML) and Deep Learning (DL) has recently re-sparked interest in traffic classification. While classification of known traffic is a well-investigated subject, with supervised classification tools (such as ML and DL models) known to provide satisfactory performance, detection of unknown (or zero-day) traffic is more challenging and typically handled by unsupervised techniques (such as clustering algorithms). In this paper, we share our experience on a commercial-grade DL traffic classification engine that is able to (i) identify known applications from encrypted traffic, as well as (ii) handle unknown zero-day applications. In particular, our contribution for (i) is to perform a thorough assessment of state-of-the-art traffic classifiers in commercial-grade settings comprising a few thousand very fine-grained application labels, as opposed to the few tens of classes generally targeted in academic evaluations. Additionally, we contribute to the problem of (ii) detection of zero-day applications by proposing a novel technique, tailored for DL models, that is significantly more accurate and light-weight than the state of the art. Summarizing our main findings, we gather that (i) ML and DL models are equally able to provide a satisfactory solution for classification of known traffic, while (ii) the non-linear feature extraction process of the DL backbone provides sizeable advantages for the detection of unknown classes.

Related papers

Overtake Detection in Trucks Using CAN Bus Signals: A Comparative Study of Machine Learning Methods [51.28632782308621]
We focus on overtake detection using Controller Area Network (CAN) bus data collected from five in-service trucks provided by the Volvo Group. We evaluate three common classifiers for vehicle manoeuvre detection: Artificial Neural Networks (ANN), Random Forest (RF), and Support Vector Machines (SVM). Our per-truck analysis also reveals that classification accuracy, especially for overtakes, depends on the amount of training data per vehicle.
arXiv Detail & Related papers (2025-07-01T09:20:41Z)
Enhancing Visual Continual Learning with Language-Guided Supervision [76.38481740848434]
Continual learning aims to empower models to learn new tasks without forgetting previously acquired knowledge. We argue that the scarce semantic information conveyed by the one-hot labels hampers the effective knowledge transfer across tasks. Specifically, we use PLMs to generate semantic targets for each class, which are frozen and serve as supervision signals.
arXiv Detail & Related papers (2024-03-24T12:41:58Z)
Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation [68.13453771001522]
We propose a multimodal intensive ZSL framework that matches regions of images with corresponding semantic embeddings. We conduct extensive experiments and evaluate our model on large-scale real-world data.
arXiv Detail & Related papers (2023-06-14T13:07:48Z)
Many or Few Samples? Comparing Transfer, Contrastive and Meta-Learning in Encrypted Traffic Classification [68.19713459228369]
We compare transfer learning, meta-learning and contrastive learning against reference Machine Learning (ML) tree-based and monolithic DL models. We show that (i) using large datasets we can obtain more general representations, and (ii) contrastive learning is the best methodology. While tree-based ML models cannot handle large tasks but fit small tasks well, DL methods, by reusing learned representations, are reaching the performance of tree-based models for small tasks as well.
arXiv Detail & Related papers (2023-05-21T11:20:49Z)
Open-Source Framework for Encrypted Internet and Malicious Traffic Classification [4.495583520377878]
Internet traffic classification plays a key role in network visibility, Quality of Service (QoS), intrusion detection, Quality of Experience (QoE) and traffic-trend analyses. In this paper, we propose an open-source framework, named OSF-EIMTC, which can provide the full pipeline of the learning process.
arXiv Detail & Related papers (2022-06-21T07:01:57Z)
When a RF Beats a CNN and GRU, Together -- A Comparison of Deep Learning and Classical Machine Learning Approaches for Encrypted Malware Traffic Classification [4.495583520377878]
We show that in the case of malicious traffic classification, state-of-the-art DL-based solutions do not necessarily outperform the classical ML-based ones. We exemplify this finding using two well-known datasets for a varied set of tasks, such as: malware detection, malware family classification, detection of zero-day attacks, and classification of an iteratively growing dataset.
arXiv Detail & Related papers (2022-06-16T08:59:53Z)
Knowledge-driven Active Learning [70.37119719069499]
Active learning strategies aim at minimizing the amount of labelled data required to train a Deep Learning model. Most active strategies are based on uncertain sample selection, and are often even restricted to samples lying close to the decision boundary. Here we propose to take into consideration common domain-knowledge and enable non-expert users to train a model with fewer samples.
arXiv Detail & Related papers (2021-10-15T06:11:53Z)
A First Look at Class Incremental Learning in Deep Learning Mobile Traffic Classification [68.11005070665364]
We explore Incremental Learning (IL) techniques to add new classes to models without a full retraining, hence speeding up the model update cycle. We consider iCarl, a state-of-the-art IL method, and MIRAGE-2019, a public dataset with traffic from 40 Android apps. Although our analysis reveals their infancy, IL techniques are a promising research area on the roadmap towards automated DL-based traffic analysis systems.
arXiv Detail & Related papers (2021-07-09T14:28:16Z)
Transfer Learning for Aided Target Recognition: Comparing Deep Learning to other Machine Learning Approaches [0.0]
Aided target recognition (AiTR) is an important problem with applications across industry and defense. Deep learning (DL) provides exceptional modeling flexibility and accuracy on recent real world problems. Our goal is to address this shortcoming by comparing transfer learning within a DL framework to other ML approaches across transfer tasks and datasets.
arXiv Detail & Related papers (2020-11-25T14:25:49Z)
DeepMAL -- Deep Learning Models for Malware Traffic Detection and Classification [4.187494796512101]
We introduce DeepMAL, a DL model which is able to capture the underlying statistics of malicious traffic. We show that DeepMAL can detect and classify malware flows with high accuracy, outperforming traditional, shallow-like models.
arXiv Detail & Related papers (2020-03-03T16:54:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.