Generic Multi-modal Representation Learning for Network Traffic Analysis
- URL: http://arxiv.org/abs/2405.02649v1
- Date: Sat, 4 May 2024 12:24:29 GMT
- Title: Generic Multi-modal Representation Learning for Network Traffic Analysis
- Authors: Luca Gioacchini, Idilio Drago, Marco Mellia, Zied Ben Houidi, Dario Rossi,
- Abstract summary: Network traffic analysis is fundamental for network management, troubleshooting, and security.
We propose a flexible Multi-modal Autoencoder (MAE) pipeline that can solve different use cases.
We argue that the MAE architecture is generic and can be used to learn representations useful in multiple scenarios.
- Score: 6.372999570085887
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Network traffic analysis is fundamental for network management, troubleshooting, and security. Tasks such as traffic classification, anomaly detection, and novelty discovery are fundamental for extracting operational information from network data and measurements. We witness the shift from deep packet inspection and basic machine learning to Deep Learning (DL) approaches where researchers define and test a custom DL architecture designed for each specific problem. We here advocate the need for a general DL architecture flexible enough to solve different traffic analysis tasks. We test this idea by proposing a DL architecture based on generic data adaptation modules, followed by an integration module that summarises the extracted information into a compact and rich intermediate representation (i.e. embeddings). The result is a flexible Multi-modal Autoencoder (MAE) pipeline that can solve different use cases. We demonstrate the architecture with traffic classification (TC) tasks since they allow us to quantitatively compare results with state-of-the-art solutions. However, we argue that the MAE architecture is generic and can be used to learn representations useful in multiple scenarios. On TC, the MAE performs on par or better than alternatives while avoiding cumbersome feature engineering, thus streamlining the adoption of DL solutions for traffic analysis.
Related papers
- Lens: A Foundation Model for Network Traffic [19.3652490585798]
Lens is a foundation model for network traffic that leverages the T5 architecture to learn the pre-trained representations from large-scale unlabeled data.
We design a novel loss that combines three distinct tasks: Masked Span Prediction (MSP), Packet Order Prediction (POP), and Homologous Traffic Prediction (HTP)
arXiv Detail & Related papers (2024-02-06T02:45:13Z) - Modular Blended Attention Network for Video Question Answering [1.131316248570352]
We present an approach to facilitate the question with a reusable and composable neural unit.
We have conducted experiments on three commonly used datasets.
arXiv Detail & Related papers (2023-11-02T14:22:17Z) - Serving Deep Learning Model in Relational Databases [70.53282490832189]
Serving deep learning (DL) models on relational data has become a critical requirement across diverse commercial and scientific domains.
We highlight three pivotal paradigms: The state-of-the-art DL-centric architecture offloads DL computations to dedicated DL frameworks.
The potential UDF-centric architecture encapsulates one or more tensor computations into User Defined Functions (UDFs) within the relational database management system (RDBMS)
arXiv Detail & Related papers (2023-10-07T06:01:35Z) - Building Interpretable and Reliable Open Information Retriever for New
Domains Overnight [67.03842581848299]
Information retrieval is a critical component for many down-stream tasks such as open-domain question answering (QA)
We propose an information retrieval pipeline that uses entity/event linking model and query decomposition model to focus more accurately on different information units of the query.
We show that, while being more interpretable and reliable, our proposed pipeline significantly improves passage coverages and denotation accuracies across five IR and QA benchmarks.
arXiv Detail & Related papers (2023-08-09T07:47:17Z) - PyRCA: A Library for Metric-based Root Cause Analysis [66.72542200701807]
PyRCA is an open-source machine learning library of Root Cause Analysis (RCA) for Artificial Intelligence for IT Operations (AIOps)
It provides a holistic framework to uncover the complicated metric causal dependencies and automatically locate root causes of incidents.
arXiv Detail & Related papers (2023-06-20T09:55:10Z) - Many or Few Samples? Comparing Transfer, Contrastive and Meta-Learning
in Encrypted Traffic Classification [68.19713459228369]
We compare transfer learning, meta-learning and contrastive learning against reference Machine Learning (ML) tree-based and monolithic DL models.
We show that (i) using large datasets we can obtain more general representations, (ii) contrastive learning is the best methodology.
While ML tree-based cannot handle large tasks but fits well small tasks, by means of reusing learned representations, DL methods are reaching tree-based models performance also for small tasks.
arXiv Detail & Related papers (2023-05-21T11:20:49Z) - Deep Learning Serves Traffic Safety Analysis: A Forward-looking Review [4.228522109021283]
We present a typical processing pipeline, which can be used to understand and interpret traffic videos.
This processing framework includes several steps, including video enhancement, video stabilization, semantic and incident segmentation, object detection and classification, trajectory extraction, speed estimation, event analysis, modeling and anomaly detection.
arXiv Detail & Related papers (2022-03-07T17:21:07Z) - Mapping the Internet: Modelling Entity Interactions in Complex
Heterogeneous Networks [0.0]
We propose a versatile, unified framework called HMill' for sample representation, model definition and training.
We show an extension of the universal approximation theorem to the set of all functions realized by models implemented in the framework.
We solve three different problems from the cybersecurity domain using the framework.
arXiv Detail & Related papers (2021-04-19T21:32:44Z) - Multi-intersection Traffic Optimisation: A Benchmark Dataset and a
Strong Baseline [85.9210953301628]
Control of traffic signals is fundamental and critical to alleviate traffic congestion in urban areas.
Because of the high complexity of modelling the problem, experimental settings of current works are often inconsistent.
We propose a novel and strong baseline model based on deep reinforcement learning with the encoder-decoder structure.
arXiv Detail & Related papers (2021-01-24T03:55:39Z) - Automated Search for Resource-Efficient Branched Multi-Task Networks [81.48051635183916]
We propose a principled approach, rooted in differentiable neural architecture search, to automatically define branching structures in a multi-task neural network.
We show that our approach consistently finds high-performing branching structures within limited resource budgets.
arXiv Detail & Related papers (2020-08-24T09:49:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.