Towards a Transformer-Based Pre-trained Model for IoT Traffic Classification
- URL: http://arxiv.org/abs/2407.19051v1
- Date: Fri, 26 Jul 2024 19:13:11 GMT
- Authors: Bruna Bazaluk, Mosab Hamdan, Mustafa Ghaleb, Mohammed S. M. Gismalla, Flavio S. Correa da Silva, Daniel Macêdo Batista
- Abstract summary: State-of-the-art classification methods are based on Deep Learning.
In real-life situations, where IoT traffic data are scarce, such models do not perform well.
We propose the IoT Traffic Classification Transformer (ITCT), a transformer-based model pre-trained on a large labeled MQTT-based IoT traffic dataset.
Experiments demonstrated that the ITCT model significantly outperforms existing models, achieving an overall accuracy of 82%.
- Score: 0.6060461053918144
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The classification of IoT traffic is important for improving the efficiency and security of IoT-based networks. As the state-of-the-art classification methods are based on Deep Learning, most current models require a large amount of training data. In real-life situations, where IoT traffic data are scarce, such models do not perform well: they underperform outside their initial training conditions and fail to capture the complex characteristics of network traffic, rendering them inefficient and unreliable in real-world applications. In this paper, we propose the IoT Traffic Classification Transformer (ITCT), a novel approach built on the state-of-the-art transformer-based model TabTransformer. ITCT, which is pre-trained on a large labeled MQTT-based IoT traffic dataset and may be fine-tuned with a small set of labeled data, showed promising results in various traffic classification tasks. Our experiments demonstrated that the ITCT model significantly outperforms existing models, achieving an overall accuracy of 82%. To support reproducibility and collaborative development, all associated code has been made publicly available.
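The pre-train-then-fine-tune workflow the abstract describes can be sketched in a few lines. This is a minimal illustration only: the class, its toy "learning" rules, and every name here are hypothetical stand-ins, not the ITCT/TabTransformer implementation.

```python
class PretrainedClassifier:
    """Toy stand-in for a pre-trained tabular classifier."""

    def __init__(self):
        self.backbone_ready = False
        self.threshold = 0.5

    def pretrain(self, large_dataset):
        # Learn a feature scale from the large labeled corpus
        # (stand-in for pre-training the transformer backbone).
        self.scale = max(x for x, _ in large_dataset) or 1.0
        self.backbone_ready = True

    def finetune(self, small_dataset):
        # With the backbone fixed, adapt only a decision threshold
        # from a small set of labeled examples.
        assert self.backbone_ready, "fine-tuning requires a pre-trained backbone"
        pos = [x / self.scale for x, y in small_dataset if y == 1]
        self.threshold = min(pos) if pos else 0.5

    def predict(self, x):
        # Score a sample with the frozen "backbone" and tuned threshold.
        return 1 if x / self.scale >= self.threshold else 0
```

The point of the pattern is that the expensive step (pre-training) happens once on plentiful data, while adaptation to a new deployment needs only a handful of labels.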
Related papers
- Modeling IoT Traffic Patterns: Insights from a Statistical Analysis of an MTC Dataset [1.2289361708127877]
Internet-of-Things (IoT) is rapidly expanding, connecting numerous devices and becoming integral to our daily lives.
Effective IoT traffic management requires modeling and predicting intricate machine-type communication (MTC) dynamics.
We perform a comprehensive statistical analysis of the MTC traffic utilizing goodness-of-fit tests, including well-established tests such as Kolmogorov-Smirnov, Anderson-Darling, chi-squared, and root mean square error.
arXiv Detail & Related papers (2024-09-03T14:24:18Z) - Spatial-Temporal Attention Model for Traffic State Estimation with Sparse Internet of Vehicles [23.524936542317842]
We introduce a novel framework that utilizes sparse IoV data to achieve cost-effective traffic state estimation (TSE).
Particularly, we propose a novel spatial-temporal attention model called the convolutional retentive network (CRNet) to improve the TSE accuracy.
The model employs the convolutional neural network (CNN) for spatial correlation aggregation and the retentive network (RetNet) based on the attention mechanism to extract temporal correlations.
arXiv Detail & Related papers (2024-07-10T20:58:53Z) - Lens: A Foundation Model for Network Traffic [19.3652490585798]
Lens is a foundation model for network traffic that leverages the T5 architecture to learn the pre-trained representations from large-scale unlabeled data.
We design a novel loss that combines three distinct tasks: Masked Span Prediction (MSP), Packet Order Prediction (POP), and Homologous Traffic Prediction (HTP).
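A combined objective of this kind is typically a weighted sum of the per-task losses. The sketch below is an assumption about the general shape only: the function name and the equal default weights are illustrative, not Lens's actual configuration.

```python
def lens_style_loss(msp, pop, htp, weights=(1.0, 1.0, 1.0)):
    """Combine three pre-training objectives into one scalar loss.

    msp, pop, htp: per-task loss values (Masked Span Prediction,
    Packet Order Prediction, Homologous Traffic Prediction).
    The equal default weights are an assumption; the paper's actual
    weighting may differ.
    """
    w_msp, w_pop, w_htp = weights
    return w_msp * msp + w_pop * pop + w_htp * htp
```

In practice the weights balance tasks whose losses live on different scales, so they are usually tuned on validation data.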
arXiv Detail & Related papers (2024-02-06T02:45:13Z) - Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show, for the first time, that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
arXiv Detail & Related papers (2023-05-26T00:43:02Z) - Large-Scale Traffic Data Imputation with Spatiotemporal Semantic Understanding [26.86356769330179]
This study proposes the Graph Transformer for Traffic Imputation (GT-TDI) model to impute large-scale traffic data with semantic understanding of a network.
The proposed model takes incomplete data, social connectivity of sensors, and semantic descriptions as input to perform tasks with the help of Graph Neural Networks (GNN) and Transformer.
The results show that the proposed GT-TDI model outperforms existing methods under complex missing patterns and diverse missing rates.
arXiv Detail & Related papers (2023-01-27T13:02:19Z) - Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning -- simultaneously training a single model on multiple "upstream" and "downstream" tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance.
arXiv Detail & Related papers (2022-07-08T10:25:47Z) - FiT: Parameter Efficient Few-shot Transfer Learning for Personalized and Federated Image Classification [47.24770508263431]
We develop FiLM Transfer (FiT), which fulfills the requirements of the personalized and federated image classification setting.
FiT uses an automatically configured Naive Bayes classifier on top of a fixed backbone that has been pre-trained on large image datasets.
We show that FiT achieves better classification accuracy than the state-of-the-art Big Transfer (BiT) algorithm at low-shot and on the challenging VTAB-1k benchmark.
arXiv Detail & Related papers (2022-06-17T10:17:20Z) - Efficient Federated Learning with Spike Neural Networks for Traffic Sign Recognition [70.306089187104]
We introduce powerful Spike Neural Networks (SNNs) into traffic sign recognition for energy-efficient and fast model training.
Numerical results indicate that the proposed federated SNN outperforms traditional federated convolutional neural networks in terms of accuracy, noise immunity, and energy efficiency.
arXiv Detail & Related papers (2022-05-28T03:11:48Z) - Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge
Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
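A contextual-bandit selection loop of the kind described can be sketched with an epsilon-greedy rule over the per-layer models. The paper solves the bandit with a reinforcement-learning policy network, so the simpler rule below and all names in it are illustrative substitutes, not the authors' method.

```python
import random

def select_model(context, q_values, epsilon=0.1, rng=None):
    """Epsilon-greedy choice among detection models of increasing
    complexity (one per HEC layer). q_values[context] holds the
    estimated reward of each model for that context."""
    rng = rng or random.Random()
    values = q_values[context]
    if rng.random() < epsilon:
        return rng.randrange(len(values))                    # explore
    return max(range(len(values)), key=values.__getitem__)   # exploit

def update(q_values, context, arm, reward, lr=0.5):
    """Move the estimate for the chosen model toward the observed reward."""
    q = q_values[context]
    q[arm] += lr * (reward - q[arm])
```

The context (e.g. simple input features of the sample) steers easy cases to the cheap edge model and hard cases to the heavier models higher in the hierarchy.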
arXiv Detail & Related papers (2021-08-09T08:45:47Z) - Physics-Informed Deep Learning for Traffic State Estimation [3.779860024918729]
Traffic state estimation (TSE) reconstructs the traffic variables (e.g., density) on road segments using partially observed data.
This paper introduces a physics-informed deep learning (PIDL) framework to efficiently conduct high-quality TSE with small amounts of observed data.
arXiv Detail & Related papers (2021-01-17T03:28:32Z) - Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED)
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer when fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
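A regularized fine-tuning objective of this shape is the task loss plus a penalty pulling the target model's features toward the task-relevant features extracted from the source model. The squared-distance penalty, the weight `lam`, and the function name below are assumptions for illustration, not TRED's exact formulation.

```python
def finetune_loss(task_loss, target_feat, disentangled_feat, lam=0.1):
    """Fine-tuning objective with a feature-matching regularizer.

    target_feat: features from the model being fine-tuned.
    disentangled_feat: task-relevant features distilled from the
    source model (the regularization target).
    """
    reg = sum((a - b) ** 2 for a, b in zip(target_feat, disentangled_feat))
    return task_loss + lam * reg
```

Setting `lam = 0` recovers plain fine-tuning, which is the baseline the reported 2% improvement is measured against.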
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.