Revolutionizing Cyber Threat Detection with Large Language Models: A
privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices
- URL: http://arxiv.org/abs/2306.14263v2
- Date: Thu, 8 Feb 2024 07:06:39 GMT
- Title: Revolutionizing Cyber Threat Detection with Large Language Models: A
privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices
- Authors: Mohamed Amine Ferrag, Mthandazo Ndhlovu, Norbert Tihanyi, Lucas C.
Cordeiro, Merouane Debbah, Thierry Lestable, Narinderjit Singh Thandi
- Abstract summary: This paper presents SecurityBERT, a novel architecture that leverages the Bidirectional Representations from Transformers (BERT) model for cyber threat detection in IoT networks.
Our research demonstrates that SecurityBERT outperforms traditional Machine Learning (ML) and Deep Learning (DL) methods, such as Convolutional Neural Networks (CNNIoTs) or Recurrent Neural Networks (IoTRNNs) in cyber threat detection.
SecurityBERT achieved an impressive 98.2% overall accuracy in identifying fourteen distinct attack types, surpassing previous records set by hybrid solutions.
- Score: 3.340416780217405
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The field of Natural Language Processing (NLP) is currently undergoing a
revolutionary transformation driven by the power of pre-trained Large Language
Models (LLMs) based on groundbreaking Transformer architectures. As the
frequency and diversity of cybersecurity attacks continue to rise, the
importance of incident detection has significantly increased. IoT devices are
expanding rapidly, resulting in a growing need for efficient techniques to
autonomously identify network-based attacks in IoT networks with both high
precision and minimal computational requirements. This paper presents
SecurityBERT, a novel architecture that leverages the Bidirectional Encoder
Representations from Transformers (BERT) model for cyber threat detection in
IoT networks. During the training of SecurityBERT, we incorporated a novel
privacy-preserving encoding technique called Privacy-Preserving Fixed-Length
Encoding (PPFLE). We effectively represented network traffic data in a
structured format by combining PPFLE with the Byte-level Byte-Pair Encoder
(BBPE) Tokenizer. Our research demonstrates that SecurityBERT outperforms
traditional Machine Learning (ML) and Deep Learning (DL) methods, such as
Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs), in
cyber threat detection. Employing the Edge-IIoTset cybersecurity dataset, our
experimental analysis shows that SecurityBERT achieved an impressive 98.2%
overall accuracy in identifying fourteen distinct attack types, surpassing
previous records set by hybrid solutions such as GAN-Transformer-based
architectures and CNN-LSTM models. With an inference time of less than 0.15
seconds on an average CPU and a compact model size of just 16.7MB, SecurityBERT
is ideally suited for real-life traffic analysis and a suitable choice for
deployment on resource-constrained IoT devices.
Related papers
- CTINEXUS: Leveraging Optimized LLM In-Context Learning for Constructing Cybersecurity Knowledge Graphs Under Data Scarcity [49.657358248788945]
Textual descriptions in cyber threat intelligence (CTI) reports are rich sources of knowledge about cyber threats.
Current CTI extraction methods lack flexibility and generalizability, often resulting in inaccurate and incomplete knowledge extraction.
We propose CTINexus, a novel framework leveraging optimized in-context learning (ICL) of large language models.
arXiv Detail & Related papers (2024-10-28T14:18:32Z) - Beyond Detection: Leveraging Large Language Models for Cyber Attack Prediction in IoT Networks [4.836070911511429]
This paper proposes a novel network intrusion prediction framework that combines Large Language Models (LLMs) with Long Short Term Memory (LSTM) networks.
Our framework, evaluated on the CICIoT2023 IoT attack dataset, demonstrates a significant improvement in predictive capabilities, achieving an overall accuracy of 98%.
arXiv Detail & Related papers (2024-08-26T06:57:22Z) - Efficient Intrusion Detection: Combining $χ^2$ Feature Selection with CNN-BiLSTM on the UNSW-NB15 Dataset [2.239394800147746]
Intrusion Detection Systems (IDSs) have played a significant role in the detection and prevention of cyber-attacks in traditional computing systems.
The limited computational resources available on Internet of Things (IoT) devices pose a challenge for deploying conventional computing-based IDSs.
We present an effective IDS model that addresses this issue by combining a lightweight Convolutional Neural Network (CNN) with bidirectional Long Short-Term Memory (BiLSTM)
arXiv Detail & Related papers (2024-07-20T17:41:16Z) - A Cutting-Edge Deep Learning Method For Enhancing IoT Security [0.0]
This paper proposes an innovative design of the Internet of Things (IoT) Environment Intrusion Detection System (or IDS) using Deep Learning-integrated Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks.
Our model, based on the CICIDS 2017 dataset, achieved an accuracy of 99.52% in classifying network traffic as either benign or malicious.
arXiv Detail & Related papers (2024-06-18T08:42:51Z) - Lightweight CNN-BiLSTM based Intrusion Detection Systems for Resource-Constrained IoT Devices [38.16309790239142]
Intrusion Detection Systems (IDSs) have played a significant role in detecting and preventing cyber-attacks within traditional computing systems.
The limited computational resources available on Internet of Things (IoT) devices make it challenging to deploy conventional computing-based IDSs.
We propose a hybrid CNN architecture composed of a lightweight CNN and bidirectional LSTM (BiLSTM) to enhance the performance of IDS on the UNSW-NB15 dataset.
arXiv Detail & Related papers (2024-06-04T20:36:21Z) - TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z) - Semantic Communication Enabling Robust Edge Intelligence for
Time-Critical IoT Applications [87.05763097471487]
This paper aims to design robust Edge Intelligence using semantic communication for time-critical IoT applications.
We analyze the effect of image DCT coefficients on inference accuracy and propose the channel-agnostic effectiveness encoding for offloading.
arXiv Detail & Related papers (2022-11-24T20:13:17Z) - Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge
Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC)
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z) - Semi-supervised Variational Temporal Convolutional Network for IoT
Communication Multi-anomaly Detection [3.3659034873495632]
Internet of Things (IoT) devices are constructed to build a huge communications network.
These devices are insecure in reality, it means that the communications network are exposed by the attacker.
In this paper, we propose SS-VTCN, a semi-supervised network for IoT multiple anomaly detection.
arXiv Detail & Related papers (2021-04-05T08:51:24Z) - Enabling certification of verification-agnostic networks via
memory-efficient semidefinite programming [97.40955121478716]
We propose a first-order dual SDP algorithm that requires memory only linear in the total number of network activations.
We significantly improve L-inf verified robust accuracy from 1% to 88% and 6% to 40% respectively.
We also demonstrate tight verification of a quadratic stability specification for the decoder of a variational autoencoder.
arXiv Detail & Related papers (2020-10-22T12:32:29Z) - An Image Enhancing Pattern-based Sparsity for Real-time Inference on
Mobile Devices [58.62801151916888]
We introduce a new sparsity dimension, namely pattern-based sparsity that comprises pattern and connectivity sparsity, and becoming both highly accurate and hardware friendly.
Our approach on the new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms.
arXiv Detail & Related papers (2020-01-20T16:17:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.