Related papers: Revolutionizing Cyber Threat Detection with Large Language Models: A privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices

Revolutionizing Cyber Threat Detection with Large Language Models: A privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices

URL: http://arxiv.org/abs/2306.14263v2
Date: Thu, 8 Feb 2024 07:06:39 GMT
Title: Revolutionizing Cyber Threat Detection with Large Language Models: A privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices
Authors: Mohamed Amine Ferrag, Mthandazo Ndhlovu, Norbert Tihanyi, Lucas C. Cordeiro, Merouane Debbah, Thierry Lestable, Narinderjit Singh Thandi
Abstract summary: This paper presents SecurityBERT, a novel architecture that leverages the Bidirectional Representations from Transformers (BERT) model for cyber threat detection in IoT networks. Our research demonstrates that SecurityBERT outperforms traditional Machine Learning (ML) and Deep Learning (DL) methods, such as Convolutional Neural Networks (CNNIoTs) or Recurrent Neural Networks (IoTRNNs) in cyber threat detection. SecurityBERT achieved an impressive 98.2% overall accuracy in identifying fourteen distinct attack types, surpassing previous records set by hybrid solutions.
Score: 3.340416780217405
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The field of Natural Language Processing (NLP) is currently undergoing a revolutionary transformation driven by the power of pre-trained Large Language Models (LLMs) based on groundbreaking Transformer architectures. As the frequency and diversity of cybersecurity attacks continue to rise, the importance of incident detection has significantly increased. IoT devices are expanding rapidly, resulting in a growing need for efficient techniques to autonomously identify network-based attacks in IoT networks with both high precision and minimal computational requirements. This paper presents SecurityBERT, a novel architecture that leverages the Bidirectional Encoder Representations from Transformers (BERT) model for cyber threat detection in IoT networks. During the training of SecurityBERT, we incorporated a novel privacy-preserving encoding technique called Privacy-Preserving Fixed-Length Encoding (PPFLE). We effectively represented network traffic data in a structured format by combining PPFLE with the Byte-level Byte-Pair Encoder (BBPE) Tokenizer. Our research demonstrates that SecurityBERT outperforms traditional Machine Learning (ML) and Deep Learning (DL) methods, such as Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs), in cyber threat detection. Employing the Edge-IIoTset cybersecurity dataset, our experimental analysis shows that SecurityBERT achieved an impressive 98.2% overall accuracy in identifying fourteen distinct attack types, surpassing previous records set by hybrid solutions such as GAN-Transformer-based architectures and CNN-LSTM models. With an inference time of less than 0.15 seconds on an average CPU and a compact model size of just 16.7MB, SecurityBERT is ideally suited for real-life traffic analysis and a suitable choice for deployment on resource-constrained IoT devices.

Related papers

MindFlow: A Network Traffic Anomaly Detection Model Based on MindSpore [7.564738687560689]
This study proposes MindFlow, a multi-dimensional dynamic traffic prediction and anomaly detection system. The proposed model achieves 99% in key metrics such as accuracy, precision, recall and F1 score.
arXiv Detail & Related papers (2025-04-24T15:48:02Z)
Edge AI-based Radio Frequency Fingerprinting for IoT Networks [0.0]
cryptography can often be resource-intensive for small-footprint resource-constrained (i.e., IoT) devices. Radio Frequency Fingerprinting (RFF) offers a promising authentication alternative without resorting to cryptographic solutions. We introduce two truly lightweight Edge AI-based RFF schemes tailored for resource-constrained devices.
arXiv Detail & Related papers (2024-12-13T20:55:10Z)
HDI-Former: Hybrid Dynamic Interaction ANN-SNN Transformer for Object Detection Using Frames and Events [44.20745133222306]
HDI-Former is a Hybrid Dynamic Interaction ANN-SNN Transformer for high-accuracy and energy-efficient object detection. We first present a novel semantic-enhanced self-attention mechanism that strengthens the correlation between image encoding tokens within the ANN Transformer branch. We then design a Spiking Swin Transformer branch to model temporal cues from event streams with low power consumption.
arXiv Detail & Related papers (2024-11-27T09:32:41Z)
CTINEXUS: Leveraging Optimized LLM In-Context Learning for Constructing Cybersecurity Knowledge Graphs Under Data Scarcity [49.657358248788945]
Textual descriptions in cyber threat intelligence (CTI) reports are rich sources of knowledge about cyber threats. Current CTI extraction methods lack flexibility and generalizability, often resulting in inaccurate and incomplete knowledge extraction. We propose CTINexus, a novel framework leveraging optimized in-context learning (ICL) of large language models.
arXiv Detail & Related papers (2024-10-28T14:18:32Z)
Beyond Detection: Leveraging Large Language Models for Cyber Attack Prediction in IoT Networks [4.836070911511429]
This paper proposes a novel network intrusion prediction framework that combines Large Language Models (LLMs) with Long Short Term Memory (LSTM) networks. Our framework, evaluated on the CICIoT2023 IoT attack dataset, demonstrates a significant improvement in predictive capabilities, achieving an overall accuracy of 98%.
arXiv Detail & Related papers (2024-08-26T06:57:22Z)
Efficient Intrusion Detection: Combining $χ^2$ Feature Selection with CNN-BiLSTM on the UNSW-NB15 Dataset [2.239394800147746]
Intrusion Detection Systems (IDSs) have played a significant role in the detection and prevention of cyber-attacks in traditional computing systems. The limited computational resources available on Internet of Things (IoT) devices pose a challenge for deploying conventional computing-based IDSs. We present an effective IDS model that addresses this issue by combining a lightweight Convolutional Neural Network (CNN) with bidirectional Long Short-Term Memory (BiLSTM)
arXiv Detail & Related papers (2024-07-20T17:41:16Z)
A Cutting-Edge Deep Learning Method For Enhancing IoT Security [0.0]
This paper proposes an innovative design of the Internet of Things (IoT) Environment Intrusion Detection System (or IDS) using Deep Learning-integrated Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks. Our model, based on the CICIDS 2017 dataset, achieved an accuracy of 99.52% in classifying network traffic as either benign or malicious.
arXiv Detail & Related papers (2024-06-18T08:42:51Z)
Lightweight CNN-BiLSTM based Intrusion Detection Systems for Resource-Constrained IoT Devices [38.16309790239142]
Intrusion Detection Systems (IDSs) have played a significant role in detecting and preventing cyber-attacks within traditional computing systems. The limited computational resources available on Internet of Things (IoT) devices make it challenging to deploy conventional computing-based IDSs. We propose a hybrid CNN architecture composed of a lightweight CNN and bidirectional LSTM (BiLSTM) to enhance the performance of IDS on the UNSW-NB15 dataset.
arXiv Detail & Related papers (2024-06-04T20:36:21Z)
TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture. To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer. In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
TinyAD: Memory-efficient anomaly detection for time series data in Industrial IoT [43.207210990362825]
We propose a novel framework named Tiny Anomaly Detection (TinyAD) to efficiently facilitate onboard inference of CNNs for real-time anomaly detection. To reduce the peak memory consumption of CNNs, we explore two complementary strategies, in-place, and patch-by-patch memory rescheduling. Our framework can reduce peak memory consumption by 2-5x with negligible overhead.
arXiv Detail & Related papers (2023-03-07T02:56:15Z)
Semantic Communication Enabling Robust Edge Intelligence for Time-Critical IoT Applications [87.05763097471487]
This paper aims to design robust Edge Intelligence using semantic communication for time-critical IoT applications. We analyze the effect of image DCT coefficients on inference accuracy and propose the channel-agnostic effectiveness encoding for offloading.
arXiv Detail & Related papers (2022-11-24T20:13:17Z)
Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC) We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer. Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
Semi-supervised Variational Temporal Convolutional Network for IoT Communication Multi-anomaly Detection [3.3659034873495632]
Internet of Things (IoT) devices are constructed to build a huge communications network. These devices are insecure in reality, it means that the communications network are exposed by the attacker. In this paper, we propose SS-VTCN, a semi-supervised network for IoT multiple anomaly detection.
arXiv Detail & Related papers (2021-04-05T08:51:24Z)
Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming [97.40955121478716]
We propose a first-order dual SDP algorithm that requires memory only linear in the total number of network activations. We significantly improve L-inf verified robust accuracy from 1% to 88% and 6% to 40% respectively. We also demonstrate tight verification of a quadratic stability specification for the decoder of a variational autoencoder.
arXiv Detail & Related papers (2020-10-22T12:32:29Z)
An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices [58.62801151916888]
We introduce a new sparsity dimension, namely pattern-based sparsity that comprises pattern and connectivity sparsity, and becoming both highly accurate and hardware friendly. Our approach on the new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms.
arXiv Detail & Related papers (2020-01-20T16:17:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.