Unknown Attack Detection in IoT Networks using Large Language Models: A Robust, Data-efficient Approach
- URL: http://arxiv.org/abs/2602.12183v1
- Date: Thu, 12 Feb 2026 17:15:39 GMT
- Title: Unknown Attack Detection in IoT Networks using Large Language Models: A Robust, Data-efficient Approach
- Authors: Shan Ali, Feifei Niu, Paria Shirani, Lionel C. Briand,
- Abstract summary: Existing machine learning approaches rely on large labeled datasets, payload inspection, or closed-set classification.<n>We propose SiamXBERT, a robust and data-efficient Siamese meta-learning framework empowered by a transformer-based language model for unknown attack detection.<n>We show that SiamXBERT consistently outperforms state-of-the-art baselines under both within-dataset and cross-dataset settings.
- Score: 5.0363184281919215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid evolution of cyberattacks continues to drive the emergence of unknown (zero-day) threats, posing significant challenges for network intrusion detection systems in Internet of Things (IoT) networks. Existing machine learning and deep learning approaches typically rely on large labeled datasets, payload inspection, or closed-set classification, limiting their effectiveness under data scarcity, encrypted traffic, and distribution shifts. Consequently, detecting unknown attacks in realistic IoT deployments remains difficult. To address these limitations, we propose SiamXBERT, a robust and data-efficient Siamese meta-learning framework empowered by a transformer-based language model for unknown attack detection. The proposed approach constructs a dual-modality feature representation by integrating flow-level and packet-level information, enabling richer behavioral modeling while remaining compatible with encrypted traffic. Through meta-learning, the model rapidly adapts to new attack types using only a small number of labeled samples and generalizes to previously unseen behaviors. Extensive experiments on representative IoT intrusion datasets demonstrate that SiamXBERT consistently outperforms state-of-the-art baselines under both within-dataset and cross-dataset settings while requiring significantly less training data, achieving up to \num{78.8}\% improvement in unknown F1-score. These results highlight the practicality of SiamXBERT for robust unknown attack detection in real-world IoT environments.
Related papers
- Multi-Agent Collaborative Intrusion Detection for Low-Altitude Economy IoT: An LLM-Enhanced Agentic AI Framework [60.72591149679355]
The rapid expansion of low-altitude economy Internet of Things (LAE-IoT) networks has created unprecedented security challenges.<n>Traditional intrusion detection systems fail to tackle the unique characteristics of aerial IoT environments.<n>We introduce a large language model (LLM)-enabled agentic AI framework for enhancing intrusion detection in LAE-IoT networks.
arXiv Detail & Related papers (2026-01-25T12:47:25Z) - Elevating Intrusion Detection and Security Fortification in Intelligent Networks through Cutting-Edge Machine Learning Paradigms [5.706727902661187]
This study proposes a robust multiclass machine learning based intrusion detection framework.<n>It integrates advanced feature selection techniques to identify critical attributes, mitigating redundancy and enhancing detection accuracy.<n>The proposed ensemble architecture achieves superior performance, with an accuracy of 98%, precision of 98%, recall of 98%, and a false positive rate of just 2%.
arXiv Detail & Related papers (2025-12-22T05:14:26Z) - One-Class Intrusion Detection with Dynamic Graphs [46.453758431767724]
Machine learning-based intrusion detection constitutes a promising approach for improving security.<n>We propose a novel intrusion detection method, TGN-SVDD, which builds upon modern dynamic graph modelling and deep anomaly detection.<n>We demonstrate its superiority over several baselines for realistic intrusion detection data and suggest a more challenging variant of the latter.
arXiv Detail & Related papers (2025-08-18T12:36:55Z) - Hybrid LLM-Enhanced Intrusion Detection for Zero-Day Threats in IoT Networks [6.087274577167399]
This paper presents a novel approach to intrusion detection by integrating traditional signature-based methods with the contextual understanding capabilities of the GPT-2 Large Language Model (LLM)<n>We propose a hybrid IDS framework that merges the robustness of signature-based techniques with the adaptability of GPT-2-driven semantic analysis.<n> Experimental evaluations on a representative intrusion dataset demonstrate that our model enhances detection accuracy by 6.3%, reduces false positives by 9.0%, and maintains near real-time responsiveness.
arXiv Detail & Related papers (2025-07-10T04:10:03Z) - Self-Supervised Transformer-based Contrastive Learning for Intrusion Detection Systems [1.1265248232450553]
This paper proposes a self-supervised contrastive learning approach for generalizable intrusion detection on raw packet sequences.<n>Our framework exhibits better performance in comparison to existing NetFlow self-supervised methods.<n>Our model provides a strong baseline for supervised intrusion detection with limited labeled data.
arXiv Detail & Related papers (2025-05-12T13:42:00Z) - Intelligent IoT Attack Detection Design via ODLLM with Feature Ranking-based Knowledge Base [0.964942474860411]
Internet of Things (IoT) devices have introduced significant cybersecurity challenges.<n>Traditional machine learning (ML) techniques often fall short in detecting such attacks due to the complexity of blended and evolving patterns.<n>We propose a novel framework leveraging On-Device Large Language Models (ODLLMs) augmented with fine-tuning and knowledge base (KB) integration for intelligent IoT network attack detection.
arXiv Detail & Related papers (2025-03-27T16:41:57Z) - A Conditional Tabular GAN-Enhanced Intrusion Detection System for Rare Attacks in IoT Networks [1.1970409518725493]
Internet of things (IoT) networks, boosted by 6G technology, are transforming various industries.<n>Their widespread adoption introduces significant security risks, particularly in detecting rare but potentially damaging cyber-attacks.<n>Traditional IDS often struggle with detecting rare attacks due to severe class imbalances in IoT data.
arXiv Detail & Related papers (2025-02-09T21:13:11Z) - Learning in Multiple Spaces: Few-Shot Network Attack Detection with Metric-Fused Prototypical Networks [47.18575262588692]
We propose a novel Multi-Space Prototypical Learning framework tailored for few-shot attack detection.<n>By leveraging Polyak-averaged prototype generation, the framework stabilizes the learning process and effectively adapts to rare and zero-day attacks.<n> Experimental results on benchmark datasets demonstrate that MSPL outperforms traditional approaches in detecting low-profile and novel attack types.
arXiv Detail & Related papers (2024-12-28T00:09:46Z) - Enhancing Internet of Things Security throughSelf-Supervised Graph Neural Networks [1.0678175996321808]
New types of attacks often have significantly fewer samples than more common attacks, leading to unbalanced datasets.<n>We suggest a new approach to IoT intrusion detection using Self-Supervised Learning (SSL) with a Markov Graph Convolutional Network (MarkovGCN)<n>Our approach leverages the inherent structure of IoT networks to pre-train a GCN, which is then fine-tuned for the intrusion detection task.
arXiv Detail & Related papers (2024-12-17T17:40:14Z) - Lightweight CNN-BiLSTM based Intrusion Detection Systems for Resource-Constrained IoT Devices [38.16309790239142]
Intrusion Detection Systems (IDSs) have played a significant role in detecting and preventing cyber-attacks within traditional computing systems.
The limited computational resources available on Internet of Things (IoT) devices make it challenging to deploy conventional computing-based IDSs.
We propose a hybrid CNN architecture composed of a lightweight CNN and bidirectional LSTM (BiLSTM) to enhance the performance of IDS on the UNSW-NB15 dataset.
arXiv Detail & Related papers (2024-06-04T20:36:21Z) - Redefining DDoS Attack Detection Using A Dual-Space Prototypical Network-Based Approach [38.38311259444761]
We introduce a new deep learning-based technique for detecting DDoS attacks.
We propose a new dual-space prototypical network that leverages a unique dual-space loss function.
This approach capitalizes on the strengths of representation learning within the latent space.
arXiv Detail & Related papers (2024-06-04T03:22:52Z) - Effective Intrusion Detection in Heterogeneous Internet-of-Things Networks via Ensemble Knowledge Distillation-based Federated Learning [52.6706505729803]
We introduce Federated Learning (FL) to collaboratively train a decentralized shared model of Intrusion Detection Systems (IDS)
FLEKD enables a more flexible aggregation method than conventional model fusion techniques.
Experiment results show that the proposed approach outperforms local training and traditional FL in terms of both speed and performance.
arXiv Detail & Related papers (2024-01-22T14:16:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.