Enhancing Cybersecurity in Critical Infrastructure with LLM-Assisted Explainable IoT Systems
- URL: http://arxiv.org/abs/2503.03180v1
- Date: Wed, 05 Mar 2025 04:53:07 GMT
- Title: Enhancing Cybersecurity in Critical Infrastructure with LLM-Assisted Explainable IoT Systems
- Authors: Ashutosh Ghimire, Ghazal Ghajari, Karma Gurung, Love K. Sah, Fathi Amsaad,
- Abstract summary: This paper presents a hybrid framework that combines numerical anomaly detection using Autoencoders with Large Language Models (LLMs) for enhanced preprocessing and interpretability.<n> Experimental results on the KDDCup99 10% corrected dataset demonstrate that the LLM-assisted preprocessing pipeline significantly improves anomaly detection performance.
- Score: 0.22369578015657962
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensuring the security of critical infrastructure has become increasingly vital with the proliferation of Internet of Things (IoT) systems. However, the heterogeneous nature of IoT data and the lack of human-comprehensible insights from anomaly detection models remain significant challenges. This paper presents a hybrid framework that combines numerical anomaly detection using Autoencoders with Large Language Models (LLMs) for enhanced preprocessing and interpretability. Two preprocessing approaches are implemented: a traditional method utilizing Principal Component Analysis (PCA) to reduce dimensionality and an LLM-assisted method where GPT-4 dynamically recommends feature selection, transformation, and encoding strategies. Experimental results on the KDDCup99 10% corrected dataset demonstrate that the LLM-assisted preprocessing pipeline significantly improves anomaly detection performance. The macro-average F1 score increased from 0.49 in the traditional PCA-based approach to 0.98 with LLM-driven insights. Additionally, the LLM generates natural language explanations for detected anomalies, providing contextual insights into their causes and implications. This framework highlights the synergy between numerical AI models and LLMs, delivering an accurate, interpretable, and efficient solution for IoT cybersecurity in critical infrastructure.
Related papers
- Phishing Detection in the Gen-AI Era: Quantized LLMs vs Classical Models [1.4999444543328293]
Phishing attacks are becoming increasingly sophisticated, underscoring the need for detection systems that strike a balance between high accuracy and computational efficiency.<n>This paper presents a comparative evaluation of traditional Machine Learning (ML), Deep Learning (DL), and quantized small- parameter Large Language Models (LLMs) for phishing detection.<n>We show that while LLMs currently underperform compared to ML and DL methods in terms of raw accuracy, they exhibit strong potential for identifying subtle, context-based phishing cues.
arXiv Detail & Related papers (2025-07-10T04:01:52Z) - Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders [50.52694757593443]
Existing SAE training algorithms often lack rigorous mathematical guarantees and suffer from practical limitations.<n>We first propose a novel statistical framework for the feature recovery problem, which includes a new notion of feature identifiability.<n>We introduce a new SAE training algorithm based on bias adaptation'', a technique that adaptively adjusts neural network bias parameters to ensure appropriate activation sparsity.
arXiv Detail & Related papers (2025-06-16T20:58:05Z) - A Multi-Step Comparative Framework for Anomaly Detection in IoT Data Streams [0.9208007322096533]
Internet of Things (IoT) devices have introduced critical security challenges, underscoring the need for accurate anomaly detection.<n>This paper presents a multi-step evaluation framework assessing the combined impact of preprocessing choices on three machine learning algorithms.<n> Experiments on the IoTID20 dataset shows that GBoosting consistently delivers superior accuracy across preprocessing configurations.
arXiv Detail & Related papers (2025-05-22T16:28:22Z) - SHA256 at SemEval-2025 Task 4: Selective Amnesia -- Constrained Unlearning for Large Language Models via Knowledge Isolation [12.838593066237452]
Large language models (LLMs) memorize frequently sensitive information during training, posing risks when deploying publicly accessible models.
This paper presents our solution to SemEval-2025 Task 4 on targeted unlearning, which combines causal mediation analysis with layer-specific optimization.
arXiv Detail & Related papers (2025-04-17T15:05:40Z) - Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models [55.46269953415811]
We identify ToM-sensitive parameters and show that perturbing as little as 0.001% of these parameters significantly degrades ToM performance.
Our results have implications for enhancing model alignment, mitigating biases, and improving AI systems designed for human interaction.
arXiv Detail & Related papers (2025-04-05T17:45:42Z) - APT-LLM: Embedding-Based Anomaly Detection of Cyber Advanced Persistent Threats Using Large Language Models [4.956245032674048]
APTs pose a major cybersecurity challenge due to their stealth and ability to mimic normal system behavior.<n>This paper introduces APT-LLM, a novel embedding-based anomaly detection framework.<n>It integrates large language models (LLMs) with autoencoder architectures to detect APTs.
arXiv Detail & Related papers (2025-02-13T15:01:18Z) - INVARLLM: LLM-assisted Physical Invariant Extraction for Cyber-Physical Systems Anomaly Detection [13.192308838452927]
Cyber-Physical Systems (CPS) are vulnerable to cyber-physical attacks that violate physical laws.<n>We propose a hybrid framework that uses large language models (LLMs) to extract semantic information from CPS documentation and generate physical invariants.<n>This approach combines LLM semantic understanding with empirical validation to ensure both interpretability and reliability.
arXiv Detail & Related papers (2024-11-17T00:09:04Z) - Intent Detection in the Age of LLMs [3.755082744150185]
Intent detection is a critical component of task-oriented dialogue systems (TODS)
Traditional approaches relied on computationally efficient supervised sentence transformer encoder models.
The emergence of generative large language models (LLMs) with intrinsic world knowledge presents new opportunities to address these challenges.
arXiv Detail & Related papers (2024-10-02T15:01:55Z) - R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models [83.77114091471822]
Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML)
A challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming.
This is particularly pronounced for word embedding parameters in large language models (LLMs), which are crucial for language understanding.
A physical layer framework is developed for resilient SFL with LLMs (R-SFLLM) over wireless networks.
arXiv Detail & Related papers (2024-07-16T12:21:29Z) - Security Vulnerability Detection with Multitask Self-Instructed Fine-Tuning of Large Language Models [8.167614500821223]
We introduce MSIVD, multitask self-instructed fine-tuning for vulnerability detection, inspired by chain-of-thought prompting and LLM self-instruction.
Our experiments demonstrate that MSIVD achieves superior performance, outperforming the highest LLM-based vulnerability detector baseline (LineVul) with a F1 score of 0.92 on the BigVul dataset, and 0.48 on the PreciseBugs dataset.
arXiv Detail & Related papers (2024-06-09T19:18:05Z) - Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks.
We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level.
We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z) - X-CBA: Explainability Aided CatBoosted Anomal-E for Intrusion Detection System [2.556190321164248]
Using machine learning (ML) and deep learning (DL) models in Intrusion Detection Systems has led to a trust deficit due to their non-transparent decision-making.
This paper introduces a novel Explainable IDS approach, called X-CBA, that leverages the structural advantages of Graph Neural Networks (GNNs) to effectively process network traffic data.
Our approach achieves high accuracy with 99.47% in threat detection and provides clear, actionable explanations of its analytical outcomes.
arXiv Detail & Related papers (2024-02-01T18:29:16Z) - End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z) - Reconfigurable Intelligent Surface Assisted Mobile Edge Computing with
Heterogeneous Learning Tasks [53.1636151439562]
Mobile edge computing (MEC) provides a natural platform for AI applications.
We present an infrastructure to perform machine learning tasks at an MEC with the assistance of a reconfigurable intelligent surface (RIS)
Specifically, we minimize the learning error of all participating users by jointly optimizing transmit power of mobile users, beamforming vectors of the base station, and the phase-shift matrix of the RIS.
arXiv Detail & Related papers (2020-12-25T07:08:50Z) - Optimization-driven Machine Learning for Intelligent Reflecting Surfaces
Assisted Wireless Networks [82.33619654835348]
Intelligent surface (IRS) has been employed to reshape the wireless channels by controlling individual scattering elements' phase shifts.
Due to the large size of scattering elements, the passive beamforming is typically challenged by the high computational complexity.
In this article, we focus on machine learning (ML) approaches for performance in IRS-assisted wireless networks.
arXiv Detail & Related papers (2020-08-29T08:39:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.