Multi-Stage Retrieval for Operational Technology Cybersecurity Compliance Using Large Language Models: A Railway Case Study
- URL: http://arxiv.org/abs/2504.14044v1
- Date: Fri, 18 Apr 2025 19:24:17 GMT
- Title: Multi-Stage Retrieval for Operational Technology Cybersecurity Compliance Using Large Language Models: A Railway Case Study
- Authors: Regan Bolton, Mohammadreza Sheikhfathollahi, Simon Parkinson, Dan Basher, Howard Parkinson
- Abstract summary: This paper proposes a novel system that leverages Large Language Models (LLMs) and multi-stage retrieval to enhance the compliance verification process. We first evaluate a Baseline Compliance Architecture (BCA) for answering OTCS compliance queries, then develop an extended approach called Parallel Compliance Architecture (PCA). We demonstrate that the PCA significantly improves both correctness and reasoning quality in compliance verification.
- Score: 1.1010026679581653
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Operational Technology Cybersecurity (OTCS) continues to be a dominant challenge for critical infrastructure such as railways. As these systems become increasingly vulnerable to malicious attacks due to digitalization, effective documentation and compliance processes are essential to protect these safety-critical systems. This paper proposes a novel system that leverages Large Language Models (LLMs) and multi-stage retrieval to enhance the compliance verification process against standards like IEC 62443 and the rail-specific IEC 63452. We first evaluate a Baseline Compliance Architecture (BCA) for answering OTCS compliance queries, then develop an extended approach called Parallel Compliance Architecture (PCA) that incorporates additional context from regulatory standards. Through empirical evaluation comparing OpenAI-gpt-4o and Claude-3.5-haiku models in these architectures, we demonstrate that the PCA significantly improves both correctness and reasoning quality in compliance verification. Our research establishes metrics for response correctness, logical reasoning, and hallucination detection, highlighting the strengths and limitations of using LLMs for compliance verification in railway cybersecurity. The results suggest that retrieval-augmented approaches can significantly improve the efficiency and accuracy of compliance assessments, particularly valuable in an industry facing a shortage of cybersecurity expertise.
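The paper's implementation is not reproduced here, but the BCA/PCA distinction can be sketched in a few lines of Python. In this minimal sketch, TF-IDF similarity stands in for the retrieval stage and `ask_llm` is a hypothetical stub for an OpenAI/Anthropic chat-completion call; the documentation and clause snippets are invented:

```python
# Minimal sketch of the baseline vs. parallel retrieval idea, not the
# paper's implementation. TF-IDF stands in for the retrieval stage and
# ask_llm() is a hypothetical stub for a chat-completion call.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

SYSTEM_DOCS = [
    "The signalling network is segmented into zones with managed switches.",
    "Remote maintenance access requires VPN and two-factor authentication.",
]
STANDARD_CLAUSES = [
    "Clause: the system shall partition assets into zones and conduits.",
    "Clause: remote access shall use multi-factor authentication.",
]

def top_k(query, corpus, k=1):
    vec = TfidfVectorizer().fit(corpus + [query])
    sims = cosine_similarity(vec.transform([query]), vec.transform(corpus))[0]
    return [corpus[i] for i in sims.argsort()[::-1][:k]]

def ask_llm(prompt):
    return f"[model answer based on {prompt.count('Clause')} retrieved clauses]"

def bca(query):
    # Baseline Compliance Architecture: retrieve system documentation only.
    ctx = "\n".join(top_k(query, SYSTEM_DOCS))
    return ask_llm(f"Context:\n{ctx}\n\nQuestion: {query}")

def pca(query):
    # Parallel Compliance Architecture: also retrieve standard clauses.
    ctx = "\n".join(top_k(query, SYSTEM_DOCS) + top_k(query, STANDARD_CLAUSES))
    return ask_llm(f"Context:\n{ctx}\n\nQuestion: {query}")

print(bca("Is remote access to the interlocking adequately protected?"))
print(pca("Is remote access to the interlocking adequately protected?"))
```

The only structural difference is that the PCA retrieves from the regulatory standard in parallel and merges both contexts into the prompt.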
Related papers
- In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI [93.33036653316591]
We call for three interventions to advance system safety. First, we propose using standardized AI flaw reports and rules of engagement for researchers. Second, we propose that GPAI system providers adopt broadly-scoped flaw disclosure programs. Third, we advocate for the development of improved infrastructure to coordinate distribution of flaw reports.
arXiv Detail & Related papers (2025-03-21T05:09:46Z)
- Performance Analysis and Industry Deployment of Post-Quantum Cryptography Algorithms [0.8602553195689513]
The National Institute of Standards and Technology (NIST) has selected CRYSTALS-Kyber and CRYSTALS-Dilithium as standardized PQC algorithms for secure key exchange and digital signatures. This study conducts a comprehensive performance analysis of these algorithms by benchmarking execution times across cryptographic operations. Our findings demonstrate that Kyber and Dilithium achieve efficient execution times, outperforming classical cryptographic schemes such as RSA and ECDSA at equivalent security levels.
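As a rough illustration of such a benchmark, the sketch below times Kyber and Dilithium operations, assuming the open-source liboqs Python bindings (`oqs`) are installed; algorithm identifiers vary across liboqs versions (newer releases use the ML-KEM/ML-DSA names), and this is not the authors' harness:

```python
# Rough timing sketch for Kyber KEM and Dilithium signature operations.
# Assumes the liboqs-python bindings ("oqs") are installed; algorithm
# names differ between liboqs versions (e.g. "ML-KEM-768" in newer ones).
import time
import oqs

def bench(fn, n=100):
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return (time.perf_counter() - start) / n * 1e3  # ms per operation

with oqs.KeyEncapsulation("Kyber768") as kem:
    public_key = kem.generate_keypair()
    ciphertext, _ = kem.encap_secret(public_key)
    print(f"encap : {bench(lambda: kem.encap_secret(public_key)):.3f} ms")
    print(f"decap : {bench(lambda: kem.decap_secret(ciphertext)):.3f} ms")

with oqs.Signature("Dilithium3") as signer:
    pk = signer.generate_keypair()
    msg = b"railway telemetry frame"
    signature = signer.sign(msg)
    print(f"sign  : {bench(lambda: signer.sign(msg)):.3f} ms")
    print(f"verify: {bench(lambda: signer.verify(msg, signature, pk)):.3f} ms")
```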
arXiv Detail & Related papers (2025-03-17T09:06:03Z)
- Enforcing Cybersecurity Constraints for LLM-driven Robot Agents for Online Transactions [0.0]
The integration of Large Language Models (LLMs) into autonomous robotic agents for conducting online transactions poses significant cybersecurity challenges.
This study aims to enforce robust cybersecurity constraints to mitigate the risks associated with data breaches, transaction fraud, and system manipulation.
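One common way to enforce such constraints is to validate every LLM-proposed action against explicit policy rules before execution; the field names and limits in this sketch are illustrative assumptions, not the paper's specification:

```python
# Illustrative guardrail: validate an LLM-proposed transaction against
# explicit policy rules before it is allowed to execute. The Transaction
# shape and the limits are assumptions, not the paper's specification.
from dataclasses import dataclass

APPROVED_MERCHANTS = {"acme-parts", "rail-supplies"}
MAX_AMOUNT = 500.00  # per-transaction ceiling

@dataclass
class Transaction:
    merchant: str
    amount: float
    currency: str

def enforce(tx: Transaction) -> None:
    if tx.merchant not in APPROVED_MERCHANTS:
        raise PermissionError(f"merchant not allow-listed: {tx.merchant}")
    if tx.currency != "USD":
        raise PermissionError(f"unsupported currency: {tx.currency}")
    if tx.amount > MAX_AMOUNT:
        raise PermissionError(f"amount {tx.amount} exceeds limit {MAX_AMOUNT}")

proposed = Transaction("acme-parts", 120.0, "USD")  # e.g. parsed from LLM output
enforce(proposed)  # raises before any side effect if a rule is violated
print("transaction permitted")
```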
arXiv Detail & Related papers (2025-03-17T01:01:10Z)
- Beyond Benchmarks: On The False Promise of AI Regulation [13.125853211532196]
We show that effective scientific regulation requires a causal theory linking observable test outcomes to future performance. We show that deep learning models, which learn complex statistical patterns from training data without explicit causal mechanisms, preclude such guarantees.
arXiv Detail & Related papers (2025-01-26T22:43:07Z)
- Adaptive PII Mitigation Framework for Large Language Models [2.694044579874688]
This paper introduces an adaptive system for mitigating the risks associated with Personally Identifiable Information (PII) and Sensitive Personal Information (SPI). The system uses advanced NLP techniques, context-aware analysis, and policy-driven masking to ensure regulatory compliance. Benchmarks highlight the system's effectiveness, with an F1 score of 0.95 for Passport Numbers.
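As a toy illustration of policy-driven masking, the sketch below applies per-entity masking policies, with regexes standing in for the paper's context-aware NLP pipeline; the patterns and policy map are assumptions:

```python
# Toy policy-driven PII masking. Regexes stand in for the paper's
# context-aware NLP pipeline; patterns and policies are illustrative.
import re

POLICIES = {
    "email":    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "redact"),
    "passport": (re.compile(r"\b[A-Z]{1,2}\d{6,9}\b"), "partial"),
}

def mask(text: str) -> str:
    for name, (pattern, policy) in POLICIES.items():
        if policy == "redact":
            text = pattern.sub(f"[{name.upper()}]", text)
        elif policy == "partial":  # keep last two characters for audit trails
            text = pattern.sub(
                lambda m: "*" * (len(m.group()) - 2) + m.group()[-2:], text)
    return text

print(mask("Contact j.doe@rail.example, passport AB1234567."))
# -> Contact [EMAIL], passport *******67.
```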
arXiv Detail & Related papers (2025-01-21T19:22:45Z)
- The Dual-use Dilemma in LLMs: Do Empowering Ethical Capacities Make a Degraded Utility? [54.18519360412294]
Large Language Models (LLMs) must balance rejecting harmful requests for safety against accommodating legitimate ones for utility.
This paper presents a Direct Preference Optimization (DPO) based alignment framework that achieves better overall performance.
We analyze experimental results obtained from testing DeepSeek-R1 on our benchmark and reveal the critical ethical concerns raised by this highly acclaimed model.
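For reference, the standard DPO objective such frameworks build on fits in a few lines; this is the generic loss, not the paper's specific alignment setup:

```python
# Standard DPO loss on one (chosen, rejected) pair of responses.
# The log-probs are summed token log-likelihoods under the policy being
# trained and a frozen reference model; beta is the KL-like temperature.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # -log sigmoid(margin): push the chosen response above the rejected one.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy usage with fabricated log-probabilities.
loss = dpo_loss(torch.tensor([-12.3]), torch.tensor([-15.1]),
                torch.tensor([-12.9]), torch.tensor([-14.8]))
print(loss.item())
```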
arXiv Detail & Related papers (2025-01-20T06:35:01Z)
- Securing Legacy Communication Networks via Authenticated Cyclic Redundancy Integrity Check [98.34702864029796]
We propose the Authenticated Cyclic Redundancy Integrity Check (ACRIC).
ACRIC preserves backward compatibility without requiring additional hardware and is protocol agnostic.
We show that ACRIC offers robust security with minimal transmission overhead (1 ms).
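ACRIC's actual construction is specified in the paper; purely as a conceptual stand-in, the sketch below replaces a legacy 4-byte CRC with a truncated keyed tag of the same width (HMAC-SHA-256 here is an assumption, not ACRIC's mechanism):

```python
# Conceptual stand-in for an authenticated integrity check that fits in
# a legacy 4-byte CRC field. This is NOT ACRIC's actual construction,
# just an illustration of replacing a plain CRC with a keyed tag.
import hashlib
import hmac
import zlib

KEY = b"shared-secret-provisioned-offline"

def plain_crc(frame: bytes) -> bytes:
    return zlib.crc32(frame).to_bytes(4, "big")  # legacy, forgeable

def keyed_tag(frame: bytes) -> bytes:
    # Truncated HMAC occupies the same 4 bytes as the legacy CRC field.
    return hmac.new(KEY, frame, hashlib.sha256).digest()[:4]

frame = b"CMD:raise-barrier"
print("legacy CRC:", plain_crc(frame).hex())   # anyone can recompute this
print("keyed tag :", keyed_tag(frame).hex())   # requires the shared key

tagged = frame + keyed_tag(frame)
received, tag = tagged[:-4], tagged[-4:]
assert hmac.compare_digest(tag, keyed_tag(received))  # authenticated check
print("frame accepted:", received)
```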
arXiv Detail & Related papers (2024-11-21T18:26:05Z)
- A Novel Generative AI-Based Framework for Anomaly Detection in Multicast Messages in Smart Grid Communications [0.0]
Cybersecurity breaches in digital substations pose significant challenges to the stability and reliability of power system operations.
This paper proposes a task-oriented dialogue system for anomaly detection (AD) in datasets of multicast messages.
It offers lower potential for error and better scalability and adaptability than processes that rely on human-recommended cybersecurity guidelines.
arXiv Detail & Related papers (2024-06-08T13:28:50Z)
- Online Safety Property Collection and Refinement for Safe Deep Reinforcement Learning in Mapless Navigation [79.89605349842569]
We introduce the Collection and Refinement of Online Properties (CROP) framework to design properties at training time.
CROP employs a cost signal to identify unsafe interactions and uses them to shape safety properties.
We evaluate our approach in several robotic mapless navigation tasks and demonstrate that the violation metric computed with CROP allows higher returns and lower violations over previous Safe DRL approaches.
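The collect-and-refine loop can be caricatured in a few lines; the environment, cost signal, and discretization below are invented for illustration and stand in for CROP's actual property formalism:

```python
# Toy version of collecting unsafe interactions via a cost signal and
# refining a set of "avoid" properties from them. Interfaces invented.
import random

random.seed(0)
unsafe_states = set()

def cost(state, action):
    # Hypothetical cost signal: 1 when the agent gets too close to an obstacle.
    return 1 if abs(state + action) < 0.2 else 0

for episode in range(1000):
    state = random.uniform(-1, 1)
    action = random.uniform(-0.5, 0.5)
    if cost(state, action):
        # Record the violating region at a coarse resolution.
        unsafe_states.add(round(state, 1))

# The collected set induces simple safety properties: "avoid these states".
properties = [f"avoid state near {s}" for s in sorted(unsafe_states)]
print(len(properties), "refined properties, e.g.", properties[:3])
```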
arXiv Detail & Related papers (2023-02-13T21:19:36Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
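The projection stage of such a method can be illustrated by gradient steps that push a proposed action back toward the estimated-safe set; the quadratic cost critic below is a stand-in for USL's learned state-action cost model:

```python
# Toy safety projection: nudge a proposed action until an (assumed)
# cost critic predicts it is below a safety threshold. The quadratic
# critic stands in for a learned state-action cost model.
import torch

def cost_critic(action):
    return (action ** 2).sum()  # stand-in for Q_cost(s, a)

def project(action, threshold=0.5, lr=0.1, steps=50):
    a = action.clone().requires_grad_(True)
    for _ in range(steps):
        c = cost_critic(a)
        if c <= threshold:
            break
        (grad,) = torch.autograd.grad(c, a)
        with torch.no_grad():
            a -= lr * grad  # descend the predicted cost
    return a.detach()

proposed = torch.tensor([1.2, -0.9])
safe = project(proposed)
print(safe, cost_critic(safe).item())  # predicted cost now at most ~0.5
```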
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
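A pointwise feasibility check of a CBF condition under a box input constraint can be written directly; the dynamics, barrier function, and uncertainty margin below are illustrative assumptions, not the paper's formulation:

```python
# Pointwise feasibility of a control barrier function condition under a
# box input constraint |u_i| <= u_max, with a margin standing in for
# model uncertainty. Dynamics and barrier are illustrative assumptions.
import numpy as np

u_max = 1.0
alpha = lambda h: 2.0 * h  # class-K function
margin = 0.3               # robustness term for model error

def feasible(x, f, g, h, grad_h):
    lf_h = grad_h(x) @ f(x)                         # drift term  L_f h
    lg_h = grad_h(x) @ g(x)                         # input term  L_g h
    best_input_effect = np.abs(lg_h).sum() * u_max  # sup over the input box
    return lf_h + best_input_effect + alpha(h(x)) - margin >= 0

# Toy single-integrator example: keep ||x|| inside a radius-2 ball.
f = lambda x: np.zeros(2)
g = lambda x: np.eye(2)
h = lambda x: 4.0 - x @ x
grad_h = lambda x: -2.0 * x

print(feasible(np.array([1.0, 1.0]), f, g, h, grad_h))  # True: some input keeps h >= 0
```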
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
- Evaluating the Safety of Deep Reinforcement Learning Models using Semi-Formal Verification [81.32981236437395]
We present a semi-formal verification approach for decision-making tasks based on interval analysis.
Our method obtains results on standard benchmarks comparable to those of formal verifiers.
Our approach enables efficient evaluation of safety properties for decision-making models in practical applications.
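The core interval-analysis step can be illustrated by propagating an input box through one affine-plus-ReLU layer with sign-split weights; the weights and bounds below are toy values:

```python
# Toy interval propagation through one affine+ReLU layer: given an input
# box [lo, hi], compute sound output bounds. Weights are illustrative.
import numpy as np

def affine_bounds(W, b, lo, hi):
    # Split weights by sign so each output bound uses the right endpoint.
    W_pos, W_neg = np.clip(W, 0, None), np.clip(W, None, 0)
    out_lo = W_pos @ lo + W_neg @ hi + b
    out_hi = W_pos @ hi + W_neg @ lo + b
    return out_lo, out_hi

W = np.array([[1.0, -2.0], [0.5, 0.5]])
b = np.array([0.1, -0.1])
lo, hi = np.array([-0.1, -0.1]), np.array([0.1, 0.1])

z_lo, z_hi = affine_bounds(W, b, lo, hi)
a_lo, a_hi = np.maximum(z_lo, 0), np.maximum(z_hi, 0)  # ReLU is monotone
print("output bounds:", a_lo, a_hi)
# If the upper bound of an "unsafe" action stays below the lower bound of
# another action, the unsafe action is provably never selected on this box.
```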
arXiv Detail & Related papers (2020-10-19T11:18:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.