Related papers: CyberRAG: An Agentic RAG cyber attack classification and reporting tool

CyberRAG: An Agentic RAG cyber attack classification and reporting tool

URL: http://arxiv.org/abs/2507.02424v2
Date: Wed, 10 Sep 2025 09:08:15 GMT
Title: CyberRAG: An Agentic RAG cyber attack classification and reporting tool
Authors: Francesco Blefari, Cristian Cosentino, Francesco Aurelio Pironti, Angelo Furfaro, Fabrizio Marozzo,
Abstract summary: CyberRAG is a modular agent-based RAG framework that delivers real-time classification, explanation, and structured reporting for cyber-attacks.<n>Unlike traditional RAG, CyberRAG adopts an agentic design that enables dynamic control flow and adaptive reasoning.
Score: 0.3914676152740142
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Intrusion Detection and Prevention Systems (IDS/IPS) in large enterprises can generate hundreds of thousands of alerts per hour, overwhelming analysts with logs requiring rapidly evolving expertise. Conventional machine-learning detectors reduce alert volume but still yield many false positives, while standard Retrieval-Augmented Generation (RAG) pipelines often retrieve irrelevant context and fail to justify predictions. We present CyberRAG, a modular agent-based RAG framework that delivers real-time classification, explanation, and structured reporting for cyber-attacks. A central LLM agent orchestrates: (i) fine-tuned classifiers specialized by attack family; (ii) tool adapters for enrichment and alerting; and (iii) an iterative retrieval-and-reason loop that queries a domain-specific knowledge base until evidence is relevant and self-consistent. Unlike traditional RAG, CyberRAG adopts an agentic design that enables dynamic control flow and adaptive reasoning. This architecture autonomously refines threat labels and natural-language justifications, reducing false positives and enhancing interpretability. It is also extensible: new attack types can be supported by adding classifiers without retraining the core agent. CyberRAG was evaluated on SQL Injection, XSS, and SSTI, achieving over 94\% accuracy per class and a final classification accuracy of 94.92\% through semantic orchestration. Generated explanations reached 0.94 in BERTScore and 4.9/5 in GPT-4-based expert evaluation, with robustness preserved against adversarial and unseen payloads. These results show that agentic, specialist-oriented RAG can combine high detection accuracy with trustworthy, SOC-ready prose, offering a flexible path toward partially automated cyber-defense workflows.

Related papers

CIBER: A Comprehensive Benchmark for Security Evaluation of Code Interpreter Agents [27.35968236632966]
LLM-based code interpreter agents are increasingly deployed in critical situations.<n>Existing benchmarks fail to capture the security risks arising from dynamic code execution, tool interactions, and multi-turn context.<n>We introduce CIBER, an automated benchmark that combines dynamic attack generation, isolated secure sandboxing, and state-aware evaluation.
arXiv Detail & Related papers (2026-02-23T06:41:41Z)
ReasAlign: Reasoning Enhanced Safety Alignment against Prompt Injection Attack [52.17935054046577]
We present ReasAlign, a model-level solution to improve safety alignment against indirect prompt injection attacks.<n>ReasAlign incorporates structured reasoning steps to analyze user queries, detect conflicting instructions, and preserve the continuity of the user's intended tasks.
arXiv Detail & Related papers (2026-01-15T08:23:38Z)
Reasoning-Style Poisoning of LLM Agents via Stealthy Style Transfer: Process-Level Attacks and Runtime Monitoring in RSV Space [4.699272847316498]
Reasoning-Style Poisoning (RSP) manipulates how agents process information rather than what they process.<n>Generative Style Injection (GSI) rewrites retrieved documents into pathological tones.<n>RSP-M is a lightweight runtime monitor that calculates RSV metrics in real-time and triggers alerts when values exceed safety thresholds.
arXiv Detail & Related papers (2025-12-16T14:34:10Z)
The Trojan Knowledge: Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search [58.8834056209347]
Large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs.<n>We introduce the Correlated Knowledge Attack Agent (CKA-Agent), a dynamic framework that reframes jailbreaking as an adaptive, tree-structured exploration of the target model's knowledge base.
arXiv Detail & Related papers (2025-12-01T07:05:23Z)
VulAgent: Hypothesis-Validation based Multi-Agent Vulnerability Detection [55.957275374847484]
VulAgent is a multi-agent vulnerability detection framework based on hypothesis validation.<n>It implements a semantics-sensitive, multi-view detection pipeline, each aligned to a specific analysis perspective.<n>On average, VulAgent improves overall accuracy by 6.6%, increases the correct identification rate of vulnerable--fixed code pairs by up to 450%, and reduces the false positive rate by about 36%.
arXiv Detail & Related papers (2025-09-15T02:25:38Z)
AI Agentic Vulnerability Injection And Transformation with Optimized Reasoning [2.918225266151982]
We present AVIATOR, the first AI-agentic vulnerability injection workflow.<n>It automatically injects realistic, category-specific vulnerabilities for high-fidelity, diverse, large-scale vulnerability dataset generation.<n>It combines semantic analysis, injection synthesis enhanced with LoRA-based fine-tuning and Retrieval-Augmented Generation, as well as post-injection validation via static analysis and LLM-based discriminators.
arXiv Detail & Related papers (2025-08-28T14:59:39Z)
Towards Unifying Quantitative Security Benchmarking for Multi Agent Systems [0.0]
Evolving AI systems increasingly deploy multi-agent architectures where autonomous agents collaborate, share information, and delegate tasks through developing protocols.<n>One such risk is a cascading risk: a breach in one agent can cascade through the system, compromising others by exploiting inter-agent trust.<n>In an ACI attack, a malicious input or tool exploit injected at one agent leads to cascading compromises and amplified downstream effects across agents that trust its outputs.
arXiv Detail & Related papers (2025-07-23T13:51:28Z)
Hybrid LLM-Enhanced Intrusion Detection for Zero-Day Threats in IoT Networks [6.087274577167399]
This paper presents a novel approach to intrusion detection by integrating traditional signature-based methods with the contextual understanding capabilities of the GPT-2 Large Language Model (LLM)<n>We propose a hybrid IDS framework that merges the robustness of signature-based techniques with the adaptability of GPT-2-driven semantic analysis.<n> Experimental evaluations on a representative intrusion dataset demonstrate that our model enhances detection accuracy by 6.3%, reduces false positives by 9.0%, and maintains near real-time responsiveness.
arXiv Detail & Related papers (2025-07-10T04:10:03Z)
AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents [54.29555239363013]
We propose a generic black-box fuzzing framework, AgentVigil, to automatically discover and exploit indirect prompt injection vulnerabilities.<n>We evaluate AgentVigil on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o.<n>We apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.
arXiv Detail & Related papers (2025-05-09T07:40:17Z)
TrustRAG: Enhancing Robustness and Trustworthiness in Retrieval-Augmented Generation [31.231916859341865]
TrustRAG is a framework that systematically filters malicious and irrelevant content before it is retrieved for generation.<n>TrustRAG delivers substantial improvements in retrieval accuracy, efficiency, and attack resistance.
arXiv Detail & Related papers (2025-01-01T15:57:34Z)
LLM-based Continuous Intrusion Detection Framework for Next-Gen Networks [0.7100520098029439]
The framework employs a transformer encoder architecture, which captures hidden patterns in a bidirectional manner to differentiate between malicious and legitimate traffic. The system incrementally identifies unknown attack types by leveraging a Gaussian Mixture Model (GMM) to cluster features derived from high-dimensional BERT embeddings. Even after integrating additional unknown attack clusters, the framework continues to perform at a high level, achieving 95.6% in both classification accuracy and recall.
arXiv Detail & Related papers (2024-11-04T18:12:14Z)
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases [73.04652687616286]
We propose AgentPoison, the first backdoor attack targeting generic and RAG-based LLM agents by poisoning their long-term memory or RAG knowledge base. Unlike conventional backdoor attacks, AgentPoison requires no additional model training or fine-tuning. On each agent, AgentPoison achieves an average attack success rate higher than 80% with minimal impact on benign performance.
arXiv Detail & Related papers (2024-07-17T17:59:47Z)
Dissecting Adversarial Robustness of Multimodal LM Agents [70.2077308846307]
We manually create 200 targeted adversarial tasks and evaluation scripts in a realistic threat model on top of VisualWebArena.<n>We find that we can successfully break latest agents that use black-box frontier LMs, including those that perform reflection and tree search.<n>We also use ARE to rigorously evaluate how the robustness changes as new components are added.
arXiv Detail & Related papers (2024-06-18T17:32:48Z)
FaultGuard: A Generative Approach to Resilient Fault Prediction in Smart Electrical Grids [53.2306792009435]
FaultGuard is the first framework for fault type and zone classification resilient to adversarial attacks. We propose a low-complexity fault prediction model and an online adversarial training technique to enhance robustness. Our model outclasses the state-of-the-art for resilient fault prediction benchmarking, with an accuracy of up to 0.958.
arXiv Detail & Related papers (2024-03-26T08:51:23Z)
Malicious Agent Detection for Robust Multi-Agent Collaborative Perception [52.261231738242266]
Multi-agent collaborative (MAC) perception is more vulnerable to adversarial attacks than single-agent perception. We propose Malicious Agent Detection (MADE), a reactive defense specific to MAC perception. We conduct comprehensive evaluations on a benchmark 3D dataset V2X-sim and a real-road dataset DAIR-V2X.
arXiv Detail & Related papers (2023-10-18T11:36:42Z)
Generative Adversarial Network-Driven Detection of Adversarial Tasks in Mobile Crowdsensing [5.675436513661266]
Crowdsensing systems are vulnerable to various attacks as they build on non-dedicated and ubiquitous properties. Previous works suggest that GAN-based attacks exhibit more crucial devastation than empirically designed attack samples. This paper aims to detect intelligently designed illegitimate sensing service requests by integrating a GAN-based model.
arXiv Detail & Related papers (2022-02-16T00:23:25Z)
SAGE: Intrusion Alert-driven Attack Graph Extractor [4.530678016396476]
Attack graphs (AGs) are used to assess pathways availed by cyber adversaries to penetrate a network. We propose to automatically learn AGs based on actions observed through intrusion alerts, without prior expert knowledge.
arXiv Detail & Related papers (2021-07-06T17:45:02Z)
Enabling Efficient Cyber Threat Hunting With Cyber Threat Intelligence [94.94833077653998]
ThreatRaptor is a system that facilitates threat hunting in computer systems using open-source Cyber Threat Intelligence (OSCTI) It extracts structured threat behaviors from unstructured OSCTI text and uses a concise and expressive domain-specific query language, TBQL, to hunt for malicious system activities. Evaluations on a broad set of attack cases demonstrate the accuracy and efficiency of ThreatRaptor in practical threat hunting.
arXiv Detail & Related papers (2020-10-26T14:54:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.