Related papers: LLM-driven Provenance Forensics for Threat Investigation and Detection

LLM-driven Provenance Forensics for Threat Investigation and Detection

URL: http://arxiv.org/abs/2508.21323v1
Date: Fri, 29 Aug 2025 04:39:52 GMT
Title: LLM-driven Provenance Forensics for Threat Investigation and Detection
Authors: Kunal Mukherjee, Murat Kantarcioglu,
Abstract summary: PROVSEEK is an agentic framework for provenance-driven forensic analysis and threat intelligence extraction.<n>It generates precise, context-aware queries that fuse a vectorized threat report knowledge base with data from system provenance databases.<n>It resolves provenance queries, orchestrates multiple role-specific agents to mitigate hallucinations, and synthesizes structured, ground-truth verifiable forensic summaries.
Score: 12.388936704058521
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: We introduce PROVSEEK, an LLM-powered agentic framework for automated provenance-driven forensic analysis and threat intelligence extraction. PROVSEEK employs specialized toolchains to dynamically retrieve relevant context by generating precise, context-aware queries that fuse a vectorized threat report knowledge base with data from system provenance databases. The framework resolves provenance queries, orchestrates multiple role-specific agents to mitigate hallucinations, and synthesizes structured, ground-truth verifiable forensic summaries. By combining agent orchestration with Retrieval-Augmented Generation (RAG) and chain-of-thought (CoT) reasoning, PROVSEEK enables adaptive multi-step analysis that iteratively refines hypotheses, verifies supporting evidence, and produces scalable, interpretable forensic explanations of attack behaviors. By combining provenance data with agentic reasoning, PROVSEEK establishes a new paradigm for grounded agentic forecics to investigate APTs. We conduct a comprehensive evaluation on publicly available DARPA datasets, demonstrating that PROVSEEK outperforms retrieval-based methods for intelligence extraction task, achieving a 34% improvement in contextual precision/recall; and for threat detection task, PROVSEEK achieves 22%/29% higher precision/recall compared to both a baseline agentic AI approach and State-Of-The-Art (SOTA) Provenance-based Intrusion Detection System (PIDS).

Related papers

The Landscape of Prompt Injection Threats in LLM Agents: From Taxonomy to Analysis [24.51410516475904]
This SoK presents a comprehensive overview of the Prompt Injection (PI) landscape, covering attacks, defenses, and their evaluation practices.<n>We introduce AgentPI, a new benchmark designed to systematically evaluate agent behavior under context-dependent interaction settings.<n>We show that many defenses appear effective under existing benchmarks by suppressing contextual inputs, yet fail to generalize to realistic agent settings where context-dependent reasoning is essential.
arXiv Detail & Related papers (2026-02-11T02:47:10Z)
Multi-Agent Collaborative Intrusion Detection for Low-Altitude Economy IoT: An LLM-Enhanced Agentic AI Framework [60.72591149679355]
The rapid expansion of low-altitude economy Internet of Things (LAE-IoT) networks has created unprecedented security challenges.<n>Traditional intrusion detection systems fail to tackle the unique characteristics of aerial IoT environments.<n>We introduce a large language model (LLM)-enabled agentic AI framework for enhancing intrusion detection in LAE-IoT networks.
arXiv Detail & Related papers (2026-01-25T12:47:25Z)
Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification [71.98473277917962]
Recent advances in Deep Research Agents (DRAs) are transforming automated knowledge discovery and problem-solving.<n>We propose an alternative paradigm: self-evolving the agent's ability by iteratively verifying the policy model's outputs, guided by meticulously crafted rubrics.<n>We present DeepVerifier, a rubrics-based outcome reward verifier that leverages the asymmetry of verification.
arXiv Detail & Related papers (2026-01-22T09:47:31Z)
Automated Post-Incident Policy Gap Analysis via Threat-Informed Evidence Mapping using Large Language Models [0.0]
This paper investigates whether Large Language Models (LLMs) can augment post-incident review by autonomously analysing system evidence and identifying security policy gaps.<n>We present a threat-informed, agentic framework that ingests log data, maps observed behaviours to the MITRE ATT&CK framework, and evaluates organisational security policies for adequacy and compliance.
arXiv Detail & Related papers (2026-01-04T01:39:20Z)
Multi-hop Reasoning via Early Knowledge Alignment [68.28168992785896]
Early Knowledge Alignment (EKA) aims to align Large Language Models with contextually relevant retrieved knowledge.<n>EKA significantly improves retrieval precision, reduces cascading errors, and enhances both performance and efficiency.<n>EKA proves effective as a versatile, training-free inference strategy that scales seamlessly to large models.
arXiv Detail & Related papers (2025-12-23T08:14:44Z)
Code-in-the-Loop Forensics: Agentic Tool Use for Image Forgery Detection [59.04089915447622]
ForenAgent is an interactive IFD framework that enables MLLMs to autonomously generate, execute, and refine Python-based low-level tools around the detection objective.<n>Inspired by human reasoning, we design a dynamic reasoning loop comprising global perception, local focusing, iterative probing, and holistic adjudication.<n>Experiments show that ForenAgent exhibits emergent tool-use competence and reflective reasoning on challenging IFD tasks.
arXiv Detail & Related papers (2025-12-18T08:38:44Z)
CORTEX: Collaborative LLM Agents for High-Stakes Alert Triage [10.088447487211893]
Security Operations Centers (SOCs) are overwhelmed by tens of thousands of daily alerts.<n>This overload creates alert fatigue, leading to overlooked threats and analyst burnout.<n>We propose CORTEX, a multi-agent LLM architecture for high-stakes alert triage.
arXiv Detail & Related papers (2025-09-30T22:09:31Z)
AgenticIQA: An Agentic Framework for Adaptive and Interpretable Image Quality Assessment [69.06977852423564]
Image quality assessment (IQA) reflects both the quantification and interpretation of perceptual quality rooted in the human visual system.<n>AgenticIQA decomposes IQA into four subtasks -- distortion detection, distortion analysis, tool selection, and tool execution.<n>To support training and evaluation, we introduce AgenticIQA-200K, a large-scale instruction dataset tailored for IQA agents, and AgenticIQA-Eval, the first benchmark for assessing the planning, execution, and summarization capabilities of VLM-based IQA agents.
arXiv Detail & Related papers (2025-09-30T09:37:01Z)
Propose and Rectify: A Forensics-Driven MLLM Framework for Image Manipulation Localization [49.71303998618939]
This paper presents a novel Propose-Rectify framework that bridges semantic reasoning with forensic-specific analysis.<n>Our framework ensures that initial semantic proposals are systematically validated and enhanced through concrete technical evidence, resulting in comprehensive detection accuracy and localization precision.
arXiv Detail & Related papers (2025-08-25T12:43:53Z)
MCP-Orchestrated Multi-Agent System for Automated Disinformation Detection [84.75972919995398]
This paper presents a multi-agent system that uses relation extraction to detect disinformation in news articles.<n>The proposed Agentic AI system combines four agents: (i) a machine learning agent (logistic regression), (ii) a Wikipedia knowledge check agent, and (iv) a web-scraped data analyzer.<n>Results demonstrate that the multi-agent ensemble achieves 95.3% accuracy with an F1 score of 0.964, significantly outperforming individual agents and traditional approaches.
arXiv Detail & Related papers (2025-08-13T19:14:48Z)
Deep Research Agents: A Systematic Examination And Roadmap [79.04813794804377]
Deep Research (DR) agents are designed to tackle complex, multi-turn informational research tasks.<n>In this paper, we conduct a detailed analysis of the foundational technologies and architectural components that constitute DR agents.
arXiv Detail & Related papers (2025-06-22T16:52:48Z)
AURA: A Multi-Agent Intelligence Framework for Knowledge-Enhanced Cyber Threat Attribution [3.6586145148601594]
AURA (Attribution Using Retrieval-Augmented Agents) is a knowledge-enhanced framework for automated and interpretable APT attribution.<n>AURA ingests diverse threat data including Tactics, Techniques, and Procedures (TTPs), Indicators of Compromise (IoCs), malware details, adversarial tools, and temporal information.
arXiv Detail & Related papers (2025-06-11T21:00:51Z)
MA-RAG: Multi-Agent Retrieval-Augmented Generation via Collaborative Chain-of-Thought Reasoning [43.66966457772646]
MA-RAG orchestrates a collaborative set of specialized AI agents to tackle each stage of the RAG pipeline with task-aware reasoning.<n>Our design allows fine-grained control over information flow without any model fine-tuning.<n>This modular and reasoning-driven architecture enables MA-RAG to deliver robust, interpretable results.
arXiv Detail & Related papers (2025-05-26T15:05:18Z)
InfoDeepSeek: Benchmarking Agentic Information Seeking for Retrieval-Augmented Generation [63.55258191625131]
InfoDeepSeek is a new benchmark for assessing agentic information seeking in real-world, dynamic web environments.<n>We propose a systematic methodology for constructing challenging queries satisfying the criteria of determinacy, difficulty, and diversity.<n>We develop the first evaluation framework tailored to dynamic agentic information seeking, including fine-grained metrics about the accuracy, utility, and compactness of information seeking outcomes.
arXiv Detail & Related papers (2025-05-21T14:44:40Z)
Cybersecurity threat detection based on a UEBA framework using Deep Autoencoders [0.0]
We introduce the first implementation of an explainable UEBA-based anomaly detection framework.<n>Based on the theoretical foundations of neural networks, we offer a novel proof demonstrating the equivalence of two widely used definitions for fully-connected neural networks.<n>Our findings suggest that the proposed UEBA framework can be seamlessly integrated into enterprise environments.
arXiv Detail & Related papers (2025-05-14T13:18:12Z)
Exploring Answer Set Programming for Provenance Graph-Based Cyber Threat Detection: A Novel Approach [4.302577059401172]
Provenance graphs are useful tools for representing system-level activities in cybersecurity.<n>This paper presents a novel approach using ASP to model and analyze provenance graphs.
arXiv Detail & Related papers (2025-01-24T14:57:27Z)
GenDFIR: Advancing Cyber Incident Timeline Analysis Through Retrieval Augmented Generation and Large Language Models [0.08192907805418582]
Cyber timeline analysis is crucial in Digital Forensics and Incident Response (DFIR)<n>Traditional methods rely on structured artefacts, such as logs and metadata, for evidence identification and feature extraction.<n>This paper introduces GenDFIR, a framework leveraging large language models (LLMs), specifically Llama 3.1 8B in zero shot mode, integrated with a Retrieval-Augmented Generation (RAG) agent.
arXiv Detail & Related papers (2024-09-04T09:46:33Z)
SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction. Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.