Related papers: AgentCyTE: Leveraging Agentic AI to Generate Cybersecurity Training & Experimentation Scenarios

URL: http://arxiv.org/abs/2510.25189v1
Date: Wed, 29 Oct 2025 05:44:12 GMT
Title: AgentCyTE: Leveraging Agentic AI to Generate Cybersecurity Training & Experimentation Scenarios
Authors: Ana M. Rodriguez, Jaime Acosta, Anantaa Kotal, Aritran Piplai
Abstract summary: We present AgentCyTE, a framework integrating large language models with deterministic, schema-constrained network emulation. AgentCyTE observes scenario outcomes, validates correctness, and iteratively enhances realism and consistency.
Score: 0.19999259391104388
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Designing realistic and adaptive networked threat scenarios remains a core challenge in cybersecurity research and training, still requiring substantial manual effort. While large language models (LLMs) show promise for automated synthesis, unconstrained generation often yields configurations that fail validation or execution. We present AgentCyTE, a framework integrating LLM-based reasoning with deterministic, schema-constrained network emulation to generate and refine executable threat environments. Through an agentic feedback loop, AgentCyTE observes scenario outcomes, validates correctness, and iteratively enhances realism and consistency. This hybrid approach preserves LLM flexibility while enforcing structural validity, enabling scalable, data-driven experimentation and reliable scenario generation for threat modeling and adaptive cybersecurity training. Our framework can be accessed at: https://github.com/AnantaaKotal/AgentCyTE
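The generate, validate, refine loop described in the abstract can be illustrated with a minimal sketch. This is not AgentCyTE's implementation: the schema, the `validate` and `refine` functions, and the `stub_generator` that stands in for an LLM call are all hypothetical names chosen for illustration, and the emulation step is reduced to a structural check.

```python
from typing import Callable, Optional

# Hypothetical schema (an assumption, not AgentCyTE's): required keys and types.
SCHEMA = {"nodes": list, "links": list}


def validate(scenario: dict) -> list[str]:
    """Return a list of schema violations; an empty list means valid."""
    errors = []
    for key, expected in SCHEMA.items():
        if key not in scenario:
            errors.append(f"missing key: {key}")
        elif not isinstance(scenario[key], expected):
            errors.append(f"{key} must be {expected.__name__}")
    if not errors:
        # Structural check: every link endpoint must be a declared node.
        known = set(scenario["nodes"])
        for link in scenario["links"]:
            for end in link:
                if end not in known:
                    errors.append(f"link references unknown node: {end}")
    return errors


def refine(generate: Callable[[list[str]], dict],
           max_rounds: int = 3) -> Optional[dict]:
    """Call the generator repeatedly, feeding validation errors back in."""
    feedback: list[str] = []
    for _ in range(max_rounds):
        scenario = generate(feedback)
        feedback = validate(scenario)
        if not feedback:
            return scenario
    return None  # no valid scenario within the round budget


def stub_generator(feedback: list[str]) -> dict:
    """Stand-in for an LLM call: the first draft is invalid, the retry is fixed."""
    if not feedback:
        return {"nodes": ["web", "db"], "links": [["web", "attacker"]]}
    return {"nodes": ["web", "db", "attacker"], "links": [["web", "attacker"]]}


if __name__ == "__main__":
    print(refine(stub_generator))
```

The generator stays free-form while the validator is deterministic, which mirrors the division of labour the abstract describes; a real system would replace the stub with a model call and the structural check with schema validation plus emulation results.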

Related papers

Execution-State-Aware LLM Reasoning for Automated Proof-of-Vulnerability Generation [36.950993500170014]
We present DrillAgent, an agentic framework that reformulates PoV generation as an iterative hypothesis-verification-refinement process. We evaluate DrillAgent on SEC-bench, a large-scale benchmark of real-world C/C++ vulnerabilities.
arXiv Detail & Related papers (2026-02-14T03:17:27Z)
ARTIS: Agentic Risk-Aware Test-Time Scaling via Iterative Simulation [72.78362530982109]
ARTIS, Agentic Risk-Aware Test-Time Scaling via Iterative Simulation, is a framework that decouples exploration from commitment. We show that naive LLM-based simulators struggle to capture rare but high-impact failure modes. We introduce a risk-aware tool simulator that emphasizes fidelity on failure-inducing actions.
arXiv Detail & Related papers (2026-02-02T06:33:22Z)
ComAgent: Multi-LLM based Agentic AI Empowered Intelligent Wireless Networks [62.031889234230725]
6G networks rely on complex cross-layer optimization. Manually translating high-level intents into mathematical formulations remains a bottleneck. We present ComAgent, a multi-LLM agentic AI framework.
arXiv Detail & Related papers (2026-01-27T13:43:59Z)
From Completion to Editing: Unlocking Context-Aware Code Infilling via Search-and-Replace Instruction Tuning [81.97788535387286]
We propose a framework that internalizes the agentic verification-and-editing mechanism into a unified, single-pass inference process. With minimal data, SRI-Coder enables Chat models to surpass the completion performance of their Base counterparts. Unlike FIM-style tuning, SRI preserves general coding competencies and maintains inference latency comparable to standard FIM.
arXiv Detail & Related papers (2026-01-19T20:33:53Z)
CyberLLM-FINDS 2025: Instruction-Tuned Fine-tuning of Domain-Specific LLMs with Retrieval-Augmented Generation and Graph Integration for MITRE Evaluation [0.054619385369457214]
This work presents a methodology to fine-tune the Gemma-2B model into a domain-specific cybersecurity LLM. We detail the processes of dataset preparation, fine-tuning, and synthetic data generation, along with implications for real-world applications in threat detection, forensic investigation, and attack analysis.
arXiv Detail & Related papers (2026-01-11T05:07:57Z)
Agentic AI for Autonomous Defense in Software Supply Chain Security: Beyond Provenance to Vulnerability Mitigation [0.0]
The current paper includes an example of agentic artificial intelligence (AI) based on autonomous software supply chain security. It combines large language model (LLM)-based reasoning, reinforcement learning (RL), and multi-agent coordination. Results show that agentic AI can facilitate the transition to self-defending, proactive software supply chains.
arXiv Detail & Related papers (2025-12-29T14:06:09Z)
The Evolution of Agentic AI in Cybersecurity: From Single LLM Reasoners to Multi-Agent Systems and Autonomous Pipelines [0.0]
Cybersecurity has become one of the earliest adopters of agentic AI. This survey presents a five-generation taxonomy of agentic AI in cybersecurity.
arXiv Detail & Related papers (2025-12-07T05:10:16Z)
Adaptive Cybersecurity Architecture for Digital Product Ecosystems Using Agentic AI [0.0]
This study introduces autonomous, goal-driven agents capable of dynamic learning and context-aware decision making. Behavioral baselining, decentralized risk scoring, and federated threat intelligence sharing are important features. The architecture provides an intelligent and scalable blueprint for safeguarding complex digital infrastructure.
arXiv Detail & Related papers (2025-09-25T00:43:53Z)
Agent4FaceForgery: Multi-Agent LLM Framework for Realistic Face Forgery Detection [108.5042835056188]
This work introduces Agent4FaceForgery to address two fundamental problems: how to capture the diverse intents and iterative processes of human forgery creation, and how to model the complex, often adversarial, text-image interactions that accompany forgeries in social media.
arXiv Detail & Related papers (2025-09-16T01:05:01Z)
Advancing Autonomous Incident Response: Leveraging LLMs and Cyber Threat Intelligence [3.2284427438223013]
Security teams are overwhelmed by alert fatigue, high false-positive rates, and the vast volume of unstructured Cyber Threat Intelligence (CTI) documents. We introduce a novel Retrieval-Augmented Generation (RAG)-based framework that leverages Large Language Models (LLMs) to automate and enhance IR. Our approach introduces a hybrid retrieval mechanism that combines NLP-based similarity searches within a CTI vector database with standardized queries to external CTI platforms.
arXiv Detail & Related papers (2025-08-14T14:20:34Z)
Expert-in-the-Loop Systems with Cross-Domain and In-Domain Few-Shot Learning for Software Vulnerability Detection [38.083049237330826]
This study explores the use of Large Language Models (LLMs) in software vulnerability assessment by simulating the identification of Python code with known Common Weakness Enumerations (CWEs). Our results indicate that while zero-shot prompting performs poorly, few-shot prompting significantly enhances classification performance. Challenges such as model reliability, interpretability, and adversarial robustness remain critical areas for future research.
arXiv Detail & Related papers (2025-06-11T18:43:51Z)
AgentSGEN: Multi-Agent LLM in the Loop for Semantic Collaboration and GENeration of Synthetic Data [3.3186271052113843]
Scarcity of data presents a major obstacle to training AI systems for safety-critical applications, such as construction safety. We propose a novel multi-agent framework that employs an iterative, in-the-loop collaboration between two agents. Powered by the LLM's reasoning capabilities and common-sense knowledge, this collaborative design produces synthetic images tailored to safety-critical scenarios.
arXiv Detail & Related papers (2025-05-07T22:43:33Z)
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute scaling framework that leverages increased inference-time computation instead of larger models. Our framework incorporates two complementary strategies: internal TTC and external TTC. We demonstrate our 32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z)
SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework. Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations. We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z)
RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios [58.62407014256686]
RealGen is a novel retrieval-based in-context learning framework for traffic scenario generation. RealGen synthesizes new scenarios by combining behaviors from multiple retrieved examples in a gradient-free way. This in-context learning framework endows versatile generative capabilities, including the ability to edit scenarios.
arXiv Detail & Related papers (2023-12-19T23:11:06Z)
Realistic simulation of users for IT systems in cyber ranges [63.20765930558542]
We instrument each machine by means of an external agent to generate user activity. This agent combines both deterministic and deep learning based methods to adapt to different environments. We also propose conditional text generation models to facilitate the creation of conversations and documents.
arXiv Detail & Related papers (2021-11-23T10:53:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.