Related papers: VIGIL: A Reflective Runtime for Self-Healing Agents

VIGIL: A Reflective Runtime for Self-Healing Agents

URL: http://arxiv.org/abs/2512.07094v2
Date: Tue, 09 Dec 2025 05:33:35 GMT
Title: VIGIL: A Reflective Runtime for Self-Healing Agents
Authors: Christopher Cruz,
Abstract summary: We present VIGIL, a reflective runtime that supervises a sibling agent and performs autonomous maintenance rather than task execution.<n> VIGIL ingests behavioral logs, appraises each event into a structured emotional representation, maintains a persistent EmoBank with decay and contextual policies, and derives an RBT diagnosis that sorts recent behavior into strengths, opportunities, and failures.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Agentic LLM frameworks promise autonomous behavior via task decomposition, tool use, and iterative planning, but most deployed systems remain brittle. They lack runtime introspection, cannot diagnose their own failure modes, and do not improve over time without human intervention. In practice, many agent stacks degrade into decorated chains of LLM calls with no structural mechanisms for reliability. We present VIGIL (Verifiable Inspection and Guarded Iterative Learning), a reflective runtime that supervises a sibling agent and performs autonomous maintenance rather than task execution. VIGIL ingests behavioral logs, appraises each event into a structured emotional representation, maintains a persistent EmoBank with decay and contextual policies, and derives an RBT diagnosis that sorts recent behavior into strengths, opportunities, and failures. From this analysis, VIGIL generates both guarded prompt updates that preserve core identity semantics and read only code proposals produced by a strategy engine that operates on log evidence and code hotspots. VIGIL functions as a state gated pipeline. Illegal transitions produce explicit errors rather than allowing the LLM to improvise. In a reminder latency case study, VIGIL identified elevated lag, proposed prompt and code repairs, and when its own diagnostic tool failed due to a schema conflict, it surfaced the internal error, produced a fallback diagnosis, and emitted a repair plan. This demonstrates meta level self repair in a deployed agent runtime.

Related papers

AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification [25.817251923574286]
We propose a novel inference-time detection and mitigation framework for large language model (LLM) agents.<n>AgentSentry is the first inference-time defense to model multi-turn IPI as a temporal causal takeover.<n>We evaluate AgentSentry on the textscAgentDojo benchmark across four task suites, three IPI attack families, and multiple black-box LLMs.
arXiv Detail & Related papers (2026-02-26T07:59:10Z)
AkiraRust: Re-thinking LLM-aided Rust Repair Using a Feedback-guided Thinking Switch [25.65238229037917]
AkiraRust is a repair and verification framework that incorporates a finite-state machine to adapt its detection and repair flow to runtime semantic conditions.<n>AkiruRust achieves about 92% semantic correctness and delivers a 2.2x average speedup compared to SOTA.
arXiv Detail & Related papers (2026-02-25T08:34:27Z)
Zombie Agents: Persistent Control of Self-Evolving LLM Agents via Self-Reinforcing Injections [57.64370755825839]
Self-evolving agents update their internal state across sessions, often by writing and reusing long-term memory.<n>We study this risk and formalize a persistent attack we call a Zombie Agent.<n>We present a black-box attack framework that uses only indirect exposure through attacker-controlled web content.
arXiv Detail & Related papers (2026-02-17T15:28:24Z)
MemoryGraft: Persistent Compromise of LLM Agents via Poisoned Experience Retrieval [5.734678752740074]
MemoryGraft is a novel indirect injection attack that compromises agent behavior not through immediate jailbreaks, but by implanting malicious successful experiences into the agent's long-term memory.<n>We demonstrate that an attacker who can supply benign ingestion-level artifacts that the agent reads during execution can induce it to construct a poisoned RAG store.<n>When the agent later encounters semantically similar tasks, union retrieval over lexical templates and embedding similarity reliably surfaces these grafted memories, and the agent adopts the embedded unsafe patterns, leading to persistent behavioral drift across sessions.
arXiv Detail & Related papers (2025-12-18T08:34:40Z)
DoVer: Intervention-Driven Auto Debugging for LLM Multi-Agent Systems [50.68504641209259]
DoVer is an intervention-driven debug framework for large language model (LLM)-based multi-agent systems.<n>It augments hypothesis generation with active verification through targeted interventions.<n>DoVer flips 18-28% of failed trials into successes, achieves up to 16% milestone progress, and validates or refutes 30-60% of failure hypotheses.
arXiv Detail & Related papers (2025-12-07T09:23:48Z)
The Trojan Knowledge: Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search [58.8834056209347]
Large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs.<n>We introduce the Correlated Knowledge Attack Agent (CKA-Agent), a dynamic framework that reframes jailbreaking as an adaptive, tree-structured exploration of the target model's knowledge base.
arXiv Detail & Related papers (2025-12-01T07:05:23Z)
InspectCoder: Dynamic Analysis-Enabled Self Repair through interactive LLM-Debugger Collaboration [71.18377595277018]
Large Language Models (LLMs) frequently generate buggy code with complex logic errors that are challenging to diagnose.<n>We present InspectCoder, the first agentic program repair system that empowers LLMs to actively conduct dynamic analysis via interactive debugger control.
arXiv Detail & Related papers (2025-10-21T06:26:29Z)
Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails [103.05296856071931]
We identify the Alignment Tipping Process (ATP), a critical post-deployment risk unique to self-evolving Large Language Model (LLM) agents.<n>ATP arises when continual interaction drives agents to abandon alignment constraints established during training in favor of reinforced, self-interested strategies.<n>Our experiments show that alignment benefits erode rapidly under self-evolution, with initially aligned models converging toward unaligned states.
arXiv Detail & Related papers (2025-10-06T14:48:39Z)
Where LLM Agents Fail and How They can Learn From Failures [62.196870049524364]
Large Language Model (LLM) agents have shown promise in solving complex, multi-step tasks.<n>They amplify vulnerability to cascading failures, where a single root-cause error propagates through subsequent decisions.<n>Current systems lack a framework that can comprehensively understand agent error in a modular and systemic way.<n>We introduce the AgentErrorTaxonomy, a modular classification of failure modes spanning memory, reflection, planning, action, and system-level operations.
arXiv Detail & Related papers (2025-09-29T18:20:27Z)
Scene Graph-Guided Proactive Replanning for Failure-Resilient Embodied Agent [9.370683025542686]
We present a proactive replanning framework that detects and corrects failures at subtask boundaries.<n>Experiments in the AI2-THOR simulator demonstrate that our approach detects semantic and spatial mismatches before execution failures occur.
arXiv Detail & Related papers (2025-08-15T07:48:51Z)
AgentSight: System-Level Observability for AI Agents Using eBPF [10.37440633887049]
Existing tools observe either an agent's high-level intent (via LLM prompts) or its low-level actions (e.g., system calls) but cannot correlate these two views.<n>We introduce AgentSight, an AgentOps observability framework that bridges this semantic gap using a hybrid approach.<n>AgentSight intercepts TLS-encrypted LLM traffic to extract semantic intent, monitors kernel events to observe system-wide effects, and causally correlates these two streams across process boundaries.
arXiv Detail & Related papers (2025-08-02T01:43:39Z)
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents [52.92354372596197]
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities.<n>This interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior.<n>We propose a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control and data-level constraints.
arXiv Detail & Related papers (2025-06-13T05:01:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.