TRACER: Trajectory Risk Aggregation for Critical Episodes in Agentic Reasoning
- URL: http://arxiv.org/abs/2602.11409v1
- Date: Wed, 11 Feb 2026 22:23:56 GMT
- Title: TRACER: Trajectory Risk Aggregation for Critical Episodes in Agentic Reasoning
- Authors: Sina Tayebati, Divake Kumar, Nastaran Darabi, Davide Ettori, Ranganath Krishnan, Amit Ranjan Trivedi,
- Abstract summary: Existing uncertainty proxies focus on single-shot text generation. We introduce TRACER, a trajectory-level uncertainty metric for dual-control Tool-Agent-User interaction.
- Score: 4.928838343487574
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating uncertainty for AI agents in real-world multi-turn tool-using interaction with humans is difficult because failures are often triggered by sparse critical episodes (e.g., looping, incoherent tool use, or user-agent miscoordination) even when local generation appears confident. Existing uncertainty proxies focus on single-shot text generation and therefore miss these trajectory-level breakdown signals. We introduce TRACER, a trajectory-level uncertainty metric for dual-control Tool-Agent-User interaction. TRACER combines content-aware surprisal with situational-awareness signals, semantic and lexical repetition, and tool-grounded coherence gaps, and aggregates them using a tail-focused risk functional with a MAX-composite step risk to surface decisive anomalies. We evaluate TRACER on $\tau^2$-bench by predicting task failure and selective task execution. TRACER improves AUROC by up to 37.1% and AUARC by up to 55% over baselines, enabling earlier and more accurate detection of uncertainty in complex conversational tool-use settings. Our code and benchmark are available at https://github.com/sinatayebati/agent-tracer.
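The aggregation scheme described in the abstract, a MAX-composite step risk followed by a tail-focused risk functional, can be sketched as below. This is an illustrative reading only: the signal names, the [0, 1] normalization, and the CVaR-style tail average are assumptions, not the paper's actual implementation.

```python
import numpy as np

def step_risk(signals):
    """MAX-composite step risk: take the largest normalized signal at
    each step so a single decisive anomaly dominates that step.
    `signals` is a (T, K) array of K per-step risk signals in [0, 1]
    (e.g., surprisal, repetition, tool-coherence gap; names here are
    illustrative, not the paper's exact components)."""
    return np.max(signals, axis=1)  # shape (T,)

def tail_risk(step_risks, alpha=0.2):
    """Tail-focused aggregation (a CVaR-style instantiation, assumed):
    average the top alpha-fraction of step risks rather than the mean,
    so sparse critical episodes are not washed out by calm steps."""
    k = max(1, int(np.ceil(alpha * len(step_risks))))
    tail = np.sort(step_risks)[-k:]  # worst k steps
    return float(tail.mean())

# A calm trajectory vs. one with a single critical episode.
calm = np.full((10, 3), 0.1)
spiky = calm.copy()
spiky[7] = [0.1, 0.95, 0.2]  # one decisive anomaly at step 7

r_calm = tail_risk(step_risk(calm))
r_spiky = tail_risk(step_risk(spiky))
assert r_spiky > r_calm  # the single episode dominates the score
```

With a plain mean over steps, the lone anomaly at step 7 would be diluted by the nine calm steps; the tail average keeps it visible, which is the motivation the abstract gives for tail-focused aggregation.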
Related papers
- Beyond Input Guardrails: Reconstructing Cross-Agent Semantic Flows for Execution-Aware Attack Detection [32.301679396929536]
We propose SysName, a framework that shifts the defensive paradigm from static input filtering to execution-aware analysis. SysName synthesizes fragmented operational primitives into contiguous behavioral trajectories, enabling a holistic view of system activity. Empirical evaluations demonstrate that SysName effectively detects over ten distinct compound attack vectors, achieving F1-scores of 85.3% and 66.7% for node-level and path-level end-to-end attack detection, respectively.
arXiv Detail & Related papers (2026-03-04T01:59:16Z) - ICON: Indirect Prompt Injection Defense for Agents based on Inference-Time Correction [24.416258744287166]
ICON is a probing-to-mitigation framework that neutralizes attacks while preserving task continuity. ICON achieves a competitive 0.4% ASR, matching commercial-grade detectors, while yielding a task utility gain of over 50%.
arXiv Detail & Related papers (2026-02-24T09:13:05Z) - Helpful to a Fault: Measuring Illicit Assistance in Multi-Turn, Multilingual LLM Agents [35.76774274440008]
STING (Sequential Testing of Illicit N-step Goal execution) is an automated red-teaming framework. It constructs a step-by-step illicit plan grounded in a benign persona and iteratively probes a target agent with adaptive follow-ups. We introduce an analysis framework that models multi-turn red-teaming as a time-to-first-jailbreak random variable.
arXiv Detail & Related papers (2026-02-18T10:31:19Z) - Unsafer in Many Turns: Benchmarking and Defending Multi-Turn Safety Risks in Tool-Using Agents [68.20752678837377]
We propose a principled taxonomy that transforms single-turn harmful tasks into multi-turn attack sequences. Using this taxonomy, we construct MT-AgentRisk, the first benchmark to evaluate multi-turn tool-using agent safety. We propose ToolShield, a training-free, tool-agnostic, self-exploration defense.
arXiv Detail & Related papers (2026-02-13T18:38:18Z) - ARTIS: Agentic Risk-Aware Test-Time Scaling via Iterative Simulation [72.78362530982109]
ARTIS, Agentic Risk-Aware Test-Time Scaling via Iterative Simulation, is a framework that decouples exploration from commitment. We show that naive LLM-based simulators struggle to capture rare but high-impact failure modes. We introduce a risk-aware tool simulator that emphasizes fidelity on failure-inducing actions.
arXiv Detail & Related papers (2026-02-02T06:33:22Z) - The Why Behind the Action: Unveiling Internal Drivers via Agentic Attribution [63.61358761489141]
Large Language Model (LLM)-based agents are widely used in real-world applications such as customer service, web navigation, and software engineering. We propose a novel framework for general agentic attribution, designed to identify the internal factors driving agent actions regardless of the task outcome. We validate our framework across a diverse suite of agentic scenarios, including standard tool use and subtle reliability risks like memory-induced bias.
arXiv Detail & Related papers (2026-01-21T15:22:21Z) - Towards Compositional Generalization in LLMs for Smart Contract Security: A Case Study on Reentrancy Vulnerabilities [35.39583123277091]
This paper proposes a post-training algorithm based on atomic task decomposition and fusion. We decompose the reentrancy vulnerability detection task into four linearly independent atomic tasks. By training on synthetic datasets, we generate three compiler-verified datasets. We then employ the Slither tool to extract structural information from the control flow graph and data flow graph.
arXiv Detail & Related papers (2026-01-11T13:52:07Z) - ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration [68.89572566071575]
ET-Agent is a training framework for calibrating an agent's tool-use behavior. It is designed to progressively calibrate erroneous behavioral patterns toward optimal behaviors.
arXiv Detail & Related papers (2026-01-11T11:05:26Z) - Agentic Rubrics as Contextual Verifiers for SWE Agents [8.469998524915818]
We show that agentic rubrics provide an efficient, scalable, and granular verification signal for SWE agents. Results show that agentic rubrics are consistent with ground-truth tests while also flagging issues that tests do not capture.
arXiv Detail & Related papers (2026-01-07T18:38:23Z) - Metacognitive Self-Correction for Multi-Agent System via Prototype-Guided Next-Execution Reconstruction [58.51530390018909]
Large Language Model based multi-agent systems excel at collaborative problem solving but remain brittle to cascading errors. We present MASC, a metacognitive framework that endows MAS with real-time, unsupervised, step-level error detection and self-correction.
arXiv Detail & Related papers (2025-10-16T05:35:37Z) - Impatient Users Confuse AI Agents: High-fidelity Simulations of Human Traits for Testing Agents [58.00130492861884]
TraitBasis is a lightweight, model-agnostic method for systematically stress-testing AI agents. TraitBasis learns directions in activation space corresponding to steerable user traits. We observe on average a 2%-30% performance degradation on $\tau$-Trait across frontier models.
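The idea of learning a steerable trait direction in activation space can be sketched with a standard difference-of-means construction. This is a generic illustration of activation steering, not TraitBasis's actual training procedure; the layer choice, data, and steering strength are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def trait_direction(acts_with, acts_without):
    """Estimate a trait direction as the normalized difference of mean
    activations between trait-positive and trait-neutral examples
    (a common difference-of-means sketch, assumed here)."""
    d = acts_with.mean(axis=0) - acts_without.mean(axis=0)
    return d / np.linalg.norm(d)

def steer(hidden, direction, strength):
    """Dial the trait up by adding the direction to a hidden state."""
    return hidden + strength * direction

# Toy activations: trait-positive examples are the neutral ones shifted
# along one hidden axis, standing in for a real trait like impatience.
true_dir = np.zeros(8)
true_dir[3] = 1.0
neutral = rng.normal(size=(64, 8))
positive = neutral + 2.0 * true_dir

d = trait_direction(positive, neutral)
assert abs(d[3]) > 0.9  # recovered direction aligns with the true shift

steered = steer(np.zeros(8), d, strength=1.5)  # push a state along the trait
```

Steering with a learned direction like this is how such methods stress-test agents without retraining: the same model is probed under progressively stronger trait activations.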
arXiv Detail & Related papers (2025-10-06T05:03:57Z) - Federated Spatiotemporal Graph Learning for Passive Attack Detection in Smart Grids [2.721477719641864]
This paper introduces a graph-centric, multimodal detector that fuses physical-layer and behavioral indicators over temporal windows to detect passive attacks. The model achieves a testing accuracy of 98.32% per-timestep and 93.35% per-sequence at 0.15% FPR.
arXiv Detail & Related papers (2025-09-29T08:52:30Z) - Automatic Failure Attribution and Critical Step Prediction Method for Multi-Agent Systems Based on Causal Inference [8.823529310904162]
Multi-agent systems (MAS) are critical for automating complex tasks, yet their practical deployment is hampered by the challenge of failure attribution. We introduce the first failure attribution framework for MAS grounded in multi-granularity causal inference.
arXiv Detail & Related papers (2025-09-10T15:22:00Z) - Dissecting Adversarial Robustness of Multimodal LM Agents [70.2077308846307]
We manually create 200 targeted adversarial tasks and evaluation scripts in a realistic threat model on top of VisualWebArena. We find that we can successfully break the latest agents that use black-box frontier LMs, including those that perform reflection and tree search. We also use ARE to rigorously evaluate how the robustness changes as new components are added.
arXiv Detail & Related papers (2024-06-18T17:32:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.