TRACER: Trajectory Risk Aggregation for Critical Episodes in Agentic Reasoning
- URL: http://arxiv.org/abs/2602.11409v1
- Date: Wed, 11 Feb 2026 22:23:56 GMT
- Title: TRACER: Trajectory Risk Aggregation for Critical Episodes in Agentic Reasoning
- Authors: Sina Tayebati, Divake Kumar, Nastaran Darabi, Davide Ettori, Ranganath Krishnan, Amit Ranjan Trivedi,
- Abstract summary: Existing uncertainty proxies focus on single-shot text generation. We introduce TRACER, a trajectory-level uncertainty metric for dual-control Tool-Agent-User interaction.
- Score: 4.928838343487574
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating uncertainty for AI agents in real-world multi-turn tool-using interaction with humans is difficult because failures are often triggered by sparse critical episodes (e.g., looping, incoherent tool use, or user-agent miscoordination) even when local generation appears confident. Existing uncertainty proxies focus on single-shot text generation and therefore miss these trajectory-level breakdown signals. We introduce TRACER, a trajectory-level uncertainty metric for dual-control Tool-Agent-User interaction. TRACER combines content-aware surprisal with situational-awareness signals, semantic and lexical repetition, and tool-grounded coherence gaps, and aggregates them using a tail-focused risk functional with a MAX-composite step risk to surface decisive anomalies. We evaluate TRACER on $\tau^2$-bench by predicting task failure and selective task execution. TRACER improves AUROC by up to 37.1% and AUARC by up to 55% over baselines, enabling earlier and more accurate detection of uncertainty in complex conversational tool-use settings. Our code and benchmark are available at https://github.com/sinatayebati/agent-tracer.
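The aggregation scheme described in the abstract, a MAX-composite step risk followed by a tail-focused risk functional, can be sketched as below. This is an illustrative reading only: the signal names, the [0, 1] normalization, and the CVaR-style tail average are assumptions, not the paper's actual implementation.

```python
import numpy as np

def step_risk(signals):
    """MAX-composite step risk: take the largest normalized signal at
    each step so a single decisive anomaly dominates that step.
    `signals` is a (T, K) array of K per-step risk signals in [0, 1]
    (e.g., surprisal, repetition, tool-coherence gap; names here are
    illustrative, not the paper's exact components)."""
    return np.max(signals, axis=1)  # shape (T,)

def tail_risk(step_risks, alpha=0.2):
    """Tail-focused aggregation (a CVaR-style instantiation, assumed):
    average the top alpha-fraction of step risks rather than the mean,
    so sparse critical episodes are not washed out by calm steps."""
    k = max(1, int(np.ceil(alpha * len(step_risks))))
    tail = np.sort(step_risks)[-k:]  # worst k steps
    return float(tail.mean())

# A calm trajectory vs. one with a single critical episode.
calm = np.full((10, 3), 0.1)
spiky = calm.copy()
spiky[7] = [0.1, 0.95, 0.2]  # one decisive anomaly at step 7

r_calm = tail_risk(step_risk(calm))
r_spiky = tail_risk(step_risk(spiky))
assert r_spiky > r_calm  # the single episode dominates the score
```

With a plain mean over steps, the lone anomaly at step 7 would be diluted by the nine calm steps; the tail average keeps it visible, which is the motivation the abstract gives for tail-focused aggregation.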
Related papers
- Beyond Input Guardrails: Reconstructing Cross-Agent Semantic Flows for Execution-Aware Attack Detection [32.301679396929536]
We propose SysName, a framework that shifts the defensive paradigm from static input filtering to execution-aware analysis. SysName synthesizes fragmented operational primitives into contiguous behavioral trajectories, enabling a holistic view of system activity. Empirical evaluations demonstrate that SysName effectively detects over ten distinct compound attack vectors, achieving F1-scores of 85.3% and 66.7% for node-level and path-level end-to-end attack detection, respectively.
arXiv Detail & Related papers (2026-03-04T01:59:16Z) - ICON: Indirect Prompt Injection Defense for Agents based on Inference-Time Correction [24.416258744287166]
ICON is a probing-to-mitigation framework that neutralizes attacks while preserving task continuity. ICON achieves a competitive 0.4% ASR, matching commercial-grade detectors, while yielding a task utility gain of over 50%.
arXiv Detail & Related papers (2026-02-24T09:13:05Z) - Helpful to a Fault: Measuring Illicit Assistance in Multi-Turn, Multilingual LLM Agents [35.76774274440008]
STING (Sequential Testing of Illicit N-step Goal execution) is an automated red-teaming framework. It constructs a step-by-step illicit plan grounded in a benign persona and iteratively probes a target agent with adaptive follow-ups. We introduce an analysis framework that models multi-turn red-teaming as a time-to-first-jailbreak random variable.
arXiv Detail & Related papers (2026-02-18T10:31:19Z) - Unsafer in Many Turns: Benchmarking and Defending Multi-Turn Safety Risks in Tool-Using Agents [68.20752678837377]
We propose a principled taxonomy that transforms single-turn harmful tasks into multi-turn attack sequences. Using this taxonomy, we construct MT-AgentRisk, the first benchmark to evaluate multi-turn tool-using agent safety. We propose ToolShield, a training-free, tool-agnostic, self-exploration defense.
arXiv Detail & Related papers (2026-02-13T18:38:18Z) - ARTIS: Agentic Risk-Aware Test-Time Scaling via Iterative Simulation [72.78362530982109]
ARTIS, Agentic Risk-Aware Test-Time Scaling via Iterative Simulation, is a framework that decouples exploration from commitment. We show that naive LLM-based simulators struggle to capture rare but high-impact failure modes. We introduce a risk-aware tool simulator that emphasizes fidelity on failure-inducing actions.
arXiv Detail & Related papers (2026-02-02T06:33:22Z) - The Why Behind the Action: Unveiling Internal Drivers via Agentic Attribution [63.61358761489141]
Large Language Model (LLM)-based agents are widely used in real-world applications such as customer service, web navigation, and software engineering. We propose a novel framework for general agentic attribution, designed to identify the internal factors driving agent actions regardless of the task outcome. We validate our framework across a diverse suite of agentic scenarios, including standard tool use and subtle reliability risks like memory-induced bias.
arXiv Detail & Related papers (2026-01-21T15:22:21Z) - Towards Compositional Generalization in LLMs for Smart Contract Security: A Case Study on Reentrancy Vulnerabilities [35.39583123277091]
This paper proposes a post-training algorithm based on atomic task decomposition and fusion. We decompose the reentrancy vulnerability detection task into four linearly independent atomic tasks. By training on synthetic datasets, we generate three compiler-verified datasets. We then employ the Slither tool to extract structural information from the control flow graph and data flow graph.
arXiv Detail & Related papers (2026-01-11T13:52:07Z) - ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration [68.89572566071575]
ET-Agent is a training framework for calibrating an agent's tool-use behavior. It is designed to progressively calibrate erroneous behavioral patterns toward optimal behaviors.
arXiv Detail & Related papers (2026-01-11T11:05:26Z) - Agentic Rubrics as Contextual Verifiers for SWE Agents [8.469998524915818]
We show that agentic rubrics provide an efficient, scalable, and granular verification signal for SWE agents. Results show that agentic rubrics are consistent with ground-truth tests while also flagging issues that tests do not capture.
arXiv Detail & Related papers (2026-01-07T18:38:23Z) - Metacognitive Self-Correction for Multi-Agent System via Prototype-Guided Next-Execution Reconstruction [58.51530390018909]
Large Language Model based multi-agent systems excel at collaborative problem solving but remain brittle to cascading errors. We present MASC, a metacognitive framework that endows MAS with real-time, unsupervised, step-level error detection and self-correction.
arXiv Detail & Related papers (2025-10-16T05:35:37Z) - Impatient Users Confuse AI Agents: High-fidelity Simulations of Human Traits for Testing Agents [58.00130492861884]
TraitBasis is a lightweight, model-agnostic method for systematically stress-testing AI agents. TraitBasis learns directions in activation space corresponding to steerable user traits. We observe on average a 2%-30% performance degradation on $\tau$-Trait across frontier models.
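The idea of learning a steerable trait direction in activation space can be sketched with a standard difference-of-means construction. This is a generic illustration of activation steering, not TraitBasis's actual training procedure; the layer choice, data, and steering strength are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def trait_direction(acts_with, acts_without):
    """Estimate a trait direction as the normalized difference of mean
    activations between trait-positive and trait-neutral examples
    (a common difference-of-means sketch, assumed here)."""
    d = acts_with.mean(axis=0) - acts_without.mean(axis=0)
    return d / np.linalg.norm(d)

def steer(hidden, direction, strength):
    """Dial the trait up by adding the direction to a hidden state."""
    return hidden + strength * direction

# Toy activations: trait-positive examples are the neutral ones shifted
# along one hidden axis, standing in for a real trait like impatience.
true_dir = np.zeros(8)
true_dir[3] = 1.0
neutral = rng.normal(size=(64, 8))
positive = neutral + 2.0 * true_dir

d = trait_direction(positive, neutral)
assert abs(d[3]) > 0.9  # recovered direction aligns with the true shift

steered = steer(np.zeros(8), d, strength=1.5)  # push a state along the trait
```

Steering with a learned direction like this is how such methods stress-test agents without retraining: the same model is probed under progressively stronger trait activations.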
arXiv Detail & Related papers (2025-10-06T05:03:57Z) - Federated Spatiotemporal Graph Learning for Passive Attack Detection in Smart Grids [2.721477719641864]
This paper introduces a graph-centric, multimodal detector that fuses physical-layer and behavioral indicators over temporal windows to detect passive attacks. The model achieves a testing accuracy of 98.32% per-timestep and 93.35% per-sequence at 0.15% FPR.
arXiv Detail & Related papers (2025-09-29T08:52:30Z) - Automatic Failure Attribution and Critical Step Prediction Method for Multi-Agent Systems Based on Causal Inference [8.823529310904162]
Multi-agent systems (MAS) are critical for automating complex tasks, yet their practical deployment is hampered by the challenge of failure attribution. We introduce the first failure attribution framework for MAS grounded in multi-granularity causal inference.
arXiv Detail & Related papers (2025-09-10T15:22:00Z) - Dissecting Adversarial Robustness of Multimodal LM Agents [70.2077308846307]
We manually create 200 targeted adversarial tasks and evaluation scripts in a realistic threat model on top of VisualWebArena. We find that we can successfully break the latest agents that use black-box frontier LMs, including those that perform reflection and tree search. We also use ARE to rigorously evaluate how the robustness changes as new components are added.
arXiv Detail & Related papers (2024-06-18T17:32:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.