SHIELDA: Structured Handling of Exceptions in LLM-Driven Agentic Workflows
- URL: http://arxiv.org/abs/2508.07935v1
- Date: Mon, 11 Aug 2025 12:50:46 GMT
- Title: SHIELDA: Structured Handling of Exceptions in LLM-Driven Agentic Workflows
- Authors: Jingwen Zhou, Jieshan Chen, Qinghua Lu, Dehai Zhao, Liming Zhu
- Abstract summary: Large Language Model (LLM) agentic systems are software systems powered by LLMs that autonomously reason, plan, and execute multi-step processes. Existing exception handling solutions often treat exceptions superficially, failing to trace execution-phase exceptions to their root causes. We present SHIELDA, a modular exception handling framework for LLM agentic runtimes.
- Score: 12.727172180194653
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Model (LLM) agentic systems are software systems powered by LLMs that autonomously reason, plan, and execute multi-step workflows to achieve human goals, rather than merely executing predefined steps. During execution, these workflows frequently encounter exceptions. Existing exception handling solutions often treat exceptions superficially, failing to trace execution-phase exceptions to their reasoning-phase root causes. Furthermore, their recovery logic is brittle, lacking structured escalation pathways when initial attempts fail. To tackle these challenges, we first present a comprehensive taxonomy of 36 exception types across 12 agent artifacts. Building on this, we propose SHIELDA (Structured Handling of Exceptions in LLM-Driven Agentic Workflows), a modular runtime exception handling framework for LLM agentic workflows. SHIELDA uses an exception classifier to select a predefined exception handling pattern from a handling pattern registry. These patterns are then executed via a structured handling executor, comprising local handling, flow control, and state recovery, to enable phase-aware recovery by linking exceptions to their root causes and facilitating composable strategies. We validate SHIELDA's effectiveness through a case study on the AutoPR agent, demonstrating effective, cross-phase recovery from a reasoning-induced exception.
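The abstract's pipeline (exception classifier, pattern registry, structured executor with local handling, flow control, and state recovery) can be sketched as follows. This is a minimal illustration of the described architecture, not SHIELDA's actual API; all class, pattern, and strategy names here are assumptions.

```python
# Illustrative sketch of a SHIELDA-style handling loop. Names are invented.
from dataclasses import dataclass

@dataclass
class AgentException:
    phase: str      # e.g. "reasoning" or "execution"
    artifact: str   # e.g. "plan" or "tool_call"
    message: str

# Handling pattern registry: maps a classified exception to a composable
# strategy of local handling, flow control, and state recovery steps.
PATTERN_REGISTRY: dict[tuple[str, str], dict[str, str]] = {
    ("execution", "tool_call"): {
        "local_handling": "retry_with_backoff",
        "flow_control": "resume_current_step",
        "state_recovery": "none",
    },
    ("reasoning", "plan"): {
        "local_handling": "reprompt_planner",
        "flow_control": "rollback_to_planning",
        "state_recovery": "restore_pre_plan_state",
    },
}

def classify(exc: AgentException) -> tuple[str, str]:
    """Exception classifier: keys the registry on (phase, artifact)."""
    return (exc.phase, exc.artifact)

def handle(exc: AgentException) -> dict[str, str]:
    """Select a predefined pattern; escalate when no pattern matches."""
    pattern = PATTERN_REGISTRY.get(classify(exc))
    if pattern is None:
        # Structured escalation pathway instead of brittle ad-hoc retries.
        return {"local_handling": "escalate_to_human",
                "flow_control": "halt",
                "state_recovery": "checkpoint"}
    return pattern

# A failure whose root cause lies in the reasoning phase is routed to a
# cross-phase recovery that rolls execution back to planning.
plan = handle(AgentException("reasoning", "plan", "tool args inconsistent with goal"))
```

The point of the registry keyed on (phase, artifact) is that an execution-time symptom can be mapped back to a reasoning-phase root cause and recovered across phases, which is the cross-phase behavior the AutoPR case study demonstrates.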
Related papers
- AgentRx: Diagnosing AI Agent Failures from Execution Trajectories [9.61742219198197]
We release a benchmark of 115 failed trajectories spanning structured API, incident management, and open-ended web/file tasks. Each trajectory is annotated with a critical failure step and a category from a grounded-theory-derived, cross-domain failure taxonomy. We present AGENTRX, an automated, domain-agnostic diagnostic framework that pinpoints the critical failure step in a failed agent trajectory.
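The core task in the summary above, pinpointing the critical failure step in a trajectory, can be illustrated with a toy heuristic. This is not AGENTRX's actual algorithm; the step format and error markers are assumptions for illustration.

```python
# Toy critical-step localizer: flag the earliest step whose observation
# signals an error. Real diagnosis would use the paper's taxonomy and
# an LLM judge, not string matching.
def pinpoint_critical_step(trajectory: list[dict]) -> int:
    """Return the index of the first step whose observation looks like a failure."""
    error_markers = ("error", "exception", "not found", "denied")
    for i, step in enumerate(trajectory):
        obs = step.get("observation", "").lower()
        if any(m in obs for m in error_markers):
            return i
    return len(trajectory) - 1  # fall back to the final step

traj = [
    {"action": "search", "observation": "3 hits"},
    {"action": "open", "observation": "HTTP 404: not found"},
    {"action": "answer", "observation": "gave wrong answer"},
]
```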
arXiv Detail & Related papers (2026-02-02T18:54:07Z)
- BackdoorAgent: A Unified Framework for Backdoor Attacks on LLM-based Agents [58.83028403414688]
Large language model (LLM) agents execute tasks through multi-step workflows that combine planning, memory, and tool use. Backdoor triggers injected into specific stages of an agent workflow can persist through multiple intermediate states and adversely influence downstream outputs. We propose BackdoorAgent, a modular and stage-aware framework that provides a unified agent-centric view of backdoor threats in LLM agents.
arXiv Detail & Related papers (2026-01-08T03:49:39Z)
- CatchAll: Repository-Aware Exception Handling with Knowledge-Guided LLMs [11.461605017230424]
Exception handling is a vital forward error-recovery mechanism in many programming languages. We propose CatchAll, a novel approach for repository-aware exception handling. To evaluate CatchAll, we construct two new benchmarks for repository-aware exception handling.
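"Repository-aware" here means the handler is guided by knowledge mined from the codebase rather than generic patterns. A minimal sketch of one such knowledge source, collecting the custom exception types a repository defines so generated handlers can catch them instead of a bare `Exception`, is below; this is an illustration of the idea, not CatchAll's actual method.

```python
# Mine a module's source for custom exception classes (illustrative).
import ast

def repo_exception_types(source: str) -> list[str]:
    """Collect class names that appear to subclass an exception type."""
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            bases = [b.id for b in node.bases if isinstance(b, ast.Name)]
            if any(b == "Exception" or b.endswith("Error") for b in bases):
                found.append(node.name)
    return found

module = """
class ConfigError(Exception): pass
class RetryableError(ConfigError): pass
"""
```

This kind of index could then be fed to the LLM as context so the generated `except` clauses name the repository's own exception hierarchy.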
arXiv Detail & Related papers (2026-01-03T20:03:03Z)
- Self-Abstraction from Grounded Experience for Plan-Guided Policy Refinement [61.35824395228412]
Large language model (LLM) based agents are increasingly used to tackle software engineering tasks. We propose Self-Abstraction from Grounded Experience (SAGE), a framework that enables agents to learn from their own task executions.
arXiv Detail & Related papers (2025-11-08T08:49:38Z)
- TraceAegis: Securing LLM-Based Agents via Hierarchical and Behavioral Anomaly Detection [31.243042511018675]
We propose TraceAegis, a provenance-based analysis framework that leverages agent execution traces to detect potential anomalies. By validating execution traces against both hierarchical and behavioral constraints, TraceAegis is able to effectively detect abnormal behaviors.
arXiv Detail & Related papers (2025-10-13T09:35:06Z)
- Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems [50.29939179830491]
Failure attribution in LLM multi-agent systems remains underexplored and labor-intensive. We develop and evaluate three automated failure attribution methods, summarizing their pros and cons. The best method achieves 53.5% accuracy in identifying failure-responsible agents but only 14.2% in pinpointing failure steps.
arXiv Detail & Related papers (2025-04-30T23:09:44Z)
- UDora: A Unified Red Teaming Framework against LLM Agents by Dynamically Hijacking Their Own Reasoning [17.448966928905733]
Large Language Model (LLM) agents equipped with external tools have become increasingly powerful for complex tasks. We present UDora, a unified red teaming framework designed for LLM agents that dynamically hijacks the agent's reasoning processes to compel malicious behavior.
arXiv Detail & Related papers (2025-02-28T21:30:28Z)
- FlowAgent: Achieving Compliance and Flexibility for Workflow Agents [31.088578094151178]
FlowAgent is a novel agent framework designed to maintain both compliance and flexibility. Building on PDL, we develop a comprehensive framework that empowers LLMs to manage OOW queries effectively. We present a new evaluation methodology to rigorously assess an LLM agent's ability to handle OOW scenarios.
arXiv Detail & Related papers (2025-02-20T07:59:31Z)
- Seeker: Towards Exception Safety Code Generation with Intermediate Language Agents Framework [58.36391985790157]
In real-world software development, improper or missing exception handling can severely impact the robustness and reliability of code. We explore the use of large language models (LLMs) to improve exception handling in code. We propose Seeker, a multi-agent framework inspired by expert developer strategies for exception handling.
arXiv Detail & Related papers (2024-12-16T12:35:29Z)
- Towards Exception Safety Code Generation with Intermediate Representation Agents Framework [54.03528377384397]
Large Language Models (LLMs) often struggle with robust exception handling in generated code, leading to fragile programs that are prone to runtime errors. We propose Seeker, a novel multi-agent framework that enforces exception safety in LLM-generated code through an Intermediate Representation (IR) approach. Seeker decomposes exception handling into five specialized agents: Scanner, Detector, Predator, Ranker, and Handler.
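The five-agent decomposition named above can be sketched as a simple pipeline. Each stage below is a plain function standing in for an agent, with invented heuristics; it illustrates the staged division of labor, not Seeker's actual implementation.

```python
# Toy Scanner -> Detector -> Predator -> Ranker -> Handler pipeline.
def scanner(code: str) -> list[str]:
    """Scanner: find statements that can plausibly raise (naive heuristic)."""
    risky = ("open(", "int(", "json.loads(")
    return [ln for ln in code.splitlines() if any(r in ln for r in risky)]

def detector(lines: list[str]):
    """Detector: mark each risky site as unhandled (no surrounding try)."""
    return [(ln, "unhandled") for ln in lines]

def predator(flagged):
    """Predator: predict which exception types each site may raise."""
    return [(ln, ["ValueError", "OSError"]) for ln, _ in flagged]

def ranker(predicted):
    """Ranker: order sites by predicted risk (here, number of types)."""
    return sorted(predicted, key=lambda p: len(p[1]), reverse=True)

def handler(ranked) -> list[str]:
    """Handler: wrap each site in a try/except for the predicted types."""
    return [f"try:\n    {ln.strip()}\nexcept ({', '.join(types)}):\n    pass"
            for ln, types in ranked]

code = "data = open('cfg.txt').read()\nn = int(data)"
patched = handler(ranker(predator(detector(scanner(code)))))
```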
arXiv Detail & Related papers (2024-10-09T14:45:45Z)
- LASER: LLM Agent with State-Space Exploration for Web Navigation [57.802977310392755]
Large language models (LLMs) have been successfully adapted for interactive decision-making tasks like web navigation.
Previous methods implicitly assume a forward-only execution mode for the model, where they only provide oracle trajectories as in-context examples.
We propose to model the interactive task as state space exploration, where the LLM agent transitions among a pre-defined set of states by performing actions to complete the task.
arXiv Detail & Related papers (2023-09-15T05:44:08Z)
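The state-space framing in the LASER summary, where the agent occupies one of a predefined set of states and actions move it along allowed transitions rather than a forward-only trace, can be sketched as a tiny transition table. States and actions below are made up for illustration and are not the paper's actual state set.

```python
# Toy state-space navigation: transitions permit backtracking, unlike a
# forward-only execution mode.
STATES = {"search", "results", "item", "done"}
TRANSITIONS = {
    ("search", "submit_query"): "results",
    ("results", "open_item"): "item",
    ("item", "back"): "results",   # backtracking to a previous state
    ("item", "finish"): "done",
}

def step(state: str, action: str) -> str:
    """Apply an action; stay in place if the transition is not defined."""
    return TRANSITIONS.get((state, action), state)

# An agent run that opens an item, backs out, and retries before finishing.
state = "search"
for action in ["submit_query", "open_item", "back", "open_item", "finish"]:
    state = step(state, action)
```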
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.