CatchAll: Repository-Aware Exception Handling with Knowledge-Guided LLMs
- URL: http://arxiv.org/abs/2601.01271v1
- Date: Sat, 03 Jan 2026 20:03:03 GMT
- Title: CatchAll: Repository-Aware Exception Handling with Knowledge-Guided LLMs
- Authors: Qingxiao Tao, Xiaodong Gu, Hao Zhong, Beijun Shen
- Abstract summary: Exception handling is a vital forward error-recovery mechanism in many programming languages. We propose CatchAll, a novel approach for repository-aware exception handling. To evaluate CatchAll, we construct two new benchmarks for repository-aware exception handling.
- Score: 11.461605017230424
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Exception handling is a vital forward error-recovery mechanism in many programming languages, enabling developers to manage runtime anomalies through structured constructs (e.g., try-catch blocks). Improper or missing exception handling often leads to severe consequences, including system crashes and resource leaks. While large language models (LLMs) have demonstrated strong capabilities in code generation, they struggle with exception handling at the repository level, due to complex dependencies and contextual constraints. In this work, we propose CatchAll, a novel LLM-based approach for repository-aware exception handling. CatchAll equips LLMs with three complementary layers of exception-handling knowledge: (1) API-level exception knowledge, obtained from an empirically constructed API-exception mapping that characterizes the exception-throwing behaviors of APIs in real-world codebases; (2) repository-level execution context, which captures exception propagation by modeling contextual call traces around the target code; and (3) cross-repository handling knowledge, distilled from reusable exception-handling patterns mined from historical code across projects. The knowledge is encoded into structured prompts to guide the LLM in generating accurate and context-aware exception-handling code. To evaluate CatchAll, we construct two new benchmarks for repository-aware exception handling: a large-scale dataset RepoExEval and an executable subset RepoExEval-Exec. Experiments demonstrate that CatchAll consistently outperforms state-of-the-art baselines, achieving a CodeBLEU score of 0.31 (vs. 0.27 for the best baseline), intent prediction accuracy of 60.1% (vs. 48.0%), and Pass@1 of 29% (vs. 25%). These results affirm CatchAll's effectiveness in real-world repository-level exception handling.
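The abstract describes encoding three layers of knowledge into structured prompts. As a rough illustration only, the sketch below shows one plausible way such a prompt could be assembled; the function name, section headings, and data shapes are all assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of assembling CatchAll-style knowledge layers into
# one structured prompt. All names and the prompt format are assumptions.

def build_prompt(target_code, api_exceptions, call_trace, handling_patterns):
    """Combine the three knowledge layers into a single structured prompt."""
    sections = [
        "## Target code (add exception handling):",
        target_code,
        "## API-level exception knowledge:",
        *(f"- {api} may throw {exc}" for api, exc in api_exceptions),
        "## Repository-level execution context (call trace):",
        *(f"- {frame}" for frame in call_trace),
        "## Cross-repository handling patterns:",
        *(f"- {pattern}" for pattern in handling_patterns),
    ]
    return "\n".join(sections)

prompt = build_prompt(
    target_code="data = json.load(open(path))",
    api_exceptions=[("open", "FileNotFoundError"),
                    ("json.load", "JSONDecodeError")],
    call_trace=["main() -> load_config() -> <target>"],
    handling_patterns=["log and re-raise with context",
                       "fall back to default config"],
)
print(prompt)
```

The point of the layered structure is that each section answers a different question the LLM cannot infer from the target snippet alone: what can fail, where the code sits in the repository's call graph, and how similar failures were handled elsewhere.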
Related papers
- External Data Extraction Attacks against Retrieval-Augmented Large Language Models [70.47869786522782]
RAG has emerged as a key paradigm for enhancing large language models (LLMs). RAG introduces new risks of external data extraction attacks (EDEAs), where sensitive or copyrighted data in its knowledge base may be extracted verbatim. We present the first comprehensive study to formalize EDEAs against retrieval-augmented LLMs.
arXiv Detail & Related papers (2025-10-03T12:53:45Z) - Where LLM Agents Fail and How They can Learn From Failures [62.196870049524364]
Large Language Model (LLM) agents have shown promise in solving complex, multi-step tasks. They amplify vulnerability to cascading failures, where a single root-cause error propagates through subsequent decisions. Current systems lack a framework that can comprehensively understand agent error in a modular and systemic way. We introduce the AgentErrorTaxonomy, a modular classification of failure modes spanning memory, reflection, planning, action, and system-level operations.
arXiv Detail & Related papers (2025-09-29T18:20:27Z) - RepoDebug: Repository-Level Multi-Task and Multi-Language Debugging Evaluation of Large Language Models [49.83481415540291]
Large Language Models (LLMs) have exhibited significant proficiency in code debugging. This paper introduces RepoDebug, a multi-task and multi-language repository-level code debugging dataset. We conduct evaluation experiments on 10 LLMs, where Claude 3.5 Sonnet, the best-performing model, still cannot perform well in repository-level debugging.
arXiv Detail & Related papers (2025-09-04T10:13:21Z) - SHIELDA: Structured Handling of Exceptions in LLM-Driven Agentic Workflows [12.727172180194653]
Large Language Model (LLM) agentic systems are software systems powered by LLMs that autonomously reason, plan, and execute multi-step processes. Existing exception handling solutions often treat exceptions superficially, failing to trace execution-phase exceptions to their root causes. We present SHIELDA, a modular exception handling framework for LLM agentic runtimes.
arXiv Detail & Related papers (2025-08-11T12:50:46Z) - MRG-Bench: Evaluating and Exploring the Requirements of Context for Repository-Level Code Generation [0.7342677574855649]
We introduce MRG-Bench, a novel dataset that provides a more accurate evaluation of large language models. We conduct experiments including large language models, long-context models, and RAG-related methods. Results show that the majority of methods suffer from "difficulty in understanding user requirements," failing to comprehend their assigned tasks accurately.
arXiv Detail & Related papers (2025-08-05T01:53:45Z) - Beyond Isolated Dots: Benchmarking Structured Table Construction as Deep Knowledge Extraction [80.88654868264645]
The Arranged and Organized Extraction (AOE) Benchmark is designed to evaluate the ability of large language models to comprehend fragmented documents. AOE includes 11 carefully crafted tasks across three diverse domains, requiring models to generate context-specific schema tailored to varied input queries. Results show that even the most advanced models struggled significantly.
arXiv Detail & Related papers (2025-07-22T06:37:51Z) - Seeker: Towards Exception Safety Code Generation with Intermediate Language Agents Framework [58.36391985790157]
In real-world software development, improper or missing exception handling can severely impact the robustness and reliability of code. We explore the use of large language models (LLMs) to improve exception handling in code. We propose Seeker, a multi-agent framework inspired by expert developer strategies for exception handling.
arXiv Detail & Related papers (2024-12-16T12:35:29Z) - Towards Exception Safety Code Generation with Intermediate Representation Agents Framework [54.03528377384397]
Large Language Models (LLMs) often struggle with robust exception handling in generated code, leading to fragile programs that are prone to runtime errors. We propose Seeker, a novel multi-agent framework that enforces exception safety in LLM-generated code through an Intermediate Representation (IR) approach. Seeker decomposes exception handling into five specialized agents: Scanner, Detector, Predator, Ranker, and Handler.
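The five-agent decomposition above can be pictured as a simple pipeline. The toy sketch below is a guess at the agents' responsibilities inferred from their names only; the real Seeker agents are LLM-backed, and none of the placeholder logic here comes from the paper.

```python
# Toy sketch of a Seeker-style five-stage pipeline. Agent roles and all
# logic are assumptions inferred from the agent names, not the paper's
# actual design.

def scanner(code: str) -> list[str]:
    """Flag statements that look fragile (toy heuristic: file I/O)."""
    return [ln.strip() for ln in code.splitlines() if "open(" in ln]

def detector(stmts: list[str]) -> dict[str, list[str]]:
    """Associate each fragile statement with exceptions it may raise."""
    return {s: ["FileNotFoundError", "PermissionError"] for s in stmts}

def predator(detections: dict[str, list[str]]) -> dict[str, list[str]]:
    """Retrieve candidate handling strategies per statement."""
    return {s: ["log-and-reraise", "return-default"] for s in detections}

def ranker(candidates: dict[str, list[str]]) -> dict[str, str]:
    """Pick the top-ranked strategy (toy: take the first candidate)."""
    return {s: strategies[0] for s, strategies in candidates.items()}

def handler(code: str, plan: dict[str, str]) -> str:
    """Wrap each planned statement in a try/except block."""
    out = []
    for ln in code.splitlines():
        if ln.strip() in plan:
            out += ["try:",
                    f"    {ln.strip()}",
                    "except OSError as e:",
                    f"    raise RuntimeError('I/O failed') from e  # {plan[ln.strip()]}"]
        else:
            out.append(ln)
    return "\n".join(out)

source = "data = open('config.json').read()"
print(handler(source, ranker(predator(detector(scanner(source))))))
```

The staged hand-off is the interesting design choice: each agent narrows the problem (locate, diagnose, propose, select, rewrite) so no single LLM call has to do everything at once.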
arXiv Detail & Related papers (2024-10-09T14:45:45Z) - From Misuse to Mastery: Enhancing Code Generation with Knowledge-Driven AI Chaining [16.749379740049925]
Large Language Models (LLMs) have shown promising results in automatic code generation by improving coding efficiency to a certain extent.
However, generating high-quality and reliable code remains a formidable task because of LLMs' lack of good programming practice.
We propose a novel Knowledge-driven Prompt Chaining-based code generation approach, which decomposes code generation into an AI chain with iterative check-rewrite steps.
arXiv Detail & Related papers (2023-09-27T12:09:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.