Related papers: Autonomous Issue Resolver: Towards Zero-Touch Code Maintenance

Autonomous Issue Resolver: Towards Zero-Touch Code Maintenance

URL: http://arxiv.org/abs/2512.08492v1
Date: Tue, 09 Dec 2025 11:11:37 GMT
Title: Autonomous Issue Resolver: Towards Zero-Touch Code Maintenance
Authors: Aliaksei Kaliutau,
Abstract summary: We propose a paradigm shift from the standard Code Property Graphs to the concept of Data Transformation Graph (DTG)<n>We introduce a multi-agent framework that reconciles data integrity navigation with control flow logic.<n>Our approach has demonstrated good results on several SWE benchmarks, reaching a resolution rate of 87.1%.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Recent advances in Large Language Models have revolutionized function-level code generation; however, repository-scale Automated Program Repair (APR) remains a significant challenge. Current approaches typically employ a control-centric paradigm, forcing agents to navigate complex directory structures and irrelevant control logic. In this paper, we propose a paradigm shift from the standard Code Property Graphs (CPGs) to the concept of Data Transformation Graph (DTG) that inverts the topology by modeling data states as nodes and functions as edges, enabling agents to trace logic defects through data lineage rather than control flow. We introduce a multi-agent framework that reconciles data integrity navigation with control flow logic. Our theoretical analysis and case studies demonstrate that this approach resolves the "Semantic Trap" inherent in standard RAG systems in modern coding agents. We provide a comprehensive implementation in the form of Autonomous Issue Resolver (AIR), a self-improvement system for zero-touch code maintenance that utilizes neuro-symbolic reasoning and uses the DTG structure for scalable logic repair. Our approach has demonstrated good results on several SWE benchmarks, reaching a resolution rate of 87.1% on SWE-Verified benchmark. Our approach directly addresses the core limitations of current AI code-assistant tools and tackles the critical need for a more robust foundation for our increasingly software-dependent world.

Related papers

Beyond Quantity: Trajectory Diversity Scaling for Code Agents [51.71414642763219]
Trajectory Diversity Scaling is a data synthesis framework for code agents that scales performance through diversity rather than raw volume.<n> TDScaling integrates four innovations: (1) a Business Cluster mechanism that captures real-service logical dependencies; (2) a blueprint-driven multi-agent paradigm that enforces trajectory coherence; and (3) an adaptive evolution mechanism that steers toward long-tail scenarios.
arXiv Detail & Related papers (2026-02-03T07:43:03Z)
AI-NativeBench: An Open-Source White-Box Agentic Benchmark Suite for AI-Native Systems [52.65695508605237]
We introduce AI-NativeBench, the first application-centric and white-box AI-Native benchmark suite grounded in Model Context Protocol (MCP) and Agent-to-Agent (A2A) standards.<n>By treating agentic spans as first-class citizens within distributed traces, our methodology enables granular analysis of engineering characteristics beyond simple capabilities.<n>This work provides the first systematic evidence to guide the transition from measuring model capability to engineering reliable AI-Native systems.
arXiv Detail & Related papers (2026-01-14T11:32:07Z)
SALVE: Sparse Autoencoder-Latent Vector Editing for Mechanistic Control of Neural Networks [0.0]
We present SALVE, a framework that bridges mechanistic interpretability and model editing.<n>We learn a sparse, model-native feature basis without supervision.<n>We validate these features with Grad-FAM, a feature-level saliency mapping method.
arXiv Detail & Related papers (2025-12-17T20:06:03Z)
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence [150.3696990310269]
Large language models (LLMs) have transformed automated software development by enabling direct translation of natural language descriptions into functional code.<n>We provide a comprehensive synthesis and practical guide (a series of analytic and probing experiments) about code LLMs.<n>We analyze the code capability of the general LLMs (GPT-4, Claude, LLaMA) and code-specialized LLMs (StarCoder, Code LLaMA, DeepSeek-Coder, and QwenCoder)
arXiv Detail & Related papers (2025-11-23T17:09:34Z)
Metacognitive Self-Correction for Multi-Agent System via Prototype-Guided Next-Execution Reconstruction [58.51530390018909]
Large Language Model based multi-agent systems excel at collaborative problem solving but remain brittle to cascading errors.<n>We present MASC, a metacognitive framework that endows MAS with real-time, unsupervised, step-level error detection and self-correction.
arXiv Detail & Related papers (2025-10-16T05:35:37Z)
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning [65.20602712957725]
Caco is a novel framework that automates the synthesis of high-quality, verifiable, and diverse instruction-CoT reasoning data.<n>Our work establishes a paradigm for building self-sustaining, trustworthy reasoning systems without human intervention.
arXiv Detail & Related papers (2025-10-05T07:59:24Z)
RationAnomaly: Log Anomaly Detection with Rationality via Chain-of-Thought and Reinforcement Learning [27.235259453535537]
RationAnomaly is a novel framework that enhances log anomaly detection by synergizing Chain-of-Thought fine-tuning with reinforcement learning.<n>We have released the corresponding resources, including code and datasets.
arXiv Detail & Related papers (2025-09-18T07:35:58Z)
VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation [9.07044866283158]
We introduce VeriReason, a framework integrating supervised fine-tuning with Guided Reward Proximal Optimization (GRPO) reinforcement learning for RTL generation.<n>On the VerilogEval Benchmark, VeriReason delivers 83.1% functional correctness, substantially outperforming both comparable-sized models and much larger commercial systems like GPT-4 Turbo.<n>VeriReason represents the first system to successfully integrate explicit reasoning capabilities with reinforcement learning for Verilog generation, establishing a new state-of-the-art for automated RTL synthesis.
arXiv Detail & Related papers (2025-05-17T05:25:01Z)
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute scaling framework that leverages increased inference-time instead of larger models.<n>Our framework incorporates two complementary strategies: internal TTC and external TTC.<n>We demonstrate our textbf32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z)
PID Control-Based Self-Healing to Improve the Robustness of Large Language Models [23.418411870842178]
Minor perturbations can significantly reduce the performance of well-trained language models. We construct a computationally efficient self-healing process to correct undesired model behavior. The proposed PID control-based self-healing is a low cost framework that improves the robustness of pre-trained large language models.
arXiv Detail & Related papers (2024-03-31T23:46:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.