Extracting Conceptual Knowledge to Locate Software Issues
- URL: http://arxiv.org/abs/2509.21427v2
- Date: Sat, 04 Oct 2025 15:20:21 GMT
- Title: Extracting Conceptual Knowledge to Locate Software Issues
- Authors: Ying Wang, Wenjun Mao, Chong Wang, Zhenhao Zhou, Yicheng Zhou, Wenyun Zhao, Yiling Lou, Xin Peng,
- Abstract summary: RepoLens is a novel approach that abstracts and leverages conceptual knowledge from code repositories.<n>It operates in two stages: an offline stage that extracts conceptual knowledge into a repository-wide knowledge base, and an online stage that retrieves issue-specific terms.<n>RepoLens consistently improves three state-of-the-art tools, achieving average gains of over 22% in Hit@k and 46% in Recall@k for file- and function-level localization.
- Score: 12.746044344302623
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Issue localization, which identifies faulty code elements such as files or functions, is critical for effective bug fixing. While recent LLM-based and LLM-agent-based approaches improve accuracy, they struggle in large-scale repositories due to concern tangling, where relevant logic is buried in large functions, and concern scattering, where related logic is dispersed across files. To address these challenges, we propose RepoLens, a novel approach that abstracts and leverages conceptual knowledge from code repositories. RepoLens decomposes fine-grained functionalities and recomposes them into high-level concerns, semantically coherent clusters of functionalities that guide LLMs. It operates in two stages: an offline stage that extracts and enriches conceptual knowledge into a repository-wide knowledge base, and an online stage that retrieves issue-specific terms, clusters and ranks concerns by relevance, and integrates them into localization workflows via minimally intrusive prompt enhancements. We evaluate RepoLens on SWE-Lancer-Loc, a benchmark of 216 tasks derived from SWE-Lancer. RepoLens consistently improves three state-of-the-art tools, namely AgentLess, OpenHands, and mini-SWE-agent, achieving average gains of over 22% in Hit@k and 46% in Recall@k for file- and function-level localization. It generalizes across models (GPT-4o, GPT-4o-mini, GPT-4.1) with Hit@1 and Recall@10 gains up to 504% and 376%, respectively. Ablation studies and manual evaluation confirm the effectiveness and reliability of the constructed concerns.
Related papers
- LIDL: LLM Integration Defect Localization via Knowledge Graph-Enhanced Multi-Agent Analysis [16.217842423570055]
We propose LIDL, a multi-agent framework for defect localization in large language models-integrated software.<n>We evaluate LIDL on 146 real-world defect instances collected from 105 GitHub repositories and 16 agent-based systems.
arXiv Detail & Related papers (2026-01-09T05:47:59Z) - Eigen-1: Adaptive Multi-Agent Refinement with Monitor-Based RAG for Scientific Reasoning [53.45095336430027]
We develop a unified framework that combines implicit retrieval and structured collaboration.<n>On Humanity's Last Exam (HLE) Bio/Chem Gold, our framework achieves 48.3% accuracy.<n>Results on SuperGPQA and TRQA confirm robustness across domains.
arXiv Detail & Related papers (2025-09-25T14:05:55Z) - Enhancing LLM-based Fault Localization with a Functionality-Aware Retrieval-Augmented Generation Framework [14.287359838639608]
FaR-Loc is a framework that enhances method-level fault localization.<n> FaR-Loc consists of three key components: LLM Functionality Extraction, Semantic Retrieval, and LLM Re-ranking.<n>Our experiments on the widely used Defects4J benchmark show that FaR-Loc outperforms state-of-the-art LLM-based baselines.
arXiv Detail & Related papers (2025-09-24T20:37:11Z) - Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering [75.12322966980003]
Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains.<n>Most existing RAG pipelines rely on unstructured text, limiting interpretability and structured reasoning.<n>Recent studies have explored integrating knowledge graphs with LLMs for knowledge graph question answering.<n>We propose RAPL, a novel framework for efficient and effective graph retrieval in KGQA.
arXiv Detail & Related papers (2025-06-11T12:03:52Z) - Document Attribution: Examining Citation Relationships using Large Language Models [62.46146670035751]
We propose a zero-shot approach that frames attribution as a straightforward textual entailment task.<n>We also explore the role of the attention mechanism in enhancing the attribution process.
arXiv Detail & Related papers (2025-05-09T04:40:11Z) - SweRank: Software Issue Localization with Code Ranking [109.3289316191729]
SweRank is an efficient retrieve-and-rerank framework for software issue localization.<n>We construct SweLoc, a large-scale dataset curated from public GitHub repositories.<n>We show that SweRank achieves state-of-the-art performance, outperforming both prior ranking models and costly agent-based systems.
arXiv Detail & Related papers (2025-05-07T19:44:09Z) - CoSIL: Issue Localization via LLM-Driven Code Graph Searching [10.396193740828032]
This paper introduces CoSIL, an LLM-driven, powerful function-level issue localization method without training or indexing.<n>Experiment results demonstrate that CoSIL achieves a Top-1 localization accuracy of 43.3% and 44.6% on SWE-bench Lite and SWE-bench Verified.<n>When CoSIL is integrated into an issue-solving method, Agentless, the issue resolution rate improves by 2.98%--30.5%.
arXiv Detail & Related papers (2025-03-28T13:36:26Z) - KGCompass: Knowledge Graph Enhanced Repository-Level Software Repair [13.747293341707563]
Repository-level software repair faces challenges in bridging semantic gaps between issue descriptions and code patches.<n>Existing approaches, which rely on large language models (LLMs), are hindered by semantic ambiguities, limited understanding of structural context, and insufficient reasoning capabilities.<n>We propose a novel repository-aware knowledge graph (KG) that accurately links repository artifacts (issues and pull requests) and entities (files, classes, and functions)<n>A path-guided repair mechanism that leverages KG-mined paths, tracing through which allows us to augment contextual information along with explanations.
arXiv Detail & Related papers (2025-03-27T17:21:47Z) - A Multi-Agent Approach to Fault Localization via Graph-Based Retrieval and Reflexion [8.22737389683156]
Traditional fault localization techniques require extensive training datasets and high computational resources.<n>Recent advances in Large Language Models (LLMs) offer new opportunities by enhancing code understanding and reasoning.<n>We propose LLM4FL, a multi-agent fault localization framework that utilizes three specialized LLM agents.<n> evaluated on the Defects4J benchmark, which includes 675 faults from 14 Java projects, LLM4FL achieves an 18.55% improvement in Top-1 accuracy over AutoFL and 4.82% over SoapFL.
arXiv Detail & Related papers (2024-09-20T16:47:34Z) - Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z) - ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code [76.84199699772903]
ML-Bench is a benchmark rooted in real-world programming applications that leverage existing code repositories to perform tasks.
To evaluate both Large Language Models (LLMs) and AI agents, two setups are employed: ML-LLM-Bench for assessing LLMs' text-to-code conversion within a predefined deployment environment, and ML-Agent-Bench for testing autonomous agents in an end-to-end task execution within a Linux sandbox environment.
arXiv Detail & Related papers (2023-11-16T12:03:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.