Empirical Research on Utilizing LLM-based Agents for Automated Bug Fixing via LangGraph
- URL: http://arxiv.org/abs/2502.18465v1
- Date: Wed, 29 Jan 2025 12:01:00 GMT
- Title: Empirical Research on Utilizing LLM-based Agents for Automated Bug Fixing via LangGraph
- Authors: Jialin Wang, Zhihua Duan
- Abstract summary: The proposed system integrates three core components, LangGraph, GLM4 Flash, and ChromaDB, within a four-step iterative workflow to deliver robust performance and seamless functionality. LangGraph serves as a graph-based library for orchestrating tasks, providing precise control and execution while maintaining a unified state object for dynamic updates and consistency. GLM4 Flash, a large language model, leverages its advanced capabilities in natural language understanding, contextual reasoning, and multilingual support to generate accurate code snippets based on user prompts.
- Score: 1.4582633500696451
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel framework for automated code generation and debugging, designed to improve accuracy, efficiency, and scalability in software development. The proposed system integrates three core components, LangGraph, GLM4 Flash, and ChromaDB, within a four-step iterative workflow to deliver robust performance and seamless functionality. LangGraph serves as a graph-based library for orchestrating tasks, providing precise control and execution while maintaining a unified state object for dynamic updates and consistency. It supports multi-agent, hierarchical, and sequential processes, making it highly adaptable to complex software engineering workflows. GLM4 Flash, a large language model, leverages its advanced capabilities in natural language understanding, contextual reasoning, and multilingual support to generate accurate code snippets based on user prompts. ChromaDB acts as a vector database for semantic search and contextual memory storage, enabling the identification of patterns and the generation of context-aware bug fixes based on historical data. The system operates through a structured four-step process: (1) Code Generation, which translates natural language descriptions into executable code; (2) Code Execution, which validates the code by identifying runtime errors and inconsistencies; (3) Code Repair, which iteratively refines buggy code using ChromaDB's memory capabilities and LangGraph's state tracking; and (4) Code Update, which ensures the code meets functional and performance requirements through iterative modifications.
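The four-step loop in the abstract maps naturally onto a LangGraph state machine. Below is a minimal sketch of that shape, not the authors' implementation: call_glm4_flash is a hypothetical stand-in for the GLM4 Flash API (here it returns canned code so the sketch runs end to end), and the three-attempt repair budget is an assumption.

```python
from typing import TypedDict
import chromadb
from langgraph.graph import StateGraph, END

class FixState(TypedDict):
    task: str      # natural-language description of the desired program
    code: str      # current code candidate
    error: str     # runtime error from the last execution, "" if clean
    attempts: int  # repair iterations so far

# ChromaDB as contextual memory of past fixes (seeded so queries return data).
memory = chromadb.Client().create_collection("bug_fixes")
memory.add(ids=["seed"], documents=["NameError: fixed by defining the symbol"])

def call_glm4_flash(prompt: str) -> str:
    # Hypothetical stand-in for the GLM4 Flash API; returns trivial
    # code so the sketch runs end to end.
    return "print('hello')"

def generate(state: FixState) -> FixState:
    # Step 1: Code Generation: task description to executable code.
    return {**state, "code": call_glm4_flash(state["task"])}

def execute(state: FixState) -> FixState:
    # Step 2: Code Execution: run the candidate and capture errors.
    try:
        exec(state["code"], {})
        return {**state, "error": ""}
    except Exception as e:
        return {**state, "error": repr(e)}

def repair(state: FixState) -> FixState:
    # Step 3: Code Repair: retrieve similar past fixes and re-prompt.
    similar = memory.query(query_texts=[state["error"]], n_results=1)
    prompt = f"Fix:\n{state['code']}\nError: {state['error']}\nHint: {similar}"
    return {**state, "code": call_glm4_flash(prompt),
            "attempts": state["attempts"] + 1}

def should_continue(state: FixState) -> str:
    # Step 4: Code Update: iterate until clean or out of budget.
    return END if not state["error"] or state["attempts"] >= 3 else "repair"

graph = StateGraph(FixState)
graph.add_node("generate", generate)
graph.add_node("execute", execute)
graph.add_node("repair", repair)
graph.set_entry_point("generate")
graph.add_edge("generate", "execute")
graph.add_edge("repair", "execute")
graph.add_conditional_edges("execute", should_continue)
app = graph.compile()
app.invoke({"task": "print a greeting", "code": "", "error": "", "attempts": 0})
```

The unified state object described in the abstract corresponds to FixState here: every node reads and updates the same dictionary, which is what lets the graph track errors and attempts across iterations.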
Related papers
- Code-Craft: Hierarchical Graph-Based Code Summarization for Enhanced Context Retrieval [0.0]
We present Hierarchical Code Graph Summarization (HCGS), a novel approach that constructs a multi-layered representation of a codebase by generating structured summaries in a bottom-up fashion from a code graph.
HCGS consistently outperforms traditional code-only retrieval across all metrics.
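The listing does not give construction details; a toy reading of "bottom-up from a code graph", with summarize() as a stand-in for the LLM summarizer and an invented call graph, might look like this:

```python
from graphlib import TopologicalSorter

call_graph = {            # caller -> callees (toy example)
    "main": {"load", "process"},
    "process": {"validate"},
    "load": set(),
    "validate": set(),
}

def summarize(name: str, child_summaries: list[str]) -> str:
    # Stand-in for an LLM call that summarizes a function given its
    # children's summaries.
    return f"{name}(uses: {', '.join(child_summaries) or 'nothing'})"

summaries: dict[str, str] = {}
# TopologicalSorter yields callees before callers for this edge direction,
# so every function is summarized after all of its dependencies.
for fn in TopologicalSorter(call_graph).static_order():
    summaries[fn] = summarize(fn, [summaries[c] for c in call_graph[fn]])
print(summaries["main"])
```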
arXiv Detail & Related papers (2025-04-11T20:57:27Z)
- Investigating Execution-Aware Language Models for Code Optimization [7.62248558265865]
This study investigates how incorporating code execution information into language models affects their ability to optimize code.
Our results indicate that execution-aware models provide limited benefits compared to the standard CodeT5+ model in optimizing code.
arXiv Detail & Related papers (2025-03-11T09:46:07Z)
- ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models [49.04652315815501]
Tool learning has emerged as a crucial capability for large language models (LLMs) to solve complex real-world tasks through interaction with external tools. We propose ToolCoder, a novel framework that reformulates tool learning as a code generation task.
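A rough illustration of "tool learning as a code generation task": tools are exposed as ordinary functions, and the model's action is a runnable program instead of a structured tool-call object. The tool names and the emitted script below are invented for the example.

```python
def search_web(query: str) -> str:
    return f"results for {query!r}"     # stub tool

def send_email(to: str, body: str) -> str:
    return f"sent to {to}"              # stub tool

# What a code-as-action model emits (versus a JSON tool-call object):
model_output = """
results = search_web("LangGraph bug fixing")
status = send_email("dev@example.com", results)
"""
ns = {"search_web": search_web, "send_email": send_email}
exec(model_output, ns)
print(ns["status"])  # sent to dev@example.com
```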
arXiv Detail & Related papers (2025-02-17T03:42:28Z)
- RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance [0.6062751776009752]
Large Language Models (LLMs) have shown incredible potential in code generation tasks.
LLMs can generate code based on task descriptions, but accuracy remains limited.
We introduce a novel architecture of LLM-based agents for code generation and automatic debugging: the Refinement and Guidance Debugger (RGD).
RGD decomposes the code generation task into multiple steps, ensuring a clearer workflow and enabling iterative code refinement based on self-reflection and feedback.
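A single-function toy version of such a refine-on-feedback loop (RGD itself coordinates multiple specialized LLM agents; llm() here is one canned stand-in for all of them):

```python
def llm(prompt: str) -> str:
    return "def add(a, b):\n    return a + b"  # canned output for the sketch

def run_tests(code: str) -> str:
    # Execute the candidate and return a failure description, "" if it passes.
    ns: dict = {}
    exec(code, ns)
    return "" if ns["add"](2, 3) == 5 else "add(2, 3) != 5"

def rgd_style_fix(task: str, max_rounds: int = 3) -> str:
    code = llm(f"Write code for: {task}")                 # generation step
    for _ in range(max_rounds):
        failure = run_tests(code)
        if not failure:
            return code                                   # tests pass
        guidance = llm(f"Why did this fail? {failure}")   # self-reflection
        code = llm(f"Refine:\n{code}\nGuidance: {guidance}")  # refinement
    return code

print(rgd_style_fix("add two numbers"))
```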
arXiv Detail & Related papers (2024-10-02T05:07:02Z)
- CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases [13.733229886643041]
Large Language Models (LLMs) excel in stand-alone code tasks like HumanEval and MBPP, but struggle with handling entire code repositories.
Similarity-based retrieval often has low recall in complex tasks, while manual tools and APIs are typically task-specific and require expert knowledge.
We introduce CodexGraph, a system that integrates LLM agents with graph database interfaces extracted from code repositories.
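As a toy version of the idea, an in-memory graph can stand in for the graph database; the node kinds and edge labels below are hypothetical, not CodexGraph's actual schema, and a real agent would issue database queries rather than call Python helpers.

```python
import networkx as nx

# A miniature code graph: modules contain functions, functions call functions.
g = nx.DiGraph()
g.add_node("utils.py", kind="module")
g.add_node("parse_config", kind="function")
g.add_node("load_yaml", kind="function")
g.add_edge("utils.py", "parse_config", rel="CONTAINS")
g.add_edge("parse_config", "load_yaml", rel="CALLS")

def callers_of(fn: str) -> list[str]:
    # "Which functions call fn?": the kind of structural question an LLM
    # agent would translate into a graph-database query instead of grep.
    return [u for u, v, d in g.in_edges(fn, data=True) if d["rel"] == "CALLS"]

print(callers_of("load_yaml"))  # ['parse_config']
```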
arXiv Detail & Related papers (2024-08-07T17:13:59Z)
- CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation [58.84212778960507]
We propose CodeGRAG, a Graphical Retrieval Augmented Code Generation framework to enhance the performance of LLMs.
CodeGRAG builds a graphical view of code blocks from their control flow and data flow to bridge the gap between programming languages and natural language.
Various experiments and ablations on four datasets covering both C++ and Python validate the hard meta-graph prompt, the soft prompting technique, and the effectiveness of the objectives for the pretrained GNN expert.
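The construction of that graphical view is not detailed here; as a crude sketch, control-flow edges can be pulled out of a code block with Python's standard ast module (a stand-in for CodeGRAG's actual control- and data-flow analysis):

```python
import ast

src = """
if x > 0:
    y = 1
else:
    y = -1
print(y)
"""
edges = []
for node in ast.walk(ast.parse(src)):
    if isinstance(node, ast.If):
        # condition -> first statement of each branch
        edges.append((f"line {node.lineno}", f"line {node.body[0].lineno}"))
        if node.orelse:
            edges.append((f"line {node.lineno}", f"line {node.orelse[0].lineno}"))
print(edges)  # [('line 2', 'line 3'), ('line 2', 'line 5')]
```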
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
- ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code [76.84199699772903]
ML-Bench is a benchmark rooted in real-world programming applications that leverage existing code repositories to perform tasks.
To evaluate both Large Language Models (LLMs) and AI agents, two setups are employed: ML-LLM-Bench for assessing LLMs' text-to-code conversion within a predefined deployment environment, and ML-Agent-Bench for testing autonomous agents in an end-to-end task execution within a Linux sandbox environment.
arXiv Detail & Related papers (2023-11-16T12:03:21Z)
- InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback [50.725076393314964]
We introduce InterCode, a lightweight, flexible, and easy-to-use framework of interactive coding as a standard reinforcement learning environment.
Our framework is language- and platform-agnostic and uses self-contained Docker environments to provide safe and reproducible execution.
We demonstrate InterCode's viability as a testbed by evaluating multiple state-of-the-art LLMs configured with different prompting strategies.
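Framing interactive coding as a standard RL environment means exposing the familiar reset()/step() interface. A toy environment with that shape (an illustration only: InterCode runs actions inside self-contained Docker environments, while this sketch shells out directly):

```python
import subprocess

class ToyShellEnv:
    def __init__(self, goal_output: str):
        self.goal = goal_output

    def reset(self) -> str:
        return "shell ready"                      # initial observation

    def step(self, action: str):
        # Execute the agent's command and grade its output against the goal.
        out = subprocess.run(
            action, shell=True, capture_output=True, text=True
        ).stdout.strip()
        reward = 1.0 if out == self.goal else 0.0
        done = reward == 1.0
        return out, reward, done, {}              # obs, reward, done, info

env = ToyShellEnv(goal_output="hello")
obs, reward, done, _ = env.step("echo hello")
print(reward)  # 1.0
```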
arXiv Detail & Related papers (2023-06-26T17:59:50Z)
- ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework that leverages both lexical copying and reference to code with similar semantics obtained via retrieval.
We evaluate our approach on the code completion task in the Python and Java programming languages, achieving state-of-the-art performance on the CodeXGLUE benchmark.
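The retrieve-then-complete flow can be sketched as follows; token-overlap retrieval and the complete() stub are simplifications standing in for ReACC's trained retriever and code LM.

```python
codebase = [
    "def mean(xs):\n    return sum(xs) / len(xs)",
    "def median(xs):\n    s = sorted(xs)\n    return s[len(s) // 2]",
]

def retrieve(fragment: str) -> str:
    # Pick the snippet sharing the most tokens with the unfinished fragment.
    tokens = set(fragment.split())
    return max(codebase, key=lambda c: len(tokens & set(c.split())))

def complete(fragment: str, context: str) -> str:
    # A code LM would condition on the retrieved context; stubbed here.
    return f"# context:\n{context}\n{fragment}  # ...completed by a code LM"

fragment = "def mean(xs):"
print(complete(fragment, retrieve(fragment)))
```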
arXiv Detail & Related papers (2022-03-15T08:25:08Z)
- GraphCodeBERT: Pre-training Code Representations with Data Flow [97.00641522327699]
We present GraphCodeBERT, a pre-trained model for programming languages that considers the inherent structure of code.
We use data flow in the pre-training stage, which is a semantic-level structure of code that encodes the relation of "where-the-value-comes-from" between variables.
We evaluate our model on four tasks, including code search, clone detection, code translation, and code refinement.
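The "where-the-value-comes-from" relation is essentially a def-use edge. A minimal extraction of such edges for a toy snippet with Python's ast module (GraphCodeBERT computes data flow with a multi-language parser; this is only the single-language gist):

```python
import ast

src = "x = 1\ny = x + 2\nz = y * x\n"
last_def: dict[str, int] = {}      # variable -> line of its last assignment
edges: list[tuple[str, str]] = []  # value-comes-from edges

for stmt in ast.parse(src).body:
    if isinstance(stmt, ast.Assign):
        for node in ast.walk(stmt.value):        # variables read on the RHS
            if isinstance(node, ast.Name) and node.id in last_def:
                edges.append((f"{node.id}@{last_def[node.id]}",
                              f"line {stmt.lineno}"))
        target = stmt.targets[0]
        if isinstance(target, ast.Name):
            last_def[target.id] = stmt.lineno

print(edges)  # [('x@1', 'line 2'), ('y@2', 'line 3'), ('x@1', 'line 3')]
```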
arXiv Detail & Related papers (2020-09-17T15:25:56Z)