Related papers: Agent-Driven Automatic Software Improvement

URL: http://arxiv.org/abs/2406.16739v1
Date: Mon, 24 Jun 2024 15:45:22 GMT
Title: Agent-Driven Automatic Software Improvement
Authors: Fernando Vallecillos Ruiz
Abstract summary: This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs). The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation. We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, so that they become better aligned to the task of automated software improvement.
Score: 55.2480439325792
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With software maintenance accounting for 50% of the cost of developing software, enhancing code quality and reliability has become more critical than ever. In response to this challenge, this doctoral research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs) to perform software maintenance tasks. The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation. One distinct challenge is the last-mile problems, errors at the final stage of producing functionally and contextually relevant code. Furthermore, this project aims to surpass the inherent limitations of current LLMs in source code through a collaborative framework where agents can correct and learn from each other's errors. We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement. Our main goal is to achieve a leap forward in the field of automatic software improvement by developing new tools and frameworks that can enhance the efficiency and reliability of software development.

Related papers

Improving Retrospective Language Agents via Joint Policy Gradient Optimization [57.35348425288859]
RetroAct is a framework that jointly optimizes both task-planning and self-reflective evolution capabilities in language agents. We develop a two-stage joint optimization process that integrates imitation learning and reinforcement learning. We conduct extensive experiments across various testing environments, demonstrating that RetroAct substantially improves task performance and decision-making processes.
arXiv Detail & Related papers (2025-03-03T12:54:54Z)
CodeIF: Benchmarking the Instruction-Following Capabilities of Large Language Models for Code Generation [24.090719826360342]
We introduce CodeIF, the first benchmark designed to assess the abilities of Large Language Models (LLMs) to adhere to task-oriented instructions within code generation scenarios. We conduct extensive experiments with LLMs, analyzing their strengths and limitations in meeting the demands of these tasks.
arXiv Detail & Related papers (2025-02-26T14:19:49Z)
Enhancing Trust in Language Model-Based Code Optimization through RLHF: A Research Design [0.0]
This research aims to develop reliable, LM-powered methods for code optimization that effectively integrate human feedback. This work aligns with the broader objectives of advancing cooperative and human-centric aspects of software engineering.
arXiv Detail & Related papers (2025-02-10T18:48:45Z)
LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues [62.12404317786005]
EvoCoder is a continuous learning framework for issue code reproduction. Our results show a 20% improvement in issue reproduction rates over existing SOTA methods.
arXiv Detail & Related papers (2024-11-21T08:49:23Z)
Human-In-the-Loop Software Development Agents [12.830816751625829]
Large Language Models (LLMs) are introduced to automatically resolve software development tasks. We introduce a Human-in-the-loop LLM-based Agents framework (HULA) for software development. We design, implement, and deploy the HULA framework into Atlassian for internal uses.
arXiv Detail & Related papers (2024-11-19T23:22:33Z)
A Comprehensive Survey of AI-Driven Advancements and Techniques in Automated Program Repair and Code Generation [0.0]
27 recent papers have been reviewed and split into two groups. The first group consists of new methods for bug detection and repair, which include locating semantic errors. The second group dwells on code generation, providing an overview of both general-purpose LLMs fine-tuned for programming and task-specific models. It also presents methods to improve code generation, such as identifier-aware training, fine-tuning at the instruction level, and incorporating semantic code structures.
arXiv Detail & Related papers (2024-11-12T06:47:54Z)
Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement [62.94719119451089]
The Lingma SWE-GPT series learns from and simulates real-world code submission activities. Lingma SWE-GPT 72B resolves 30.20% of GitHub issues, marking a significant improvement in automatic issue resolution.
arXiv Detail & Related papers (2024-11-01T14:27:16Z)
AutoSafeCoder: A Multi-Agent Framework for Securing LLM Code Generation through Static Analysis and Fuzz Testing [6.334110674473677]
Existing approaches often rely on a single agent for code generation, which struggles to produce secure, vulnerability-free code. We propose AutoSafeCoder, a multi-agent framework that leverages LLM-driven agents for code generation, vulnerability analysis, and security enhancement through continuous collaboration. Our contribution focuses on ensuring the safety of multi-agent code generation by integrating dynamic and static testing in an iterative process during code generation.
arXiv Detail & Related papers (2024-09-16T21:15:56Z)
AIvril: AI-Driven RTL Generation With Verification In-The-Loop [0.7831852829409273]
Large Language Models (LLMs) are computational models capable of performing complex natural language processing tasks. This paper introduces AIvril, a framework designed to enhance the accuracy and reliability of RTL-aware LLMs.
arXiv Detail & Related papers (2024-09-03T15:07:11Z)
Agentless: Demystifying LLM-based Software Engineering Agents [12.19683999553113]
We build Agentless, an agentless approach to automatically solve software development problems. Compared to the verbose and complex setup of agent-based approaches, Agentless employs a simplistic three-phase process of localization, repair, and patch validation. Our results on the popular SWE-bench Lite benchmark show that, surprisingly, the simplistic Agentless is able to achieve both the highest performance and low cost.
arXiv Detail & Related papers (2024-07-01T17:24:45Z)
Automatic Programming: Large Language Models and Beyond [48.34544922560503]
We study concerns around code quality, security and related issues of programmer responsibility. We discuss how advances in software engineering can enable automatic programming. We conclude with a forward looking view, focusing on the programming environment of the near future.
arXiv Detail & Related papers (2024-05-03T16:19:24Z)
Experiential Co-Learning of Software-Developing Agents [83.34027623428096]
Large language models (LLMs) have brought significant changes to various domains, especially in software development. We introduce Experiential Co-Learning, a novel LLM-agent learning framework. Experiments demonstrate that the framework enables agents to tackle unseen software-developing tasks more effectively.
arXiv Detail & Related papers (2023-12-28T13:50:42Z)
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning. During inference, we introduce a new generation procedure with a critical sampling strategy. For the model backbones, we extended the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.