Self-collaboration Code Generation via ChatGPT
- URL: http://arxiv.org/abs/2304.07590v3
- Date: Sat, 11 May 2024 14:00:45 GMT
- Title: Self-collaboration Code Generation via ChatGPT
- Authors: Yihong Dong, Xue Jiang, Zhi Jin, Ge Li,
- Abstract summary: Large Language Models (LLMs) have demonstrated remarkable code-generation ability, but struggle with complex tasks.
We present a self-collaboration framework for code generation employing LLMs, exemplified by ChatGPT.
To effectively organize and manage this virtual team, we incorporate software-development methodology into the framework.
- Score: 35.88318116340547
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although Large Language Models (LLMs) have demonstrated remarkable code-generation ability, they still struggle with complex tasks. In real-world software development, humans usually tackle complex tasks through collaborative teamwork, a strategy that significantly controls development complexity and enhances software quality. Inspired by this, we present a self-collaboration framework for code generation employing LLMs, exemplified by ChatGPT. Specifically, through role instructions, 1) Multiple LLM agents act as distinct `experts', each responsible for a specific subtask within a complex task; 2) Specify the way to collaborate and interact, so that different roles form a virtual team to facilitate each other's work, ultimately the virtual team addresses code generation tasks collaboratively without the need for human intervention. To effectively organize and manage this virtual team, we incorporate software-development methodology into the framework. Thus, we assemble an elementary team consisting of three LLM roles (i.e., analyst, coder, and tester) responsible for software development's analysis, coding, and testing stages. We conduct comprehensive experiments on various code-generation benchmarks. Experimental results indicate that self-collaboration code generation relatively improves 29.9%-47.1% Pass@1 compared to the base LLM agent. Moreover, we showcase that self-collaboration could potentially enable LLMs to efficiently handle complex repository-level tasks that are not readily solved by the single LLM agent.
Related papers
- CSR-Bench: Benchmarking LLM Agents in Deployment of Computer Science Research Repositories [4.579838836114489]
Large Language Models (LLMs) have demonstrated significant advancements across various fields of computer science research.
We introduce CSR-Bench, a benchmark for Computer Science Research projects.
We also introduce a novel framework, CSR-Agents, that utilizes multiple LLM agents to automate the deployment of GitHub code repositories.
arXiv Detail & Related papers (2025-02-10T02:46:29Z) - When One LLM Drools, Multi-LLM Collaboration Rules [98.71562711695991]
We argue for multi-LLM collaboration to better represent the extensive diversity of data, skills, and people.
We organize existing multi-LLM collaboration methods into a hierarchy, based on the level of access and information exchange.
We envision multi-LLM collaboration as an essential path toward compositional intelligence and collaborative AI development.
arXiv Detail & Related papers (2025-02-06T21:13:44Z) - VisionCoder: Empowering Multi-Agent Auto-Programming for Image Processing with Hybrid LLMs [8.380216582290025]
This paper presents a multi-agent framework that collaboratively completes auto-programming tasks.
Each agent plays a distinct role in the software development cycle, collectively forming a virtual organisation.
By establishing a tree-structured thought distribution and development mechanism across project, module, and function levels, this framework offers a cost-effective and efficient solution.
arXiv Detail & Related papers (2024-10-25T01:52:15Z) - ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems [80.69865295743149]
This work attempts to study using LLM-based agents to design collaborative AI systems autonomously.
Based on ComfyBench, we develop ComfyAgent, a framework that empowers agents to autonomously design collaborative AI systems by generating.
While ComfyAgent achieves a comparable resolve rate to o1-preview and significantly surpasses other agents on ComfyBench, ComfyAgent has resolved only 15% of creative tasks.
arXiv Detail & Related papers (2024-09-02T17:44:10Z) - Multi-Agent Software Development through Cross-Team Collaboration [30.88149502999973]
We introduce Cross-Team Collaboration (CTC), a scalable multi-team framework for software development.
CTC enables orchestrated teams to jointly propose various decisions and communicate with their insights.
Results show a notable increase in quality compared to state-of-the-art baselines.
arXiv Detail & Related papers (2024-06-13T10:18:36Z) - Automatic Robotic Development through Collaborative Framework by Large
Language Models [13.957351735394683]
We propose an innovative automated collaboration framework inspired by real-world robot developers.
This framework employs multiple LLMs in distinct roles analysts, programmers, and testers.
Analysts delve deep into user requirements, enabling programmers to produce precise code, while testers fine-tune the parameters.
arXiv Detail & Related papers (2024-02-06T04:40:27Z) - Experiential Co-Learning of Software-Developing Agents [83.34027623428096]
Large language models (LLMs) have brought significant changes to various domains, especially in software development.
We introduce Experiential Co-Learning, a novel LLM-agent learning framework.
Experiments demonstrate that the framework enables agents to tackle unseen software-developing tasks more effectively.
arXiv Detail & Related papers (2023-12-28T13:50:42Z) - TaskBench: Benchmarking Large Language Models for Task Automation [82.2932794189585]
We introduce TaskBench, a framework to evaluate the capability of large language models (LLMs) in task automation.
Specifically, task decomposition, tool selection, and parameter prediction are assessed.
Our approach combines automated construction with rigorous human verification, ensuring high consistency with human evaluation.
arXiv Detail & Related papers (2023-11-30T18:02:44Z) - Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration [83.4031923134958]
Corex is a suite of novel general-purpose strategies that transform Large Language Models into autonomous agents.
Inspired by human behaviors, Corex is constituted by diverse collaboration paradigms including Debate, Review, and Retrieve modes.
We demonstrate that orchestrating multiple LLMs to work in concert yields substantially better performance compared to existing methods.
arXiv Detail & Related papers (2023-09-30T07:11:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.