Improving Performance of Commercially Available AI Products in a Multi-Agent Configuration
- URL: http://arxiv.org/abs/2410.22129v1
- Date: Tue, 29 Oct 2024 15:28:19 GMT
- Title: Improving Performance of Commercially Available AI Products in a Multi-Agent Configuration
- Authors: Cory Hymel, Sida Peng, Kevin Xu, Charath Ranganathan,
- Abstract summary: Crowdbotics PRD AI is a tool for generating software requirements using AI.
GitHub Copilot is an AI pair-programming tool.
By sharing business requirements from PRD AI, we improve the code suggestion capabilities of GitHub Copilot by 13.8% and developer task success rate by 24.5%.
- Score: 11.626057561212694
- License:
- Abstract: In recent years, with the rapid advancement of large language models (LLMs), multi-agent systems have become increasingly more capable of practical application. At the same time, the software development industry has had a number of new AI-powered tools developed that improve the software development lifecycle (SDLC). Academically, much attention has been paid to the role of multi-agent systems to the SDLC. And, while single-agent systems have frequently been examined in real-world applications, we have seen comparatively few real-world examples of publicly available commercial tools working together in a multi-agent system with measurable improvements. In this experiment we test context sharing between Crowdbotics PRD AI, a tool for generating software requirements using AI, and GitHub Copilot, an AI pair-programming tool. By sharing business requirements from PRD AI, we improve the code suggestion capabilities of GitHub Copilot by 13.8% and developer task success rate by 24.5% -- demonstrating a real-world example of commercially-available AI systems working together with improved outcomes.
Related papers
- Contextual Augmented Multi-Model Programming (CAMP): A Hybrid Local-Cloud Copilot Framework [8.28588489551341]
This paper presents CAMP, a multi-model AI-assisted programming framework that consists of a local model that employs Retrieval-Augmented Generation (RAG)
RAG retrieves contextual information from the cloud model to facilitate context-aware prompt construction.
The methodology is actualized in Copilot for Xcode, an AI-assisted programming tool crafted for the Apple software ecosystem.
arXiv Detail & Related papers (2024-10-20T04:51:24Z) - OpenHands: An Open Platform for AI Software Developers as Generalist Agents [109.8507367518992]
We introduce OpenHands, a platform for the development of AI agents that interact with the world in similar ways to a human developer.
We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, and incorporation of evaluation benchmarks.
arXiv Detail & Related papers (2024-07-23T17:50:43Z) - SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering [79.07755560048388]
SWE-agent is a system that facilitates LM agents to autonomously use computers to solve software engineering tasks.
SWE-agent's custom agent-computer interface (ACI) significantly enhances an agent's ability to create and edit code files, navigate entire repositories, and execute tests and other programs.
We evaluate SWE-agent on SWE-bench and HumanEvalFix, achieving state-of-the-art performance on both with a pass@1 rate of 12.5% and 87.7%, respectively.
arXiv Detail & Related papers (2024-05-06T17:41:33Z) - AutoCodeRover: Autonomous Program Improvement [8.66280420062806]
We propose an automated approach for solving GitHub issues to autonomously achieve program improvement.
In our approach called AutoCodeRover, LLMs are combined with sophisticated code search capabilities, ultimately leading to a program modification or patch.
Experiments on SWE-bench-lite (300 real-life GitHub issues) show increased efficacy in solving GitHub issues (19% on SWE-bench-lite), which is higher than the efficacy of the recently reported SWE-agent.
arXiv Detail & Related papers (2024-04-08T11:55:09Z) - OS-Copilot: Towards Generalist Computer Agents with Self-Improvement [48.29860831901484]
We introduce OS-Copilot, a framework to build generalist agents capable of interfacing with comprehensive elements in an operating system (OS)
We use OS-Copilot to create FRIDAY, a self-improving embodied agent for automating general computer tasks.
On GAIA, a general AI assistants benchmark, FRIDAY outperforms previous methods by 35%, showcasing strong generalization to unseen applications via accumulated skills from previous tasks.
arXiv Detail & Related papers (2024-02-12T07:29:22Z) - An Interactive Agent Foundation Model [49.77861810045509]
We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents.
Our training paradigm unifies diverse pre-training strategies, including visual masked auto-encoders, language modeling, and next-action prediction.
We demonstrate the performance of our framework across three separate domains -- Robotics, Gaming AI, and Healthcare.
arXiv Detail & Related papers (2024-02-08T18:58:02Z) - Experiential Co-Learning of Software-Developing Agents [83.34027623428096]
Large language models (LLMs) have brought significant changes to various domains, especially in software development.
We introduce Experiential Co-Learning, a novel LLM-agent learning framework.
Experiments demonstrate that the framework enables agents to tackle unseen software-developing tasks more effectively.
arXiv Detail & Related papers (2023-12-28T13:50:42Z) - The GitHub Development Workflow Automation Ecosystems [47.818229204130596]
Large-scale software development has become a highly collaborative endeavour.
This chapter explores the ecosystems of development bots and GitHub Actions.
It provides an extensive survey of the state-of-the-art in this domain.
arXiv Detail & Related papers (2023-05-08T15:24:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.