Experimenting with Multi-Agent Software Development: Towards a Unified Platform
- URL: http://arxiv.org/abs/2406.05381v1
- Date: Sat, 8 Jun 2024 07:27:01 GMT
- Title: Experimenting with Multi-Agent Software Development: Towards a Unified Platform
- Authors: Malik Abdul Sami, Muhammad Waseem, Zeeshan Rasheed, Mika Saari, Kari Systä, Pekka Abrahamsson
- Abstract summary: Large language models are redefining software engineering by implementing AI-powered techniques throughout the whole software development process.
The objective of this study is to develop a unified platform that uses multiple artificial intelligence agents to automate the transformation of user requirements into well-organized deliverables.
The platform will organize tasks, perform security and compliance checks, and suggest design patterns and improvements for non-functional requirements.
- Score: 3.3485481369444674
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models are redefining software engineering by implementing AI-powered techniques throughout the whole software development process, including requirement gathering, software architecture, code generation, testing, and deployment. However, it is still difficult to develop a cohesive platform that consistently produces the best outcomes across all stages. The objective of this study is to develop a unified platform that uses multiple artificial intelligence agents to automate the transformation of user requirements into well-organized deliverables. These deliverables include user stories, prioritization, and UML sequence diagrams, along with a modular approach to APIs, unit tests, and end-to-end tests. Additionally, the platform will organize tasks, perform security and compliance checks, and suggest design patterns and improvements for non-functional requirements. We allow users to control and manage each phase according to their preferences. In addition, the platform provides security and compliance checks following European standards and proposes design optimizations. We use multiple models, such as GPT-3.5, GPT-4, and Llama3, to enable the generation of modular code according to user choice. The research also highlights limitations and directions for future work to improve the software development life cycle overall. The source code for our unified platform is hosted on GitHub, enabling additional experimentation and supporting both research and practical uses.
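Below is a minimal, illustrative sketch of the phase-by-phase agent pipeline the abstract describes, with a user-selectable model backend per phase. All names (Agent, run_pipeline, the phase prompts) are hypothetical, and the model call is stubbed so the example runs offline; it sketches the described workflow, not the platform's actual code.

```python
# Hypothetical sketch of a multi-agent requirements-to-deliverables pipeline.
from dataclasses import dataclass
from typing import Callable

ModelFn = Callable[[str], str]  # a prompt-in, completion-out backend

@dataclass
class Agent:
    name: str          # phase name, e.g. "user_stories"
    model: ModelFn     # user-selected backend (GPT-3.5, GPT-4, Llama3, ...)
    instructions: str  # phase-specific instruction prompt

    def run(self, artifact: str) -> str:
        return self.model(f"{self.instructions}\n\nInput:\n{artifact}")

def stub_model(prompt: str) -> str:
    # Stand-in for a real LLM API call so the sketch runs offline.
    return f"[model output for: {prompt[:40]!r}...]"

# One agent per phase; the platform lets users control each stage and
# choose the backend, so any ModelFn could replace stub_model here.
pipeline = [
    Agent("user_stories", stub_model, "Turn the requirements into prioritized user stories."),
    Agent("sequence_diagrams", stub_model, "Derive UML sequence diagrams from the stories."),
    Agent("api_code", stub_model, "Generate modular API code for the design."),
    Agent("tests", stub_model, "Write unit and end-to-end tests for the code."),
    Agent("compliance", stub_model, "Check security and compliance against European standards."),
]

def run_pipeline(requirements: str) -> str:
    artifact = requirements
    for agent in pipeline:
        artifact = agent.run(artifact)  # each phase consumes the previous deliverable
    return artifact

print(run_pipeline("Users can reset their password via email."))
```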
Related papers
- Verbal Process Supervision Elicits Better Coding Agents [0.9558392439655016]
This work introduces CURA, a code understanding and reasoning agent system enhanced with verbal process supervision (VPS); a minimal sketch of such a supervision loop follows this entry.
CURA achieves a 3.65% improvement over baseline models on challenging benchmarks like BigCodeBench.
arXiv Detail & Related papers (2025-03-24T09:48:59Z)
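As a rough illustration of what a verbal-process-supervision loop can look like, here is a hedged sketch: a generator proposes an intermediate step and a critic returns natural-language feedback that conditions the next attempt. Function names and prompts are hypothetical, the models are stubbed, and this is not CURA's actual implementation.

```python
# Hypothetical verbal-process-supervision loop (not CURA's actual code).
from typing import Callable

ModelFn = Callable[[str], str]

def generate(model: ModelFn, task: str, feedback: str) -> str:
    return model(f"Task: {task}\nPrior feedback: {feedback}\nWrite the next solution step.")

def criticize(model: ModelFn, task: str, step: str) -> str:
    # The supervisor returns *verbal* feedback on the intermediate step,
    # rather than a scalar reward.
    return model(f"Task: {task}\nStep: {step}\nCritique this step in one sentence.")

def solve(generator: ModelFn, critic: ModelFn, task: str, rounds: int = 3) -> str:
    step, feedback = "", ""
    for _ in range(rounds):
        step = generate(generator, task, feedback)
        feedback = criticize(critic, task, step)  # feedback steers the next round
    return step

stub: ModelFn = lambda prompt: f"[llm: {prompt.splitlines()[0]}]"  # offline stand-in
print(solve(stub, stub, "Fix the off-by-one bug in range(len(xs) - 1)"))
```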
- SEAlign: Alignment Training for Software Engineering Agent [38.05820118124528]
We propose SEAlign to bridge the gap between code generation models and real-world software development tasks.
We evaluate SEAlign on three standard agentic benchmarks for real-world software engineering, including HumanEvalFix, SWE-Bench-Lite, and SWE-Bench-Verified.
We develop an agent-based software development platform using SEAlign, which successfully automates the creation of several small applications.
arXiv Detail & Related papers (2025-03-24T08:59:21Z)
- Chattronics: using GPTs to assist in the design of data acquisition systems [0.0]
This article presents a novel approach that uses Generative Pre-Trained Transformers (GPTs) to assist in the design phase of data acquisition systems.
The solution is packaged in the form of an application that retains the conversational aspects of LLMs.
After 160 test iterations, the study concludes that there is potential for these models to serve adequately as synthesis/assistant tools for data acquisition systems.
arXiv Detail & Related papers (2024-09-23T16:36:16Z)
- Think-on-Process: Dynamic Process Generation for Collaborative Development of Multi-Agent System [13.65717444483291]
ToP (Think-on-Process) is a dynamic process generation framework for software development.
Our framework significantly enhances the dynamic process generation capability of GPT-3.5 and GPT-4.
arXiv Detail & Related papers (2024-09-10T15:02:34Z)
- SOEN-101: Code Generation by Emulating Software Process Models Using Large Language Model Agents [50.82665351100067]
FlowGen is a code generation framework that emulates software process models based on multiple Large Language Model (LLM) agents.
We evaluate FlowGenScrum on four benchmarks: HumanEval, HumanEval-ET, MBPP, and MBPP-ET.
arXiv Detail & Related papers (2024-03-23T14:04:48Z)
- DevBench: A Comprehensive Benchmark for Software Development [72.24266814625685]
DevBench is a benchmark that evaluates large language models (LLMs) across various stages of the software development lifecycle.
Empirical studies show that current LLMs, including GPT-4-Turbo, fail to solve the challenges presented within DevBench.
Our findings offer actionable insights for the future development of LLMs toward real-world programming applications.
arXiv Detail & Related papers (2024-03-13T15:13:44Z)
- MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration [102.41118020705876]
Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing.
As their applications extend into multi-agent environments, a need has arisen for a comprehensive evaluation framework.
This work introduces a novel benchmarking framework specifically tailored to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z)
- CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs).
It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.
Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning; a minimal sketch of the retrieval step follows this entry.
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
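The following is a generic sketch of the create-and-retrieve pattern summarized above: tools live in a curated set with natural-language descriptions, and the most relevant ones are retrieved and prepended to the LLM prompt. The toy lexical scorer and all tool entries are illustrative stand-ins (a real system would likely use embedding similarity); this is not CRAFT's actual code.

```python
# Hypothetical retrieve-tools-then-prompt sketch (not CRAFT's implementation).
from dataclasses import dataclass

@dataclass
class Tool:
    name: str
    description: str
    code: str  # the tool's source, to be inlined into the LLM prompt

TOOLSET = [
    Tool("csv_loader", "load rows from a csv file", "def csv_loader(path): ..."),
    Tool("unit_converter", "convert between metric units", "def convert(x, u): ..."),
    Tool("date_diff", "days between two dates", "def date_diff(a, b): ..."),
]

def score(task: str, tool: Tool) -> int:
    # Toy lexical overlap; embedding similarity would replace this in practice.
    return len(set(task.lower().split()) & set(tool.description.lower().split()))

def retrieve(task: str, k: int = 2) -> list[Tool]:
    return sorted(TOOLSET, key=lambda t: score(task, t), reverse=True)[:k]

task = "how many days between two dates in a csv file"
tools = retrieve(task)
prompt = "\n\n".join(t.code for t in tools) + f"\n\n# Task: {task}"
print([t.name for t in tools])  # retrieved tool code is prepended to the prompt
```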
- TestLab: An Intelligent Automated Software Testing Framework [0.0]
TestLab is an automated software testing framework that attempts to gather a set of testing methods and automate them using Artificial Intelligence.
TestLab's first two modules aim to identify vulnerabilities from different perspectives, while the third enhances traditional automated software testing by automatically generating test cases.
arXiv Detail & Related papers (2023-06-06T11:45:22Z)
- CodeTF: One-stop Transformer Library for State-of-the-art Code LLM [72.1638273937025]
We present CodeTF, an open-source Transformer-based library for state-of-the-art Code LLMs and code intelligence.
Our library supports a collection of pretrained Code LLM models and popular code benchmarks.
We hope CodeTF is able to bridge the gap between machine learning/generative AI and software engineering.
arXiv Detail & Related papers (2023-05-31T05:24:48Z)
- AwesomeMeta+: A Mixed-Prototyping Meta-Learning System Supporting AI Application Design Anywhere [26.774915735789595]
AwesomeMeta+ is a prototyping and learning system designed to standardize the key components of meta-learning.
It is developed to support the full lifecycle of meta-learning system engineering, from design to deployment.
arXiv Detail & Related papers (2023-04-24T03:09:25Z)
- ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data Format [88.33443450434521]
Task-oriented dialogue (TOD) systems function as digital assistants, guiding users through various tasks such as booking flights or finding restaurants.
Existing toolkits for building TOD systems often fall short in delivering comprehensive arrays of data, models, and experimental environments.
We introduce ConvLab-3: a multifaceted dialogue system toolkit crafted to bridge this gap.
arXiv Detail & Related papers (2022-11-30T16:37:42Z)
- Enabling Automated Machine Learning for Model-Driven AI Engineering [60.09869520679979]
We propose a novel approach to enable Model-Driven Software Engineering and Model-Driven AI Engineering.
In particular, we support Automated ML, thus assisting software engineers without deep AI knowledge in developing AI-intensive systems.
arXiv Detail & Related papers (2022-03-06T10:12:56Z)
- Learning Multi-Objective Curricula for Deep Reinforcement Learning [55.27879754113767]
Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).
In this paper, we propose a unified automatic curriculum learning framework to create multi-objective but coherent curricula.
In addition to existing hand-designed curricula paradigms, we further design a flexible memory mechanism to learn an abstract curriculum.
arXiv Detail & Related papers (2021-10-06T19:30:25Z)
- PHOTONAI -- A Python API for Rapid Machine Learning Model Development [2.414341608751139]
PHOTONAI is a high-level Python API designed to simplify and accelerate machine learning model development.
It functions as a unifying framework allowing the user to easily access and combine algorithms from different toolboxes into custom algorithm sequences.
arXiv Detail & Related papers (2020-02-13T10:33:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.