GHIssuemarket: A Sandbox Environment for SWE-Agents Economic Experimentation
- URL: http://arxiv.org/abs/2412.11722v2
- Date: Tue, 17 Dec 2024 17:06:43 GMT
- Title: GHIssuemarket: A Sandbox Environment for SWE-Agents Economic Experimentation
- Authors: Mohamed A. Fouad, Marcelo de Almeida Maia
- Abstract summary: We argue the importance of swe-agents' economic viability to their transcendence.
We introduce the GHIssuemarket sandbox, a virtual environment for swe-agents' economic experimentation.
We open-source our software artifacts, discuss our sandbox engineering decisions, and advocate towards swe-agents' economic exploration.
- Score: 0.4910937238451484
- License:
- Abstract: Software engineering agents (swe-agents), as key innovations in intelligent software engineering, are poised in the industry's end-of-programming debate to transcend from assistance to primary roles. We argue the importance of swe-agents' economic viability to their transcendence -- defined as their capacity to maintain efficient operations in constrained environments -- and propose its exploration via software engineering economics experimentation. We introduce the GHIssuemarket sandbox, a controlled virtual environment for swe-agents' economic experimentation, simulating the environment of an envisioned peer-to-peer multiagent system for GitHub issues outsourcing auctions. In this controlled setting, autonomous swe-agents auction and bid on GitHub issues, leveraging real-time communication, a built-in retrieval-augmented generation (RAG) interface for effective decision-making, and instant cryptocurrency micropayments. We open-source our software artifacts, discuss our sandbox engineering decisions, and advocate towards swe-agents' economic exploration -- an emerging field we intend to pursue under the term intelligent software engineering economics (ISEE).
Related papers
- TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks [52.46737975742287]
We build a self-contained environment with data that mimics a small software company environment.
We find that with the most competitive agent, 24% of the tasks can be completed autonomously.
This paints a nuanced picture on task automation with LM agents.
arXiv Detail & Related papers (2024-12-18T18:55:40Z) - Transforming the Hybrid Cloud for Emerging AI Workloads [81.15269563290326]
This white paper envisions transforming hybrid cloud systems to meet the growing complexity of AI workloads.
The proposed framework addresses critical challenges in energy efficiency, performance, and cost-effectiveness.
This joint initiative aims to establish hybrid clouds as secure, efficient, and sustainable platforms.
arXiv Detail & Related papers (2024-11-20T11:57:43Z) - Quantum Computing and Neuromorphic Computing for Safe, Reliable, and explainable Multi-Agent Reinforcement Learning: Optimal Control in Autonomous Robotics [0.0]
This paper investigates the utilization of Quantum Computing and Neuromorphic Computing for Safe, Reliable, and Explainable Multi-Agent Reinforcement Learning (MARL).
The objective was to address the challenges of optimizing the behavior of autonomous agents while ensuring safety, reliability, and explainability.
arXiv Detail & Related papers (2024-07-29T15:43:30Z) - Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence [79.5316642687565]
Existing multi-agent frameworks often struggle with integrating diverse capable third-party agents.
We propose the Internet of Agents (IoA), a novel framework that addresses these limitations.
IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control.
arXiv Detail & Related papers (2024-07-09T17:33:24Z) - Agentless: Demystifying LLM-based Software Engineering Agents [12.19683999553113]
We build Agentless -- an agentless approach to automatically solve software development problems.
Compared to the verbose and complex setup of agent-based approaches, Agentless employs a simplistic three-phase process of localization, repair, and patch validation.
Our results on the popular SWE-bench Lite benchmark show that surprisingly the simplistic Agentless is able to achieve both the highest performance and low cost.
arXiv Detail & Related papers (2024-07-01T17:24:45Z) - SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering [79.07755560048388]
SWE-agent is a system that facilitates LM agents to autonomously use computers to solve software engineering tasks.
SWE-agent's custom agent-computer interface (ACI) significantly enhances an agent's ability to create and edit code files, navigate entire repositories, and execute tests and other programs.
We evaluate SWE-agent on SWE-bench and HumanEvalFix, achieving state-of-the-art performance on both with a pass@1 rate of 12.5% and 87.7%, respectively.
arXiv Detail & Related papers (2024-05-06T17:41:33Z) - WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? [83.19032025950986]
We study the use of large language model-based agents for interacting with software via web browsers.
WorkArena is a benchmark of 33 tasks based on the widely-used ServiceNow platform.
BrowserGym is an environment for the design and evaluation of such agents.
arXiv Detail & Related papers (2024-03-12T14:58:45Z) - QuantAgent: Seeking Holy Grail in Trading by Self-Improving Large Language Model [14.800710112671226]
This paper introduces a principled framework to address the core challenge of efficiently building and integrating a domain-specific knowledge base.
In the inner loop, the agent refines its responses by drawing from its knowledge base, while in the outer loop, these responses are tested in real-world scenarios.
We instantiate this framework through an autonomous agent for mining trading signals named QuantAgent.
arXiv Detail & Related papers (2024-02-06T06:47:14Z) - Embedded Software Development with Digital Twins: Specific Requirements for Small and Medium-Sized Enterprises [55.57032418885258]
Digital twins have the potential for cost-effective software development and maintenance strategies.
We interviewed SMEs about their current development processes.
First results show that real-time requirements prevent, to date, a Software-in-the-Loop development approach.
arXiv Detail & Related papers (2023-09-17T08:56:36Z) - Symbiotic System Design for Safe and Resilient Autonomous Robotics in Offshore Wind Farms [3.5409202655473724]
Barriers to Beyond Visual Line of Sight (BVLOS) robotics include operational safety compliance and resilience.
We propose a symbiotic system, reflecting the lifecycle learning and co-evolution with knowledge sharing for mutual gain of robotic platforms and remote human operators.
Our methodology enables the run-time verification of safety, reliability and resilience during autonomous missions.
arXiv Detail & Related papers (2021-01-23T11:58:16Z) - mt5b3: A Framework for Building Autonomous Traders [0.0]
Many AI techniques have been tested in the finance field, including recent approaches like convolutional neural networks and deep reinforcement learning.
We present some fundamental aspects of modelling autonomous traders and the complex environment that is the financial world.
We believe that mt5b3 may also contribute to the development of new autonomous traders.
arXiv Detail & Related papers (2021-01-20T15:01:02Z)