EnvX: Agentize Everything with Agentic AI
- URL: http://arxiv.org/abs/2509.08088v1
- Date: Tue, 09 Sep 2025 18:51:36 GMT
- Title: EnvX: Agentize Everything with Agentic AI
- Authors: Linyao Chen, Zimian Peng, Yingxuan Yang, Yikun Wang, Wenzheng Tom Tang, Hiroki H. Kobayashi, Weinan Zhang
- Abstract summary: We present EnvX, a framework that leverages Agentic AI to agentize GitHub repositories. EnvX reimagines repositories as active agents through a three-phase process. We evaluate EnvX on the GitTaskBench benchmark, using 18 repositories across domains such as image processing, speech recognition, document analysis, and video manipulation.
- Score: 18.805404564291965
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The widespread availability of open-source repositories has led to a vast collection of reusable software components, yet their utilization remains manual, error-prone, and disconnected. Developers must navigate documentation, understand APIs, and write integration code, creating significant barriers to efficient software reuse. To address this, we present EnvX, a framework that leverages Agentic AI to agentize GitHub repositories, transforming them into intelligent, autonomous agents capable of natural language interaction and inter-agent collaboration. Unlike existing approaches that treat repositories as static code resources, EnvX reimagines them as active agents through a three-phase process: (1) TODO-guided environment initialization, which sets up the necessary dependencies, data, and validation datasets; (2) human-aligned agentic automation, allowing repository-specific agents to autonomously perform real-world tasks; and (3) Agent-to-Agent (A2A) protocol, enabling multiple agents to collaborate. By combining large language model capabilities with structured tool integration, EnvX automates not just code generation, but the entire process of understanding, initializing, and operationalizing repository functionality. We evaluate EnvX on the GitTaskBench benchmark, using 18 repositories across domains such as image processing, speech recognition, document analysis, and video manipulation. Our results show that EnvX achieves a 74.07% execution completion rate and 51.85% task pass rate, outperforming existing frameworks. Case studies further demonstrate EnvX's ability to enable multi-repository collaboration via the A2A protocol. This work marks a shift from treating repositories as passive code resources to intelligent, interactive agents, fostering greater accessibility and collaboration within the open-source ecosystem.
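The three phases named in the abstract can be pictured with a small sketch. This is not EnvX's implementation: the class names `RepoAgent` and `AgentRegistry`, the TODO items, and the two toy skills are all hypothetical, and the registry is only an in-process stand-in for A2A-style message routing.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RepoAgent:
    """Hypothetical wrapper that exposes one repository as an agent."""
    name: str
    skills: Dict[str, Callable[[str], str]]
    todo: List[str] = field(default_factory=list)
    ready: bool = False

    def initialize(self) -> List[str]:
        # Phase 1 (TODO-guided initialization): walk a checklist in order.
        # A real system would install dependencies and fetch data here;
        # this sketch only records each item as completed.
        completed = [item for item in self.todo]
        self.ready = True
        return completed

    def run(self, skill: str, payload: str) -> str:
        # Phase 2 (agentic automation): execute a repository capability.
        if not self.ready:
            raise RuntimeError(f"{self.name} has not been initialized")
        return self.skills[skill](payload)


class AgentRegistry:
    """Hypothetical in-process stand-in for A2A-style routing."""

    def __init__(self) -> None:
        self._agents: Dict[str, RepoAgent] = {}

    def register(self, agent: RepoAgent) -> None:
        self._agents[agent.name] = agent

    def send(self, to: str, skill: str, payload: str) -> str:
        # Phase 3 (agent-to-agent collaboration): forward a request to
        # the named agent and return its reply.
        return self._agents[to].run(skill, payload)


def build_registry() -> AgentRegistry:
    # Two toy agents; the skills are placeholders, not real repositories.
    ocr = RepoAgent(
        name="ocr",
        skills={"extract": lambda text: text.upper()},
        todo=["install dependencies", "download data", "prepare validation set"],
    )
    summarizer = RepoAgent(
        name="summarizer",
        skills={"summarize": lambda text: " ".join(text.split()[:3])},
        todo=["install dependencies"],
    )
    registry = AgentRegistry()
    for agent in (ocr, summarizer):
        agent.initialize()
        registry.register(agent)
    return registry


if __name__ == "__main__":
    registry = build_registry()
    extracted = registry.send("ocr", "extract", "hello from a scanned page")
    print(registry.send("summarizer", "summarize", extracted))
```

In this sketch the output of one agent is passed as the input of the next, which is the simplest form of the multi-repository collaboration the abstract describes. An agent that has not completed its checklist refuses to run, mirroring the idea that initialization precedes automation.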
Related papers
- RepoLaunch: Automating Build&Test Pipeline of Code Repositories on ANY Language and ANY Platform [49.43594274832262]
We introduce RepoLaunch, the first agent capable of automatically resolving dependencies, compiling source code, and extracting test results for repositories across arbitrary programming languages and operating systems. RepoLaunch automates the remaining steps, enabling scalable benchmarking and training of coding agents and LLMs.
arXiv Detail & Related papers (2026-03-05T10:15:13Z)
- ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development [72.4729759618632]
We introduce ABC-Bench, a benchmark to evaluate agentic backend coding within a realistic, executable workflow. We curated 224 practical tasks spanning 8 languages and 19 frameworks from open-source repositories. Our evaluation reveals that even state-of-the-art models struggle to deliver reliable performance on these holistic tasks.
arXiv Detail & Related papers (2026-01-16T08:23:52Z)
- UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist [107.04196084992907]
We introduce UniVA, an omni-capable multi-agent framework for next-generation video generalists. UniVA employs a Plan-and-Act dual-agent architecture that drives a highly automated and proactive workflow. We also introduce UniVA-Bench, a benchmark suite of multi-step video tasks spanning understanding, editing, segmentation, and generation.
arXiv Detail & Related papers (2025-11-11T17:58:13Z)
- The OpenHands Software Agent SDK: A Composable and Extensible Foundation for Production Agents [46.254487394746725]
We present the OpenHands Software Agent SDK, a toolkit for implementing software development agents. To achieve flexibility, we design a simple interface for implementing agents that requires only a few lines of code in the default case. For security and reliability, it delivers seamless local-to-remote execution portability and integrated REST/WebSocket services.
arXiv Detail & Related papers (2025-11-05T18:16:44Z)
- RepoMaster: Autonomous Exploration and Understanding of GitHub Repositories for Complex Task Solving [32.044286648565524]
RepoMaster is an autonomous agent framework designed to explore and reuse GitHub repositories for solving complex tasks. RepoMaster constructs function-call graphs, module-dependency graphs, and hierarchical code trees to identify essential components. On our newly released GitTaskBench, RepoMaster lifts the task-pass rate from 40.7% to 62.9% while reducing token usage by 95%.
arXiv Detail & Related papers (2025-05-27T08:35:05Z)
- Cerebrum (AIOS SDK): A Platform for Agent Development, Deployment, Distribution, and Discovery [33.89476893368382]
We present Cerebrum, an Agent SDK for AIOS that addresses the gap through three key components: (1) a comprehensive SDK featuring a modular four-layer architecture for agent development; (2) a community-driven Agent Hub for sharing and discovering agents; and (3) an interactive web interface for testing and evaluating agents. Cerebrum advances the field by providing a unified framework that standardizes agent development while maintaining flexibility for researchers and developers to innovate and distribute their agents.
arXiv Detail & Related papers (2025-03-14T14:29:17Z)
- RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph [63.87660059104077]
We present RepoGraph, a plug-in module that manages a repository-level structure for modern AI software engineering solutions. RepoGraph substantially boosts the performance of all systems, leading to a new state-of-the-art among open-source frameworks.
arXiv Detail & Related papers (2024-10-03T05:45:26Z)
- MOSS: Enabling Code-Driven Evolution and Context Management for AI Agents [7.4159044558995335]
We introduce MOSS (llM-oriented Operating System Simulation), a novel framework that integrates code generation with a dynamic context management system.
At its core, the framework employs an Inversion of Control container in conjunction with decorators to enforce the least knowledge principle.
We show how this framework can enhance the efficiency and capabilities of agent development and highlight its advantages in moving towards Turing-complete agents.
arXiv Detail & Related papers (2024-09-24T14:30:21Z)
- Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration [64.19431011897515]
This paper presents Alibaba LingmaAgent, a novel Automated Software Engineering method designed to comprehensively understand and utilize whole software repositories for issue resolution. Our approach introduces a top-down method to condense critical repository information into a knowledge graph, reducing complexity, and employs a Monte Carlo tree search based strategy. In production deployment and evaluation at Alibaba Cloud, LingmaAgent automatically resolved 16.9% of in-house issues faced by development engineers, and solved 43.3% of problems after manual intervention.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering [79.07755560048388]
SWE-agent is a system that facilitates LM agents to autonomously use computers to solve software engineering tasks.
SWE-agent's custom agent-computer interface (ACI) significantly enhances an agent's ability to create and edit code files, navigate entire repositories, and execute tests and other programs.
We evaluate SWE-agent on SWE-bench and HumanEvalFix, achieving state-of-the-art performance on both, with pass@1 rates of 12.5% and 87.7%, respectively.
arXiv Detail & Related papers (2024-05-06T17:41:33Z)
- AutoDev: Automated AI-Driven Development [9.586330606828643]
AutoDev is a fully automated AI-driven software development framework.
It enables users to define complex software engineering objectives, which are assigned to AutoDev's autonomous AI Agents.
AutoDev establishes a secure development environment by confining all operations within Docker containers.
arXiv Detail & Related papers (2024-03-13T07:12:03Z)
- RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation [79.83270415843857]
We introduce RepoAgent, a large language model powered open-source framework aimed at proactively generating, maintaining, and updating code documentation.
We have validated the effectiveness of our approach, showing that RepoAgent excels in generating high-quality repository-level documentation.
arXiv Detail & Related papers (2024-02-26T15:39:52Z)