Deploy-Master: Automating the Deployment of 50,000+ Agent-Ready Scientific Tools in One Day
- URL: http://arxiv.org/abs/2601.03513v1
- Date: Wed, 07 Jan 2026 02:00:13 GMT
- Title: Deploy-Master: Automating the Deployment of 50,000+ Agent-Ready Scientific Tools in One Day
- Authors: Yi Wang, Zhenting Huang, Zhaohan Ding, Ruoxue Liao, Yuan Huang, Xinzijian Liu, Jiajun Xie, Siheng Chen, Linfeng Zhang
- Abstract summary: Deploy-Master is a one-stop agentic workflow for large-scale tool discovery, build specification inference, execution-based validation, and publication. In a single day, we performed 52,550 build attempts and constructed reproducible environments for 50,112 scientific tools. We report a deployment trace at the scale of 50,000 tools, characterizing throughput, cost profiles, failure surfaces, and specification uncertainty that become visible only at scale.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open-source scientific software is abundant, yet most tools remain difficult to compile, configure, and reuse, sustaining a small-workshop mode of scientific computing. This deployment bottleneck limits reproducibility, large-scale evaluation, and the practical integration of scientific tools into modern AI-for-Science (AI4S) and agentic workflows. We present Deploy-Master, a one-stop agentic workflow for large-scale tool discovery, build specification inference, execution-based validation, and publication. Guided by a taxonomy spanning 90+ scientific and engineering domains, our discovery stage starts from a recall-oriented pool of over 500,000 public repositories and progressively filters it to 52,550 executable tool candidates under license- and quality-aware criteria. Deploy-Master transforms heterogeneous open-source repositories into runnable, containerized capabilities grounded in execution rather than documentation claims. In a single day, we performed 52,550 build attempts and constructed reproducible runtime environments for 50,112 scientific tools. Each successful tool is validated by a minimal executable command and registered in SciencePedia for search and reuse, enabling direct human use and optional agent-based invocation. Beyond delivering runnable tools, we report a deployment trace at the scale of 50,000 tools, characterizing throughput, cost profiles, failure surfaces, and specification uncertainty that become visible only at scale. These results explain why scientific software remains difficult to operationalize and motivate shared, observable execution substrates as a foundation for scalable AI4S and agentic science.
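The abstract's core loop — attempt a build, validate each tool by executing a minimal command, and register only tools that actually run — can be sketched as below. This is a minimal illustrative sketch, not the paper's implementation: the function names, candidate structure, and registration step are all assumptions, since Deploy-Master's internal API is not published here.

```python
import subprocess
import sys

# Hypothetical sketch of execution-based validation as described in the
# abstract: a tool counts as deployed only if its minimal executable
# command actually runs and exits cleanly, rather than trusting
# documentation claims. All names here are illustrative assumptions.

def run_smoke_test(command: list[str], timeout: int = 60) -> bool:
    """Run the tool's minimal command and report whether it succeeded."""
    try:
        result = subprocess.run(command, capture_output=True, timeout=timeout)
        return result.returncode == 0
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # A hung or missing executable is a failed validation, not a crash.
        return False

def validate_candidates(candidates: list[tuple[str, list[str]]]) -> list[str]:
    """Filter build candidates down to validated, runnable tools.

    Each candidate pairs a repository name with the minimal command that
    exercises its built environment (in the paper's setting, this would
    run inside the tool's container).
    """
    validated = []
    for repo, smoke_command in candidates:
        if run_smoke_test(smoke_command):
            validated.append(repo)  # would then be registered for search and reuse
    return validated

# Example with one trivially passing and one trivially failing "tool".
tools = [
    ("tool-ok", [sys.executable, "-c", "print('ready')"]),
    ("tool-broken", [sys.executable, "-c", "raise SystemExit(1)"]),
]
```

At 52,550 build attempts in a day, a loop like this would in practice be parallelized and wrapped with per-tool logging to produce the throughput and failure-surface trace the paper reports.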
Related papers
- SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents [100.12367115920121]
We introduce SciGymAgent, a scalable interactive environment featuring 1,780 domain-specific tools across four natural science disciplines. We also present SciAgentBench, a tiered evaluation suite designed to stress-test agentic capabilities.
arXiv Detail & Related papers (2026-02-13T14:58:18Z) - Bohrium + SciMaster: Building the Infrastructure and Ecosystem for Agentic Science at Scale [82.20980951765891]
We argue that scaling agentic science requires an infrastructure-and-ecosystem approach, instantiated as Bohrium + SciMaster. Bohrium acts as a managed, traceable hub for AI4S assets that turns diverse scientific data, software, compute, and laboratory systems into agent-ready capabilities. SciMaster orchestrates these capabilities into long-horizon scientific workflows, on which scientific agents can be composed and executed.
arXiv Detail & Related papers (2025-12-23T16:04:41Z) - An Agentic Framework for Autonomous Materials Computation [70.24472585135929]
Large Language Models (LLMs) have emerged as powerful tools for accelerating scientific discovery. Recent advances integrate LLMs into agentic frameworks, enabling retrieval, reasoning, and tool use for complex scientific experiments. Here, we present a domain-specialized agent designed for reliable automation of first-principles materials computations.
arXiv Detail & Related papers (2025-12-22T15:03:57Z) - Simple Agents Outperform Experts in Biomedical Imaging Workflow Optimization [69.36509281190662]
Adapting production-level computer vision tools to bespoke scientific datasets is a critical "last mile" bottleneck. We consider using AI agents to automate this manual coding, and focus on the open question of optimal agent design. We demonstrate that a simple agent framework consistently generates adaptation code that outperforms human-expert solutions.
arXiv Detail & Related papers (2025-12-02T18:42:26Z) - The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution [86.4588675093384]
Toolathlon is a benchmark for language agents offering diverse Apps and tools, realistic environment setup, and reliable execution-based evaluation. This benchmark includes 108 manually sourced or crafted tasks, requiring interaction with multiple Apps over around 20 turns on average to complete. We expect Toolathlon to drive the development of more capable language agents for real-world, long-horizon task execution.
arXiv Detail & Related papers (2025-10-29T17:32:49Z) - MCPVerse: An Expansive, Real-World Benchmark for Agentic Tool Use [72.53177559476704]
We introduce MCPVerse, a real-world benchmark for evaluating agentic tool use. MCPVerse integrates more than 550 real-world, executable tools to create an unprecedented action space exceeding 140k tokens. We benchmarked state-of-the-art LLMs across three modes (Oracle, Standard, and Max-Scale).
arXiv Detail & Related papers (2025-08-22T09:47:53Z) - SciToolAgent: A Knowledge Graph-Driven Scientific Agent for Multi-Tool Integration [39.43814195462455]
SciToolAgent automates hundreds of scientific tools across biology, chemistry, and materials science. The agent also incorporates a comprehensive safety-checking module to ensure responsible and ethical tool usage.
arXiv Detail & Related papers (2025-07-27T13:55:35Z) - LLM Agents Making Agent Tools [2.5529148902034637]
Tool use has turned large language models (LLMs) into powerful agents that can perform complex multi-step tasks. But these tools must be implemented in advance by human developers. We propose ToolMaker, an agentic framework that autonomously transforms papers with code into LLM-compatible tools.
arXiv Detail & Related papers (2025-02-17T11:44:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.