CASCADE: Cumulative Agentic Skill Creation through Autonomous Development and Evolution
- URL: http://arxiv.org/abs/2512.23880v1
- Date: Mon, 29 Dec 2025 21:50:23 GMT
- Title: CASCADE: Cumulative Agentic Skill Creation through Autonomous Development and Evolution
- Authors: Xu Huang, Junwu Chen, Yuxing Fei, Zhuohan Li, Philippe Schwaller, Gerbrand Ceder
- Abstract summary: Large language model (LLM) agents currently depend on predefined tools or brittle tool generation. We introduce CASCADE, a self-evolving agentic framework representing an early instantiation of the transition from "LLM + tool use" to "LLM + skill acquisition".
- Score: 7.266404572341558
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language model (LLM) agents currently depend on predefined tools or brittle tool generation, constraining their capability and adaptability to complex scientific tasks. We introduce CASCADE, a self-evolving agentic framework representing an early instantiation of the transition from "LLM + tool use" to "LLM + skill acquisition". CASCADE enables agents to master complex external tools and codify knowledge through two meta-skills: continuous learning via web search and code extraction, and self-reflection via introspection and knowledge graph exploration, among others. We evaluate CASCADE on SciSkillBench, a benchmark of 116 materials science and chemistry research tasks. CASCADE achieves a 93.3% success rate using GPT-5, compared to 35.4% without evolution mechanisms. We further demonstrate real-world applications in computational analysis, autonomous laboratory experiments, and selective reproduction of published papers. Along with human-agent collaboration and memory consolidation, CASCADE accumulates executable skills that can be shared across agents and scientists, moving toward scalable AI-assisted scientific research.
Related papers
- S1-NexusAgent: a Self-Evolving Agent Framework for Multidisciplinary Scientific Research [0.0]
We propose S1-NexusAgent, a self-evolving agent framework for scientific research. S1-NexusAgent adopts a hierarchical Plan-and-CodeAct execution paradigm, decoupling global scientific planning from subtask-level tool execution. S1-NexusAgent achieves state-of-the-art generalization performance, validating its effectiveness and capability in complex scientific tasks.
arXiv Detail & Related papers (2026-02-02T02:33:25Z)
- Bohrium + SciMaster: Building the Infrastructure and Ecosystem for Agentic Science at Scale [82.20980951765891]
We argue that scaling agentic science requires an infrastructure-and-ecosystem approach, instantiated as Bohrium+SciMaster. Bohrium acts as a managed, traceable hub for AI4S assets that turns diverse scientific data, software, compute, and laboratory systems into agent-ready capabilities. SciMaster orchestrates these capabilities into long-horizon scientific workflows, on which scientific agents can be composed and executed.
arXiv Detail & Related papers (2025-12-23T16:04:41Z)
- An Agentic Framework for Autonomous Materials Computation [70.24472585135929]
Large Language Models (LLMs) have emerged as powerful tools for accelerating scientific discovery. Recent advances integrate LLMs into agentic frameworks, enabling retrieval, reasoning, and tool use for complex scientific experiments. Here, we present a domain-specialized agent designed for reliable automation of first-principles materials computations.
arXiv Detail & Related papers (2025-12-22T15:03:57Z)
- Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning [84.70211451226835]
Large Language Model (LLM) Agents are constrained by a dependency on human-curated data. We introduce Agent0, a fully autonomous framework that evolves high-performing agents without external data. Agent0 substantially boosts reasoning capabilities, improving the Qwen3-8B-Base model by 18% on mathematical reasoning and 24% on general reasoning benchmarks.
arXiv Detail & Related papers (2025-11-20T05:01:57Z)
- AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite [75.58737079136942]
We present AstaBench, a suite that provides the first holistic measure of agentic ability to perform scientific research. Our suite comes with the first scientific research environment with production-grade search tools. Our evaluation of 57 agents across 22 agent classes reveals several interesting findings.
arXiv Detail & Related papers (2025-10-24T17:10:26Z)
- Democratizing AI scientists using ToolUniverse [32.32301676392716]
In genomics, unified ecosystems have transformed research by enabling interoperability, reuse, and community-driven development. We present ToolUniverse, an ecosystem for building AI scientists from any language or reasoning model across open- and closed-weight models.
arXiv Detail & Related papers (2025-09-27T17:38:53Z)
- SciToolAgent: A Knowledge Graph-Driven Scientific Agent for Multi-Tool Integration [39.43814195462455]
SciToolAgent automates hundreds of scientific tools across biology, chemistry, and materials science. The agent also incorporates a comprehensive safety-checking module to ensure responsible and ethical tool usage.
arXiv Detail & Related papers (2025-07-27T13:55:35Z)
- SciMaster: Towards General-Purpose Scientific AI Agents, Part I. X-Master as Foundation: Can We Lead on Humanity's Last Exam? [51.112225746095746]
We introduce X-Master, a tool-augmented reasoning agent designed to emulate human researchers. X-Masters sets a new state-of-the-art record on Humanity's Last Exam with a score of 32.1%.
arXiv Detail & Related papers (2025-07-07T17:50:52Z)
- STELLA: Self-Evolving LLM Agent for Biomedical Research [40.841136388072385]
We introduce STELLA, a self-evolving AI agent designed to overcome limitations. STELLA employs a multi-agent architecture that autonomously improves its own capabilities. We demonstrate that STELLA achieves state-of-the-art accuracy on a suite of biomedical benchmarks.
arXiv Detail & Related papers (2025-07-01T20:52:01Z)
- Kolb-Based Experiential Learning for Generalist Agents with Human-Level Kaggle Data Science Performance [81.05882480184587]
We propose a computational framework of Kolb's learning cycle with Vygotsky's ZPD for autonomous agents. With performance at the level of 9 gold, 8 silver, and 12 bronze medals, including 4 gold and 4 silver on prize-awarding competitions, Agent K is the first AI system to successfully integrate Kolb- and Vygotsky-inspired human cognitive learning.
arXiv Detail & Related papers (2024-11-05T23:55:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.