Related papers: CASCADE: Cumulative Agentic Skill Creation through Autonomous Development and Evolution

URL: http://arxiv.org/abs/2512.23880v1
Date: Mon, 29 Dec 2025 21:50:23 GMT
Title: CASCADE: Cumulative Agentic Skill Creation through Autonomous Development and Evolution
Authors: Xu Huang, Junwu Chen, Yuxing Fei, Zhuohan Li, Philippe Schwaller, Gerbrand Ceder
Abstract summary: Large language model (LLM) agents currently depend on predefined tools or brittle tool generation. We introduce CASCADE, a self-evolving agentic framework representing an early instantiation of the transition from "LLM + tool use" to "LLM + skill acquisition".
Score: 7.266404572341558
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language model (LLM) agents currently depend on predefined tools or brittle tool generation, constraining their capability and adaptability to complex scientific tasks. We introduce CASCADE, a self-evolving agentic framework representing an early instantiation of the transition from "LLM + tool use" to "LLM + skill acquisition". CASCADE enables agents to master complex external tools and codify knowledge through two meta-skills: continuous learning via web search and code extraction, and self-reflection via introspection and knowledge graph exploration, among others. We evaluate CASCADE on SciSkillBench, a benchmark of 116 materials science and chemistry research tasks. CASCADE achieves a 93.3% success rate using GPT-5, compared to 35.4% without evolution mechanisms. We further demonstrate real-world applications in computational analysis, autonomous laboratory experiments, and selective reproduction of published papers. Along with human-agent collaboration and memory consolidation, CASCADE accumulates executable skills that can be shared across agents and scientists, moving toward scalable AI-assisted scientific research.

Related papers

S1-NexusAgent: a Self-Evolving Agent Framework for Multidisciplinary Scientific Research [0.0]
We propose S1-NexusAgent, a self-evolving agent framework for scientific research. S1-NexusAgent adopts a hierarchical Plan-and-CodeAct execution paradigm, decoupling global scientific planning from subtask-level tool execution. S1-NexusAgent achieves state-of-the-art generalization performance, validating its effectiveness and capability in complex scientific tasks.
arXiv Detail & Related papers (2026-02-02T02:33:25Z)
Bohrium + SciMaster: Building the Infrastructure and Ecosystem for Agentic Science at Scale [82.20980951765891]
We argue that scaling agentic science requires an infrastructure-and-ecosystem approach, instantiated as Bohrium+SciMaster. Bohrium acts as a managed, traceable hub for AI4S assets that turns diverse scientific data, software, compute, and laboratory systems into agent-ready capabilities. SciMaster orchestrates these capabilities into long-horizon scientific workflows, on which scientific agents can be composed and executed.
arXiv Detail & Related papers (2025-12-23T16:04:41Z)
An Agentic Framework for Autonomous Materials Computation [70.24472585135929]
Large Language Models (LLMs) have emerged as powerful tools for accelerating scientific discovery. Recent advances integrate LLMs into agentic frameworks, enabling retrieval, reasoning, and tool use for complex scientific experiments. Here, we present a domain-specialized agent designed for reliable automation of first-principles materials computations.
arXiv Detail & Related papers (2025-12-22T15:03:57Z)
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning [84.70211451226835]
Large Language Model (LLM) Agents are constrained by a dependency on human-curated data. We introduce Agent0, a fully autonomous framework that evolves high-performing agents without external data. Agent0 substantially boosts reasoning capabilities, improving the Qwen3-8B-Base model by 18% on mathematical reasoning and 24% on general reasoning benchmarks.
arXiv Detail & Related papers (2025-11-20T05:01:57Z)
AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite [75.58737079136942]
We present AstaBench, a suite that provides the first holistic measure of agentic ability to perform scientific research. Our suite comes with the first scientific research environment with production-grade search tools. Our evaluation of 57 agents across 22 agent classes reveals several interesting findings.
arXiv Detail & Related papers (2025-10-24T17:10:26Z)
Democratizing AI scientists using ToolUniverse [32.32301676392716]
In genomics, unified ecosystems have transformed research by enabling interoperability, reuse, and community-driven development. We present ToolUniverse, an ecosystem for building AI scientists from any language or reasoning model across open- and closed-weight models.
arXiv Detail & Related papers (2025-09-27T17:38:53Z)
SciToolAgent: A Knowledge Graph-Driven Scientific Agent for Multi-Tool Integration [39.43814195462455]
SciToolAgent automates hundreds of scientific tools across biology, chemistry, and materials science. The agent also incorporates a comprehensive safety-checking module to ensure responsible and ethical tool usage.
arXiv Detail & Related papers (2025-07-27T13:55:35Z)
SciMaster: Towards General-Purpose Scientific AI Agents, Part I. X-Master as Foundation: Can We Lead on Humanity's Last Exam? [51.112225746095746]
We introduce X-Master, a tool-augmented reasoning agent designed to emulate human researchers. X-Masters sets a new state-of-the-art record on Humanity's Last Exam with a score of 32.1%.
arXiv Detail & Related papers (2025-07-07T17:50:52Z)
STELLA: Self-Evolving LLM Agent for Biomedical Research [40.841136388072385]
We introduce STELLA, a self-evolving AI agent designed to overcome limitations. STELLA employs a multi-agent architecture that autonomously improves its own capabilities. We demonstrate that STELLA achieves state-of-the-art accuracy on a suite of biomedical benchmarks.
arXiv Detail & Related papers (2025-07-01T20:52:01Z)
Kolb-Based Experiential Learning for Generalist Agents with Human-Level Kaggle Data Science Performance [81.05882480184587]
We propose a computational framework of Kolb's learning cycle with Vygotsky's ZPD for autonomous agents. With performance at the level of 9 gold, 8 silver, and 12 bronze medals, including 4 gold and 4 silver on prize-awarding competitions, Agent K is the first AI system to successfully integrate Kolb- and Vygotsky-inspired human cognitive learning.
arXiv Detail & Related papers (2024-11-05T23:55:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.