An Agentic Framework for Autonomous Materials Computation
- URL: http://arxiv.org/abs/2512.19458v1
- Date: Mon, 22 Dec 2025 15:03:57 GMT
- Title: An Agentic Framework for Autonomous Materials Computation
- Authors: Zeyu Xia, Jinzhe Ma, Congjie Zheng, Shufei Zhang, Yuqiang Li, Hang Su, P. Hu, Changshui Zhang, Xingao Gong, Wanli Ouyang, Lei Bai, Dongzhan Zhou, Mao Su
- Abstract summary: Large Language Models (LLMs) have emerged as powerful tools for accelerating scientific discovery. Recent advances integrate LLMs into agentic frameworks, enabling retrieval, reasoning, and tool use for complex scientific workflows. Here, we present a domain-specialized agent designed for reliable automation of first-principles materials computations.
- Score: 70.24472585135929
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have emerged as powerful tools for accelerating scientific discovery, yet their static knowledge and hallucination issues hinder autonomous research applications. Recent advances integrate LLMs into agentic frameworks, enabling retrieval, reasoning, and tool use for complex scientific workflows. Here, we present a domain-specialized agent designed for reliable automation of first-principles materials computations. By embedding domain expertise, the agent ensures physically coherent multi-step workflows and consistently selects convergent, well-posed parameters, thereby enabling reliable end-to-end computational execution. A new benchmark of diverse computational tasks demonstrates that our system significantly outperforms standalone LLMs in both accuracy and robustness. This work establishes a verifiable foundation for autonomous computational experimentation and represents a key step toward fully automated scientific discovery.
Related papers
- QUASAR: A Universal Autonomous System for Atomistic Simulation and a Benchmark of Its Capabilities [0.7519872646378835]
QUASAR is a universal autonomous system for atomistic simulation designed to facilitate production-grade scientific discovery. We benchmark QUASAR against a series of three-tiered tasks, progressing from routine tasks to frontier research challenges such as photocatalyst screening and novel material assessment. Results suggest that QUASAR can function as a general atomistic reasoning system rather than a task-specific automation framework.
arXiv Detail & Related papers (2026-01-30T05:29:44Z) - Towards Agentic Intelligence for Materials Science [73.4576385477731]
This survey advances a unique pipeline-centric view that spans from corpus curation and pretraining to goal-conditioned agents interfacing with simulation and experimental platforms. To bridge communities and establish a shared frame of reference, we first present an integrated lens that aligns terminology, evaluation, and workflow stages across AI and materials science.
arXiv Detail & Related papers (2026-01-29T23:48:43Z) - A Cloud-based Multi-Agentic Workflow for Science [0.12314765641075438]
Large Language Models (LLMs) have become ubiquitous across various scientific domains. Their inability to perform complex tasks like running simulations or to make complex decisions limits their utility. We present a domain-agnostic, model-independent workflow for an agentic framework that can act as a scientific assistant while running entirely in the cloud.
arXiv Detail & Related papers (2026-01-18T22:37:09Z) - SelfAI: Building a Self-Training AI System with LLM Agents [79.10991818561907]
SelfAI is a general multi-agent platform that combines a User Agent for translating high-level research objectives into standardized experimental configurations. An Experiment Manager orchestrates parallel, fault-tolerant training across heterogeneous hardware while maintaining a structured knowledge base for continuous feedback. Across regression, computer vision, scientific computing, medical imaging, and drug discovery benchmarks, SelfAI consistently achieves strong performance and reduces redundant trials.
arXiv Detail & Related papers (2025-11-29T09:18:39Z) - AutoLabs: Cognitive Multi-Agent Systems with Self-Correction for Autonomous Chemical Experimentation [0.10999592665107412]
AutoLabs is a self-correcting, multi-agent architecture designed to autonomously translate natural-language instructions into executable protocols. We present a comprehensive evaluation framework featuring five benchmark experiments of increasing complexity. Our results demonstrate that agent reasoning capacity is the most critical factor for success.
arXiv Detail & Related papers (2025-09-30T01:51:46Z) - xOffense: An AI-driven autonomous penetration testing framework with offensive knowledge-enhanced LLMs and multi agent systems [0.402058998065435]
xOffense is an AI-driven, multi-agent penetration testing framework. It shifts the process from labor-intensive, expert-driven manual efforts to fully automated, machine-executable workflows that scale seamlessly with computational infrastructure.
arXiv Detail & Related papers (2025-09-16T12:45:45Z) - SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents [93.26456498576181]
This paper focuses on the development of native Autonomous Single-Agent models for Deep Research. Our best variant SFR-DR-20B achieves up to 28.7% on Humanity's Last Exam benchmark.
arXiv Detail & Related papers (2025-09-08T02:07:09Z) - GridMind: LLMs-Powered Agents for Power System Analysis and Operations [3.7568206336846663]
This paper presents a multi-agent AI system that integrates Large Language Models (LLMs) with deterministic engineering solvers to enable conversational scientific computing for power system analysis. GridMind addresses workflow integration, knowledge accessibility, context preservation, and expert decision-support augmentation. This work establishes agentic AI as a viable paradigm for scientific computing, demonstrating how conversational interfaces can enhance accessibility while preserving numerical rigor essential for critical engineering applications.
arXiv Detail & Related papers (2025-09-02T16:42:18Z) - An LLM-enabled Multi-Agent Autonomous Mechatronics Design Framework [49.633199780510864]
This work proposes a multi-agent autonomous mechatronics design framework, integrating expertise across mechanical design, optimization, electronics, and software engineering. Operating primarily through a language-driven workflow, the framework incorporates structured human feedback to ensure robust performance under real-world constraints. A fully functional autonomous vessel was developed with optimized propulsion, cost-effective electronics, and advanced control.
arXiv Detail & Related papers (2025-04-20T16:57:45Z) - LLM Agents Making Agent Tools [2.5529148902034637]
Tool use has turned large language models (LLMs) into powerful agents that can perform complex multi-step tasks. But these tools must be implemented in advance by human developers. We propose ToolMaker, an agentic framework that autonomously transforms papers with code into LLM-compatible tools.
arXiv Detail & Related papers (2025-02-17T11:44:11Z) - Interpreting and Improving Large Language Models in Arithmetic Calculation [72.19753146621429]
Large language models (LLMs) have demonstrated remarkable potential across numerous applications. In this work, we delve into uncovering a specific mechanism by which LLMs execute calculations. We investigate the potential benefits of selectively fine-tuning these essential heads/MLPs to boost the LLMs' computational performance.
arXiv Detail & Related papers (2024-09-03T07:01:46Z) - Collaboration Dynamics and Reliability Challenges of Multi-Agent LLM Systems in Finite Element Analysis [3.437656066916039]
How interagent dynamics influence reasoning quality and verification reliability remains unclear. We study these mechanisms using an AutoGen-based multi-agent framework for linear-elastic Finite Element Analysis (FEA). From 1,120 controlled trials, we find that collaboration effectiveness depends more on functional complementarity than team size.
arXiv Detail & Related papers (2024-08-23T23:11:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.