Toward Greater Autonomy in Materials Discovery Agents: Unifying Planning, Physics, and Scientists
- URL: http://arxiv.org/abs/2506.05616v2
- Date: Mon, 09 Jun 2025 17:27:38 GMT
- Title: Toward Greater Autonomy in Materials Discovery Agents: Unifying Planning, Physics, and Scientists
- Authors: Lianhao Zhou, Hongyi Ling, Keqiang Yan, Kaiji Zhao, Xiaoning Qian, Raymundo Arróyave, Xiaofeng Qian, Shuiwang Ji
- Abstract summary: MAPPS consists of a Planner, a Tool Code Generator, and a Scientific Mediator. By unifying planning, physics, and scientists, MAPPS enables flexible and reliable materials discovery with greater autonomy.
- Score: 46.884317494606776
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We aim at designing language agents with greater autonomy for crystal materials discovery. While most existing studies restrict the agents to perform specific tasks within predefined workflows, we aim to automate workflow planning given high-level goals and scientist intuition. To this end, we propose Materials Agent unifying Planning, Physics, and Scientists, known as MAPPS. MAPPS consists of a Workflow Planner, a Tool Code Generator, and a Scientific Mediator. The Workflow Planner uses large language models (LLMs) to generate structured and multi-step workflows. The Tool Code Generator synthesizes executable Python code for various tasks, including invoking a force field foundation model that encodes physics. The Scientific Mediator coordinates communications, facilitates scientist feedback, and ensures robustness through error reflection and recovery. By unifying planning, physics, and scientists, MAPPS enables flexible and reliable materials discovery with greater autonomy, achieving a five-fold improvement in stability, uniqueness, and novelty rates compared with prior generative models when evaluated on the MP-20 data. We provide extensive experiments across diverse tasks to show that MAPPS is a promising framework for autonomous materials discovery.
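The abstract describes a plan, generate, execute, reflect interaction among the three components but gives no code; the sketch below is a hypothetical illustration of that loop. Every name in it (Step, call_llm, run_sandboxed, mapps_like_loop) is invented here for illustration and is not taken from MAPPS.

```python
# Hypothetical sketch of the plan -> generate -> execute -> reflect loop
# described in the abstract. All names below are illustrative, not MAPPS's API.
from dataclasses import dataclass

@dataclass
class Step:
    description: str          # one step of the multi-step workflow
    code: str = ""            # Python synthesized by the tool code generator
    result: object = None

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (any chat-completion client)."""
    raise NotImplementedError

def run_sandboxed(code: str) -> tuple[bool, object]:
    """Placeholder for executing generated code (e.g., invoking a force-field
    foundation model to relax a candidate crystal); returns (ok, output)."""
    raise NotImplementedError

def mapps_like_loop(goal: str, scientist_feedback=input, max_retries: int = 3):
    # Workflow Planner: turn a high-level goal into structured steps.
    plan_text = call_llm(f"Decompose this materials-discovery goal into steps:\n{goal}")
    steps = [Step(line) for line in plan_text.splitlines() if line.strip()]

    for step in steps:
        for _ in range(max_retries):
            # Tool Code Generator: synthesize executable Python for this step.
            step.code = call_llm(f"Write Python code for: {step.description}")
            ok, output = run_sandboxed(step.code)
            if ok:
                step.result = output
                break
            # Error reflection: feed the failure back into the next attempt.
            step.description += f"\nPrevious attempt failed with: {output}"
        else:
            # Scientific Mediator: escalate to the scientist when recovery fails.
            step.description = scientist_feedback(f"Step failed: {step.description}\nRevise it: ")
    return steps
```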
Related papers
- A Survey of AI for Materials Science: Foundation Models, LLM Agents, Datasets, and Tools [15.928285656168422]
Foundation models (FMs) are enabling scalable, general-purpose, and multimodal AI systems for scientific discovery.
This survey provides a comprehensive overview of foundation models, agentic systems, datasets, and computational tools supporting this growing field.
arXiv Detail & Related papers (2025-06-25T18:10:30Z)
- ChemGraph: An Agentic Framework for Computational Chemistry Workflows [0.0]
ChemGraph is an agentic framework powered by artificial intelligence and state-of-the-art simulation tools.
Users can perform tasks such as molecular structure generation, single-point energy, geometry optimization, vibrational analysis, and thermochemistry calculations.
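ChemGraph's own interface is not shown in the summary; for orientation, the kinds of tasks it orchestrates look roughly like the following when run directly with ASE, using the toy EMT calculator purely for illustration.

```python
# Geometry optimization and single-point energy with ASE's toy EMT calculator.
# This illustrates the kind of task ChemGraph automates; it is not ChemGraph's API.
from ase import Atoms
from ase.calculators.emt import EMT
from ase.optimize import BFGS

atoms = Atoms("N2", positions=[[0.0, 0.0, 0.0], [0.0, 0.0, 1.1]])
atoms.calc = EMT()                        # swap in a DFT or ML potential in practice

BFGS(atoms, logfile=None).run(fmax=0.05)  # relax the geometry
energy = atoms.get_potential_energy()     # single-point energy of the relaxed structure
print(f"Relaxed energy: {energy:.3f} eV")
```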
arXiv Detail & Related papers (2025-06-03T21:11:56Z)
- LAM SIMULATOR: Advancing Data Generation for Large Action Model Training via Online Exploration and Trajectory Feedback [121.78866929908871]
Large Action Models (LAMs) for AI Agents offer incredible potential but face challenges due to the need for high-quality training data.
We present LAM SIMULATOR, a comprehensive framework designed for online exploration of agentic tasks with high-quality feedback.
Our framework features a dynamic task query generator, an extensive collection of tools, and an interactive environment where Large Language Model (LLM) Agents can call tools and receive real-time feedback.
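The summary gives no code; a minimal, hypothetical sketch of the explore-and-record loop it describes might look like this, with agent and env standing in for an LLM agent and the interactive tool environment (all names invented for illustration).

```python
# Hypothetical sketch of the explore-and-record loop suggested by the summary:
# an LLM agent calls tools, receives feedback, and the trajectory becomes
# candidate training data for a Large Action Model. Names are illustrative only.
def collect_trajectory(agent, env, task_query: str, max_turns: int = 10) -> list[dict]:
    observation = env.reset(task_query)      # task from the dynamic query generator
    trajectory = []
    for _ in range(max_turns):
        tool_call = agent.act(observation)   # LLM picks a tool and its arguments
        observation, feedback, done = env.step(tool_call)  # real-time feedback
        trajectory.append({"tool_call": tool_call, "feedback": feedback})
        if done:
            break
    return trajectory
```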
arXiv Detail & Related papers (2025-06-02T22:36:02Z)
- ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows [82.07367406991678]
Large Language Models (LLMs) have extended their impact beyond Natural Language Processing.
Among these, computer-using agents are capable of interacting with operating systems as humans do.
We introduce ScienceBoard, which encompasses a realistic, multi-domain environment featuring dynamic and visually rich scientific software.
arXiv Detail & Related papers (2025-05-26T12:27:27Z)
- LLM Agents Making Agent Tools [2.5529148902034637]
Tool use has turned large language models (LLMs) into powerful agents that can perform complex multi-step tasks.
But these tools must be implemented in advance by human developers.
We propose ToolMaker, an agentic framework that autonomously transforms papers with code into LLM-compatible tools.
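The summary does not define what an "LLM-compatible tool" looks like in ToolMaker; in practice this usually means a callable plus a JSON-schema-style function description, as in the hypothetical sketch below (the wrapped relax_structure function and its schema are invented for illustration, not ToolMaker's output).

```python
# Illustrative only: wrapping a repository function as an LLM-callable tool.
import json

def relax_structure(cif_path: str, fmax: float = 0.05) -> dict:
    """Hypothetical entry point exposed by some paper's codebase."""
    raise NotImplementedError

TOOL_SPEC = {
    "name": "relax_structure",
    "description": "Relax a crystal structure from a CIF file and return its energy.",
    "parameters": {                      # JSON-Schema style, as used in function calling
        "type": "object",
        "properties": {
            "cif_path": {"type": "string", "description": "Path to the input CIF file"},
            "fmax": {"type": "number", "description": "Force convergence threshold (eV/Å)"},
        },
        "required": ["cif_path"],
    },
}

def dispatch(tool_call_json: str):
    """Route an LLM's tool call (a JSON string) to the wrapped Python function."""
    call = json.loads(tool_call_json)
    assert call["name"] == TOOL_SPEC["name"]
    return relax_structure(**call["arguments"])
```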
arXiv Detail & Related papers (2025-02-17T11:44:11Z)
- MAPS: Advancing Multi-Modal Reasoning in Expert-Level Physical Science [62.96434290874878]
Current Multi-Modal Large Language Models (MLLMs) have shown strong capabilities in general visual reasoning tasks.
We develop a new framework, named Multi-Modal Scientific Reasoning with Physics Perception and Simulation (MAPS), based on an MLLM.
MAPS decomposes an expert-level multi-modal reasoning task into physical diagram understanding via a Physical Perception Model (PPM) and reasoning with physical knowledge via a simulator.
arXiv Detail & Related papers (2025-01-18T13:54:00Z)
- Autonomous Microscopy Experiments through Large Language Model Agents [4.241267255764773]
Large language models (LLMs) have accelerated the development of self-driving laboratories (SDLs) for materials research.
Here, we introduce AILA (Artificially Intelligent Lab Assistant), a framework that automates atomic force microscopy (AFM) through LLM-driven agents.
Our systematic assessment shows that state-of-the-art language models struggle even with basic tasks such as documentation retrieval.
arXiv Detail & Related papers (2024-12-18T09:35:28Z)
- MatPilot: an LLM-enabled AI Materials Scientist under the Framework of Human-Machine Collaboration [13.689620109856783]
We developed an AI materials scientist named MatPilot, which has shown encouraging abilities in the discovery of new materials.
The core strength of MatPilot is its natural language interactive human-machine collaboration.
MatPilot integrates the unique cognitive abilities, extensive accumulated experience, and ongoing curiosity of human beings.
arXiv Detail & Related papers (2024-11-10T12:23:44Z)
- PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents [6.6004056020499355]
In practice, obtaining high-quality ptychographic images requires simultaneous optimization of numerous experimental and algorithmic parameters.
In this work, we develop a framework that leverages large language models (LLMs) to automate data analysis in ptychography.
Our study demonstrates that PEAR's multi-agent design significantly improves the workflow success rate, even with smaller open-weight models.
arXiv Detail & Related papers (2024-10-11T17:50:59Z)
- Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? [73.81908518992161]
We introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering.
Spider2-V features real-world tasks in authentic computer environments and incorporates 20 enterprise-level professional applications.
These tasks evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems.
arXiv Detail & Related papers (2024-07-15T17:54:37Z)
- LLMatDesign: Autonomous Materials Discovery with Large Language Models [5.481299708562135]
New materials can have significant scientific and technological implications.
Recent advances in machine learning have enabled data-driven methods to rapidly screen or generate promising materials.
We introduce LLMatDesign, a novel framework for interpretable materials design powered by large language models.
arXiv Detail & Related papers (2024-06-19T02:35:02Z)
- Octopus: Embodied Vision-Language Programmer from Environmental Feedback [58.04529328728999]
Embodied vision-language models (VLMs) have achieved substantial progress in multimodal perception and reasoning.
To bridge the gap between high-level planning and concrete manipulation, we introduce Octopus, an embodied vision-language programmer that uses executable code generation as a medium to connect the two.
Octopus is designed to 1) proficiently comprehend an agent's visual and textual task objectives, 2) formulate intricate action sequences, and 3) generate executable code.
arXiv Detail & Related papers (2023-10-12T17:59:58Z)
- MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation [96.71370747681078]
We introduce MLAgentBench, a suite of 13 tasks ranging from improving model performance on CIFAR-10 to recent research problems like BabyLM.
For each task, an agent can perform actions like reading/writing files, executing code, and inspecting outputs.
We benchmark agents based on Claude v1.0, Claude v2.1, Claude v3 Opus, GPT-4, GPT-4-turbo, Gemini-Pro, and Mixtral and find that a Claude v3 Opus agent is the best in terms of success rate.
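The benchmark's exact action API is not reproduced here; the following is a minimal sketch of the kind of read/write/execute workspace such an agent interacts with, with the Workspace class and its method names invented for illustration.

```python
# Minimal sketch of a file/execute action interface of the kind the benchmark
# exposes to agents; names here are illustrative, not MLAgentBench's API.
import subprocess
from pathlib import Path

class Workspace:
    def __init__(self, root: str):
        self.root = Path(root)

    def read_file(self, name: str) -> str:
        return (self.root / name).read_text()

    def write_file(self, name: str, content: str) -> None:
        (self.root / name).write_text(content)

    def execute(self, script: str, timeout: int = 600) -> str:
        """Run a script and return its output for the agent to inspect."""
        proc = subprocess.run(
            ["python", script], cwd=self.root,
            capture_output=True, text=True, timeout=timeout,
        )
        return proc.stdout + proc.stderr
```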
arXiv Detail & Related papers (2023-10-05T04:06:12Z)
- pymdp: A Python library for active inference in discrete state spaces [52.85819390191516]
pymdp is an open-source package for simulating active inference in Python.
We provide the first open-source package for simulating active inference with POMDPs.
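pymdp is a real package; the snippet below follows the basic agent workflow its documentation describes (specify A and B arrays, then infer states, infer policies, and sample an action), though helper names and return values may differ slightly across versions.

```python
# Toy active-inference loop with pymdp (one observation modality, one state factor).
from pymdp import utils
from pymdp.agent import Agent

num_obs, num_states, num_controls = [3], [3], [3]

A = utils.random_A_matrix(num_obs, num_states)        # likelihood P(o | s)
B = utils.random_B_matrix(num_states, num_controls)   # transitions P(s' | s, u)

agent = Agent(A=A, B=B)

obs = [0]                          # index of the observed outcome
qs = agent.infer_states(obs)       # posterior beliefs over hidden states
q_pi, G = agent.infer_policies()   # policy posterior and expected free energies
action = agent.sample_action()     # action sampled from the policy posterior
print(action)
```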
arXiv Detail & Related papers (2022-01-11T12:18:44Z)