MDCrow: Automating Molecular Dynamics Workflows with Large Language Models
- URL: http://arxiv.org/abs/2502.09565v1
- Date: Thu, 13 Feb 2025 18:19:20 GMT
- Title: MDCrow: Automating Molecular Dynamics Workflows with Large Language Models
- Authors: Quintina Campbell, Sam Cox, Jorge Medina, Brittany Watterson, Andrew D. White,
- Abstract summary: We introduce MDCrow, an agentic LLM assistant capable of automating Molecular dynamics simulations.
We assess MDCrow's performance across 25 tasks of varying required subtasks and difficulty, and we evaluate the agent's robustness to both difficulty and prompt style.
- Score: 0.6130124744675498
- License:
- Abstract: Molecular dynamics (MD) simulations are essential for understanding biomolecular systems but remain challenging to automate. Recent advances in large language models (LLM) have demonstrated success in automating complex scientific tasks using LLM-based agents. In this paper, we introduce MDCrow, an agentic LLM assistant capable of automating MD workflows. MDCrow uses chain-of-thought over 40 expert-designed tools for handling and processing files, setting up simulations, analyzing the simulation outputs, and retrieving relevant information from literature and databases. We assess MDCrow's performance across 25 tasks of varying required subtasks and difficulty, and we evaluate the agent's robustness to both difficulty and prompt style. \texttt{gpt-4o} is able to complete complex tasks with low variance, followed closely by \texttt{llama3-405b}, a compelling open-source model. While prompt style does not influence the best models' performance, it has significant effects on smaller models.
Related papers
- Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark [62.58869921806019]
We propose a task decomposition evaluation framework based on GPT-4o to automatically construct a new training dataset.
We design innovative training strategies to effectively distill GPT-4o's evaluation capabilities into a 7B open-source MLLM, MiniCPM-V-2.6.
Experimental results demonstrate that our distilled open-source MLLM significantly outperforms the current state-of-the-art GPT-4o-base baseline.
arXiv Detail & Related papers (2024-11-23T08:06:06Z) - MDCure: A Scalable Pipeline for Multi-Document Instruction-Following [40.201087646516335]
We introduce MDCure, a scalable and effective fine-tuning pipeline to enhance the MD capabilities of LLMs.
MDCure is based on generation of high-quality synthetic MD instruction data from sets of related articles via targeted prompts.
We also introduce MDCureRM, a multi-objective reward model which filters generated data based on their training utility for MD settings.
arXiv Detail & Related papers (2024-10-30T21:08:07Z) - MetaOpenFOAM: an LLM-based multi-agent framework for CFD [11.508919041921942]
MetaOpenFOAM is a novel multi-agent collaborations framework.
It aims to complete CFD simulation tasks with only natural language as input.
It harnesses the power of MetaGPT's assembly line paradigm.
arXiv Detail & Related papers (2024-07-31T04:01:08Z) - MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic [6.46176287368784]
We propose textbfModel textbfExclusive textbfTask textbfArithmetic for merging textbfGPT-scale models.
Our proposed MetaGPT is data-agnostic and bypasses the heavy search process, making it cost-effective and easy to implement for LLMs.
arXiv Detail & Related papers (2024-06-17T10:12:45Z) - Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning [56.82041895921434]
Open-source pre-trained Large Language Models (LLMs) exhibit strong language understanding and generation capabilities.
When used as agents for dealing with complex problems in the real world, their performance is far inferior to large commercial models such as ChatGPT and GPT-4.
arXiv Detail & Related papers (2024-03-29T03:48:12Z) - MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization [86.61052121715689]
MatPlotAgent is a model-agnostic framework designed to automate scientific data visualization tasks.
MatPlotBench is a high-quality benchmark consisting of 100 human-verified test cases.
arXiv Detail & Related papers (2024-02-18T04:28:28Z) - Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the textttLlama2-7b-chat model on nine different languages from the MUST-C dataset.
The results show that LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z) - Diffusion Model is an Effective Planner and Data Synthesizer for
Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (textscMTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find textscMTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z) - AutoML-GPT: Automatic Machine Learning with GPT [74.30699827690596]
We propose developing task-oriented prompts and automatically utilizing large language models (LLMs) to automate the training pipeline.
We present the AutoML-GPT, which employs GPT as the bridge to diverse AI models and dynamically trains models with optimized hyper parameters.
This approach achieves remarkable results in computer vision, natural language processing, and other challenging areas.
arXiv Detail & Related papers (2023-05-04T02:09:43Z) - Automatic Componentwise Boosting: An Interpretable AutoML System [1.1709030738577393]
We propose an AutoML system that constructs an interpretable additive model that can be fitted using a highly scalable componentwise boosting algorithm.
Our system provides tools for easy model interpretation such as visualizing partial effects and pairwise interactions.
Despite its restriction to an interpretable model space, our system is competitive in terms of predictive performance on most data sets.
arXiv Detail & Related papers (2021-09-12T18:34:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.