ChemCrow: Augmenting large-language models with chemistry tools
- URL: http://arxiv.org/abs/2304.05376v5
- Date: Mon, 2 Oct 2023 17:03:01 GMT
- Title: ChemCrow: Augmenting large-language models with chemistry tools
- Authors: Andres M Bran, Sam Cox, Oliver Schilter, Carlo Baldassari, Andrew D
White, Philippe Schwaller
- Abstract summary: Large-language models (LLMs) have shown strong performance in tasks across domains, but struggle with chemistry-related problems.
In this study, we introduce ChemCrow, an LLM chemistry agent designed to accomplish tasks across organic synthesis, drug discovery, and materials design.
Our agent autonomously planned and executed the syntheses of an insect repellent, three organocatalysts, and guided the discovery of a novel chromophore.
- Score: 0.9195187117013247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the last decades, excellent computational chemistry tools have been
developed. Integrating them into a single platform with enhanced accessibility
could help reach their full potential by overcoming steep learning curves.
Recently, large-language models (LLMs) have shown strong performance in tasks
across domains, but struggle with chemistry-related problems. Moreover, these
models lack access to external knowledge sources, limiting their usefulness in
scientific applications. In this study, we introduce ChemCrow, an LLM chemistry
agent designed to accomplish tasks across organic synthesis, drug discovery,
and materials design. By integrating 18 expert-designed tools, ChemCrow
augments the LLM performance in chemistry, and new capabilities emerge. Our
agent autonomously planned and executed the syntheses of an insect repellent,
three organocatalysts, and guided the discovery of a novel chromophore. Our
evaluation, including both LLM and expert assessments, demonstrates ChemCrow's
effectiveness in automating a diverse set of chemical tasks. Surprisingly, we
find that GPT-4 as an evaluator cannot distinguish between clearly wrong GPT-4
completions and ChemCrow's performance. Our work not only aids expert chemists
and lowers barriers for non-experts, but also fosters scientific advancement by
bridging the gap between experimental and computational chemistry.
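The abstract describes the core pattern: an LLM that reasons step by step and dispatches to expert-designed tools until it can answer. A minimal sketch of such a ReAct-style loop is below; the two toy tools and the scripted "model" are illustrative stand-ins, not ChemCrow's actual tool set or API (the real system prompts GPT-4 with tool descriptions and wraps software such as RDKit and synthesis planners):

```python
import re

# Hypothetical stand-ins for two of ChemCrow's expert tools.
def mol_weight(smiles: str) -> str:
    # Toy lookup; a real tool would compute this with RDKit.
    table = {"CCO": "46.07 g/mol", "O": "18.02 g/mol"}
    return table.get(smiles, "unknown molecule")

def name_to_smiles(name: str) -> str:
    table = {"ethanol": "CCO", "water": "O"}
    return table.get(name.lower(), "unknown name")

TOOLS = {"MolWeight": mol_weight, "NameToSMILES": name_to_smiles}

def scripted_llm(transcript: str) -> str:
    """Stub for the LLM: emits ReAct-style actions, then a final answer."""
    if "NameToSMILES" not in transcript:
        return "Action: NameToSMILES[ethanol]"
    if "MolWeight" not in transcript:
        smiles = transcript.strip().splitlines()[-1].split(": ")[-1]
        return f"Action: MolWeight[{smiles}]"
    return "Final Answer: " + transcript.strip().splitlines()[-1].split(": ")[-1]

def run_agent(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = scripted_llm(transcript)
        if reply.startswith("Final Answer:"):
            return reply.removeprefix("Final Answer: ").strip()
        match = re.match(r"Action: (\w+)\[(.*)\]", reply)
        tool, arg = match.group(1), match.group(2)
        observation = TOOLS[tool](arg)  # dispatch to the chosen tool
        transcript += f"\n{reply}\nObservation: {observation}"
    return "max steps reached"
```

The loop alternates model output ("Action: Tool[input]") with tool observations appended to the transcript, which is how tool use augments the base model: the LLM plans, while the tools supply chemistry knowledge it lacks.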
Related papers
- A Review of Large Language Models and Autonomous Agents in Chemistry [0.7184549921674758]
Large language models (LLMs) are emerging as a powerful tool in chemistry across multiple domains.
A core emerging idea is combining LLMs with chemistry-specific tools like synthesis planners and databases, leading to so-called "agents".
An emerging direction is the development of multi-agent systems using a human-in-the-loop approach.
arXiv Detail & Related papers (2024-06-26T17:33:21Z) - Are large language models superhuman chemists? [5.1611032009738205]
"ChemBench" is an automated framework designed to rigorously evaluate the chemical knowledge and reasoning abilities of state-of-the-art models.
We curated more than 7,000 question-answer pairs for a wide array of subfields of the chemical sciences.
We found that the best models outperformed the best human chemists in our study on average.
arXiv Detail & Related papers (2024-04-01T20:56:25Z) - An Autonomous Large Language Model Agent for Chemical Literature Data Mining [60.85177362167166]
We introduce an end-to-end AI agent framework capable of high-fidelity extraction from extensive chemical literature.
Our framework's efficacy is evaluated using accuracy, recall, and F1 score of reaction condition data.
arXiv Detail & Related papers (2024-02-20T13:21:46Z) - LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset [13.063678216852473]
We show that large language models (LLMs) can achieve very strong results on a comprehensive set of chemistry tasks.
We propose SMolInstruct, a large-scale, comprehensive, and high-quality dataset for instruction tuning.
Using SMolInstruct, we fine-tune a set of open-source LLMs, among which, we find that Mistral serves as the best base model for chemistry tasks.
arXiv Detail & Related papers (2024-02-14T18:42:25Z) - ChemLLM: A Chemical Large Language Model [49.308528569982805]
Large language models (LLMs) have made impressive progress in chemistry applications.
However, the community lacks an LLM specifically designed for chemistry.
Here, we introduce ChemLLM, a comprehensive framework that features the first LLM dedicated to chemistry.
arXiv Detail & Related papers (2024-02-10T01:11:59Z) - Structured Chemistry Reasoning with Large Language Models [70.13959639460015]
Large Language Models (LLMs) excel in diverse areas, yet struggle with complex scientific reasoning, especially in chemistry.
We introduce StructChem, a simple yet effective prompting strategy that offers the desired guidance and substantially boosts the LLMs' chemical reasoning capability.
In tests across four chemistry areas (quantum chemistry, mechanics, physical chemistry, and kinetics), StructChem substantially enhances GPT-4's performance, with up to a 30% peak improvement.
arXiv Detail & Related papers (2023-11-16T08:20:36Z) - Chemist-X: Large Language Model-empowered Agent for Reaction Condition Recommendation in Chemical Synthesis [57.70772230913099]
Chemist-X automates the reaction condition recommendation (RCR) task in chemical synthesis with retrieval-augmented generation (RAG) technology.
Chemist-X interrogates online molecular databases and distills critical data from the latest literature database.
Chemist-X considerably reduces chemists' workload and allows them to focus on more fundamental and creative problems.
arXiv Detail & Related papers (2023-11-16T01:21:33Z) - Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective [53.300288393173204]
Large Language Models (LLMs) have shown remarkable performance in various cross-modal tasks.
In this work, we propose an In-context Few-Shot Molecule Learning paradigm for molecule-caption translation.
We evaluate the effectiveness of MolReGPT on molecule-caption translation, including molecule understanding and text-based molecule generation.
arXiv Detail & Related papers (2023-06-11T08:16:25Z) - What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks [41.9830989458936]
Large Language Models (LLMs) with strong abilities in natural language processing tasks have emerged.
We aim to evaluate capabilities of LLMs in a wide range of tasks across the chemistry domain.
arXiv Detail & Related papers (2023-05-27T14:17:33Z) - Federated Learning of Molecular Properties in a Heterogeneous Setting [79.00211946597845]
We introduce federated heterogeneous molecular learning to address these challenges.
Federated learning allows end-users to build a global model collaboratively while preserving the training data distributed over isolated clients.
FedChem should enable a new type of collaboration for improving AI in chemistry that mitigates concerns about valuable chemical data.
arXiv Detail & Related papers (2021-09-15T12:49:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.