Chemist-X: Large Language Model-empowered Agent for Reaction Condition Recommendation in Chemical Synthesis
- URL: http://arxiv.org/abs/2311.10776v5
- Date: Thu, 4 Apr 2024 10:57:56 GMT
- Title: Chemist-X: Large Language Model-empowered Agent for Reaction Condition Recommendation in Chemical Synthesis
- Authors: Kexin Chen, Junyou Li, Kunyi Wang, Yuyang Du, Jiahui Yu, Jiamin Lu, Lanqing Li, Jiezhong Qiu, Jianzhang Pan, Yi Huang, Qun Fang, Pheng Ann Heng, Guangyong Chen,
- Abstract summary: Chemist-X automates the reaction condition recommendation (RCR) task in chemical synthesis with retrieval-augmented generation (RAG) technology.
Chemist-X interrogates online molecular databases and distills critical data from the latest literature database.
Chemist-X considerably reduces chemists' workload and allows them to focus on more fundamental and creative problems.
- Score: 57.70772230913099
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent AI research plots a promising future of automatic chemical reactions within the chemistry society. This study proposes Chemist-X, a transformative AI agent that automates the reaction condition recommendation (RCR) task in chemical synthesis with retrieval-augmented generation (RAG) technology. To emulate expert chemists' strategies when solving RCR tasks, Chemist-X utilizes advanced RAG schemes to interrogate online molecular databases and distill critical data from the latest literature database. Further, the agent leverages state-of-the-art computer-aided design (CAD) tools with a large language model (LLM) supervised programming interface. With the ability to utilize updated chemical knowledge and CAD tools, our agent significantly outperforms conventional synthesis AIs confined to the fixed knowledge within its training data. Chemist-X considerably reduces chemists' workload and allows them to focus on more fundamental and creative problems, thereby bringing closer computational techniques and chemical research and making a remarkable leap toward harnessing AI's full capabilities in scientific discovery.
Related papers
- Text-Augmented Multimodal LLMs for Chemical Reaction Condition Recommendation [50.639325453203504]
MM-RCR is a text-augmented multimodal LLM that learns a unified reaction representation from SMILES, reaction graphs, and textual corpus for chemical reaction recommendation (RCR)
Our results demonstrate that MM-RCR achieves state-of-the-art performance on two open benchmark datasets.
arXiv Detail & Related papers (2024-07-21T12:27:26Z) - A Review of Large Language Models and Autonomous Agents in Chemistry [0.7184549921674758]
Large language models (LLMs) are emerging as a powerful tool in chemistry across multiple domains.
A core emerging idea is combining LLMs with chemistry-specific tools like synthesis planners and databases, leading to so-called "agents"
An emerging direction is the development of multi-agent systems using a human-in-the-loop approach.
arXiv Detail & Related papers (2024-06-26T17:33:21Z) - Contextual Molecule Representation Learning from Chemical Reaction
Knowledge [24.501564702095937]
We introduce REMO, a self-supervised learning framework that takes advantage of well-defined atom-combination rules in common chemistry.
REMO pre-trains graph/Transformer encoders on 1.7 million known chemical reactions in the literature.
arXiv Detail & Related papers (2024-02-21T12:58:40Z) - An Autonomous Large Language Model Agent for Chemical Literature Data
Mining [60.85177362167166]
We introduce an end-to-end AI agent framework capable of high-fidelity extraction from extensive chemical literature.
Our framework's efficacy is evaluated using accuracy, recall, and F1 score of reaction condition data.
arXiv Detail & Related papers (2024-02-20T13:21:46Z) - ReactIE: Enhancing Chemical Reaction Extraction with Weak Supervision [27.850325653751078]
structured chemical reaction information plays a vital role for chemists engaged in laboratory work and advanced endeavors such as computer-aided drug design.
Despite the importance of extracting structured reactions from scientific literature, data annotation for this purpose is cost-prohibitive due to the significant labor required from domain experts.
We propose ReactIE, which combines two weakly supervised approaches for pre-training. Our method utilizes frequent patterns within the text as linguistic cues to identify specific characteristics of chemical reactions.
arXiv Detail & Related papers (2023-07-04T02:52:30Z) - ChemCrow: Augmenting large-language models with chemistry tools [0.9195187117013247]
Large-language models (LLMs) have shown strong performance in tasks across domains, but struggle with chemistry-related problems.
In this study, we introduce ChemCrow, an LLM chemistry agent designed to accomplish tasks across organic synthesis, drug discovery, and materials design.
Our agent autonomously planned and executed the syntheses of an insect repellent, three organocatalysts, and guided the discovery of a novel chromophore.
arXiv Detail & Related papers (2023-04-11T17:41:13Z) - ChemVise: Maximizing Out-of-Distribution Chemical Detection with the
Novel Application of Zero-Shot Learning [60.02503434201552]
This research proposes learning approximations of complex exposures from training sets of simple ones.
We demonstrate this approach to synthetic sensor responses surprisingly improves the detection of out-of-distribution obscured chemical analytes.
arXiv Detail & Related papers (2023-02-09T20:19:57Z) - Improving Molecular Representation Learning with Metric
Learning-enhanced Optimal Transport [49.237577649802034]
We develop a novel optimal transport-based algorithm termed MROT to enhance their generalization capability for molecular regression problems.
MROT significantly outperforms state-of-the-art models, showing promising potential in accelerating the discovery of new substances.
arXiv Detail & Related papers (2022-02-13T04:56:18Z) - Learning To Navigate The Synthetically Accessible Chemical Space Using
Reinforcement Learning [75.95376096628135]
We propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design.
In this setup, the agent learns to navigate through the immense synthetically accessible chemical space.
We describe how the end-to-end training in this study represents an important paradigm in radically expanding the synthesizable chemical space.
arXiv Detail & Related papers (2020-04-26T21:40:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.