ASKCOS: an open source software suite for synthesis planning
- URL: http://arxiv.org/abs/2501.01835v1
- Date: Fri, 03 Jan 2025 14:38:03 GMT
- Title: ASKCOS: an open source software suite for synthesis planning
- Authors: Zhengkai Tu, Sourabh J. Choure, Mun Hong Fong, Jihye Roh, Itai Levin, Kevin Yu, Joonyoung F. Joung, Nathan Morgan, Shih-Cheng Li, Xiaoqi Sun, Huiqian Lin, Mark Murnin, Jordan P. Liles, Thomas J. Struble, Michael E. Fortunato, Mengjie Liu, William H. Green, Klavs F. Jensen, Connor W. Coley
- Abstract summary: We detail the newest version of ASKCOS, an open source software suite for synthesis planning. Four one-step retrosynthesis models form the basis of both interactive planning and automatic planning modes. It is our belief that CASP tools like ASKCOS are an important part of modern chemistry research.
- Score: 7.245299433003954
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advancement of machine learning and the availability of large-scale reaction datasets have accelerated the development of data-driven models for computer-aided synthesis planning (CASP) in the past decade. Here, we detail the newest version of ASKCOS, an open source software suite for synthesis planning that makes available several research advances in a freely available, practical tool. Four one-step retrosynthesis models form the basis of both interactive planning and automatic planning modes. Retrosynthetic planning is complemented by other modules for feasibility assessment and pathway evaluation, including reaction condition recommendation, reaction outcome prediction, and auxiliary capabilities such as solubility prediction and quantum mechanical descriptor prediction. ASKCOS has assisted hundreds of medicinal, synthetic, and process chemists in their day-to-day tasks, complementing expert decision making. It is our belief that CASP tools like ASKCOS are an important part of modern chemistry research, and that they offer ever-increasing utility and accessibility.
Related papers
- Large Language Models Transform Organic Synthesis From Reaction Prediction to Automation [3.904238958136483]
Large language models (LLMs) are beginning to reshape how chemists plan and run reactions in organic synthesis. LLMs can propose synthetic routes, forecast reaction outcomes and instruct robots that execute experiments without human supervision. We show how coupling LLMs with graph neural networks, quantum calculations and real-time spectroscopy shrinks discovery cycles and supports greener, data-driven chemistry.
arXiv Detail & Related papers (2025-08-07T14:17:23Z)
- Fast and scalable retrosynthetic planning with a transformer neural network and speculative beam search [25.069344340760715]
We propose a method for accelerating multi-step synthesis planning systems that rely on SMILES-to-SMILES transformers as single-step retrosynthesis models. Our approach reduces the latency of SMILES-to-SMILES transformers powering multi-step synthesis planning in AiZynthFinder through speculative beam search combined with a scalable drafting strategy called Medusa.
arXiv Detail & Related papers (2025-08-02T18:30:06Z)
- WebShaper: Agentically Data Synthesizing via Information-Seeking Formalization [68.46693401421923]
WebShaper systematically formalizes IS tasks through set theory. WebShaper achieves state-of-the-art performance among open-sourced IS agents on GAIA and WebWalkerQA benchmarks.
arXiv Detail & Related papers (2025-07-20T17:53:37Z)
- ChemActor: Enhancing Automated Extraction of Chemical Synthesis Actions with LLM-Generated Data [53.78763789036172]
We present ChemActor, a fully fine-tuned large language model (LLM) as a chemical executor to convert between unstructured experimental procedures and structured action sequences. This framework integrates a data selection module that selects data based on distribution divergence, with a general-purpose LLM, to generate machine-executable actions from a single molecule input. Experiments on reaction-to-description (R2D) and description-to-action (D2A) tasks demonstrate that ChemActor achieves state-of-the-art performance, outperforming the baseline model by 10%.
arXiv Detail & Related papers (2025-06-30T05:11:19Z)
- A User-Tunable Machine Learning Framework for Step-Wise Synthesis Planning [9.502407569651321]
MHNpath is a machine learning-driven retrosynthetic tool for computer-aided synthesis planning.
We demonstrate its effectiveness through case studies involving complex molecules from ChemByDesign.
Our case studies reveal that the tool can generate shorter, cheaper, moderate-temperature routes employing green solvents.
arXiv Detail & Related papers (2025-04-03T00:23:21Z)
- Towards Fully-Automated Materials Discovery via Large-Scale Synthesis Dataset and Expert-Level LLM-as-a-Judge [6.500470477634259]
Our work aims to support the materials science community by providing a practical, data-driven resource.
We have curated a comprehensive dataset of 17K expert-verified synthesis recipes from open-access literature.
AlchemicalBench offers an end-to-end framework that supports research in large language models applied to synthesis prediction.
arXiv Detail & Related papers (2025-02-23T06:16:23Z)
- BatGPT-Chem: A Foundation Large Model For Retrosynthesis Prediction [65.93303145891628]
BatGPT-Chem is a large language model with 15 billion parameters, tailored for enhanced retrosynthesis prediction.
Our model captures a broad spectrum of chemical knowledge, enabling precise prediction of reaction conditions.
This development empowers chemists to adeptly address novel compounds, potentially expediting the innovation cycle in drug manufacturing and materials science.
arXiv Detail & Related papers (2024-08-19T05:17:40Z)
- Double-Ended Synthesis Planning with Goal-Constrained Bidirectional Search [27.09693306892583]
We present a formulation of synthesis planning with starting material constraints.
We propose Double-Ended Synthesis Planning (DESP), a novel CASP algorithm under a bidirectional graph search scheme.
DESP can make use of existing one-step retrosynthesis models, and we anticipate its performance to scale as these one-step model capabilities improve.
arXiv Detail & Related papers (2024-07-08T18:56:00Z)
- An Autonomous Large Language Model Agent for Chemical Literature Data Mining [60.85177362167166]
We introduce an end-to-end AI agent framework capable of high-fidelity extraction from extensive chemical literature.
Our framework's efficacy is evaluated using accuracy, recall, and F1 score of reaction condition data.
arXiv Detail & Related papers (2024-02-20T13:21:46Z)
- Chemist-X: Large Language Model-empowered Agent for Reaction Condition Recommendation in Chemical Synthesis [57.70772230913099]
Chemist-X automates the reaction condition recommendation (RCR) task in chemical synthesis with retrieval-augmented generation (RAG) technology.
Chemist-X interrogates online molecular databases and distills critical data from the latest literature database.
Chemist-X considerably reduces chemists' workload and allows them to focus on more fundamental and creative problems.
arXiv Detail & Related papers (2023-11-16T01:21:33Z)
- FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning [58.47265392465442]
Retrosynthetic planning aims to devise a complete multi-step synthetic route from starting materials to a target molecule.
Current strategies use a decoupled approach of single-step retrosynthesis models and search algorithms.
We propose a novel framework that utilizes context information for improved retrosynthetic planning.
arXiv Detail & Related papers (2022-09-30T08:44:58Z)
- SynKB: Semantic Search for Synthetic Procedures [9.360528362635215]
We present SynKB, an open-source, automatically extracted knowledge base of chemical synthesis protocols.
Similar to proprietary chemistry databases such as Reaxys, SynKB allows chemists to retrieve structured knowledge about synthetic procedures.
arXiv Detail & Related papers (2022-08-15T18:33:16Z)
- Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages [97.58968222942173]
We take the first step to synthesize C programs from input-output examples.
In particular, we propose LaSynth, which learns the latent representation to approximate the execution of partially generated programs.
We show that training on these synthesized programs further improves the prediction performance for both Karel and C program synthesis.
arXiv Detail & Related papers (2021-06-29T02:21:32Z)
- Toward Neural-Network-Guided Program Synthesis and Verification [26.706421573322952]
We propose a novel framework of program and invariant synthesis called neural network-guided synthesis.
We first show that, by designing and training neural networks, we can extract logical formulas over integers from the weights and biases of the trained neural networks.
Based on the idea, we have implemented a tool to synthesize formulas from positive/negative examples and implication constraints.
arXiv Detail & Related papers (2021-03-17T03:09:05Z)
- RetroXpert: Decompose Retrosynthesis Prediction like a Chemist [60.463900712314754]
We devise a novel template-free algorithm for automatic retrosynthetic expansion.
Our method disassembles retrosynthesis into two steps.
While outperforming the state-of-the-art baselines, our model also provides chemically reasonable interpretation.
arXiv Detail & Related papers (2020-11-04T04:35:34Z)
- Predictive Synthesis of Quantum Materials by Probabilistic Reinforcement Learning [1.4680035572775534]
We use reinforcement learning to predict optimal synthesis schedules for a prototypical quantum material, semiconducting monolayer MoS$_2$.
The model can be extended to predict profiles for synthesis of complex structures including multi-phase heterostructures.
arXiv Detail & Related papers (2020-09-14T20:50:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.