AOT*: Efficient Synthesis Planning via LLM-Empowered AND-OR Tree Search
- URL: http://arxiv.org/abs/2509.20988v1
- Date: Thu, 25 Sep 2025 10:30:37 GMT
- Title: AOT*: Efficient Synthesis Planning via LLM-Empowered AND-OR Tree Search
- Authors: Xiaozhuang Song, Xuanhao Pan, Xinjian Zhao, Hangting Ye, Shufei Zhang, Jian Tang, Tianshu Yu
- Abstract summary: AOT* is a framework that transforms retrosynthetic planning by integrating LLM-generated chemical synthesis pathways with systematic AND-OR tree search. AOT* exhibits competitive solve rates using 3-5$\times$ fewer iterations than existing LLM-based approaches.
- Score: 22.026497456502806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrosynthesis planning enables the discovery of viable synthetic routes for target molecules, playing a crucial role in domains like drug discovery and materials design. Multi-step retrosynthetic planning remains computationally challenging due to exponential search spaces and inference costs. While Large Language Models (LLMs) demonstrate chemical reasoning capabilities, their application to synthesis planning faces constraints on efficiency and cost. To address these challenges, we introduce AOT*, a framework that transforms retrosynthetic planning by integrating LLM-generated chemical synthesis pathways with systematic AND-OR tree search. To this end, AOT* atomically maps the generated complete synthesis routes onto AND-OR tree components, with a mathematically sound design of reward assignment strategy and retrieval-based context engineering, thus enabling LLMs to efficiently navigate in the chemical space. Experimental evaluation on multiple synthesis benchmarks demonstrates that AOT* achieves SOTA performance with significantly improved search efficiency. AOT* exhibits competitive solve rates using 3-5$\times$ fewer iterations than existing LLM-based approaches, with the efficiency advantage becoming more pronounced on complex molecular targets.
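To make the AND-OR tree structure mentioned in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation: the names `MolNode`, `RxnNode`, and `add_route`, the route format (a list of `(product, reactants)` steps), and the placeholder molecule labels are all assumptions made for illustration. Molecules are modeled as OR nodes (any one reaction suffices) and reactions as AND nodes (every reactant must be solved). The sketch omits reward assignment and retrieval entirely and assumes routes contain no cycles.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MolNode:
    """OR node: a molecule is solved if it is purchasable or any reaction producing it is solved."""
    label: str
    purchasable: bool = False
    reactions: list[RxnNode] = field(default_factory=list)

    def solved(self) -> bool:
        return self.purchasable or any(r.solved() for r in self.reactions)


@dataclass
class RxnNode:
    """AND node: a reaction is solved only if all of its reactants are solved."""
    reactants: list[MolNode]

    def solved(self) -> bool:
        return all(m.solved() for m in self.reactants)


def add_route(nodes: dict[str, MolNode], route, stock: set[str]) -> None:
    """Map a complete route onto the tree (hypothetical format: [(product, [reactants]), ...]).

    Assumes the route is acyclic; shared molecules are reused through `nodes`.
    """
    for product, reactants in route:
        parent = nodes.setdefault(product, MolNode(product, product in stock))
        children = [nodes.setdefault(r, MolNode(r, r in stock)) for r in reactants]
        parent.reactions.append(RxnNode(children))


# Usage with placeholder labels: target T is made from A and B, and B is made from C.
nodes: dict[str, MolNode] = {}
add_route(nodes, [("T", ["A", "B"]), ("B", ["C"])], stock={"A", "C"})
target_solved = nodes["T"].solved()
```

In this toy setup the target counts as solved only when every leaf of at least one route is in the stock set, which is the usual solve criterion for AND-OR search in retrosynthesis.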
Related papers
- When Single Answer Is Not Enough: Rethinking Single-Step Retrosynthesis Benchmarks for LLMs [3.973137925060284]
We propose a new benchmarking framework for single-step retrosynthesis. By emphasizing plausibility over exact match, this approach better aligns with human synthesis planning practices. We also introduce CREED, a novel dataset comprising millions of ChemCensor-validated reaction records for LLM training.
arXiv Detail & Related papers (2026-02-03T14:03:32Z)
- Synthelite: Chemist-aligned and feasibility-aware synthesis planning with LLMs [3.7129661557601854]
We introduce Synthelite, a synthesis planning framework that uses large language models to propose retrosynthetic transformations. Synthelite can generate end-to-end synthesis routes by harnessing the intrinsic chemical knowledge and reasoning capabilities of LLMs. Our experiments demonstrate that Synthelite can flexibly adapt its planning trajectory to diverse user-specified constraints, achieving up to 95% success rates.
arXiv Detail & Related papers (2025-12-18T11:24:30Z)
- ChemOrch: Empowering LLMs with Chemical Intelligence via Synthetic Instructions [52.79349601462865]
ChemOrch is a framework that synthesizes chemically grounded instruction-response pairs. ChemOrch enables controllable diversity and levels of difficulty for the generated tasks.
arXiv Detail & Related papers (2025-09-20T05:43:58Z)
- Rethinking Molecule Synthesizability with Chain-of-Reaction [47.744071119775676]
We introduce ReaSyn, a generative framework for synthesizable projection. We propose a novel perspective that views synthetic pathways akin to reasoning paths in large language models (LLMs). With the CoR notation, ReaSyn can get dense supervision in every reaction step to explicitly learn chemical reaction rules.
arXiv Detail & Related papers (2025-09-19T15:29:57Z)
- ChemActor: Enhancing Automated Extraction of Chemical Synthesis Actions with LLM-Generated Data [53.78763789036172]
We present ChemActor, a fully fine-tuned large language model (LLM) as a chemical executor to convert between unstructured experimental procedures and structured action sequences. This framework integrates a data selection module that selects data based on distribution divergence, with a general-purpose LLM, to generate machine-executable actions from a single molecule input. Experiments on reaction-to-description (R2D) and description-to-action (D2A) tasks demonstrate that ChemActor achieves state-of-the-art performance, outperforming the baseline model by 10%.
arXiv Detail & Related papers (2025-06-30T05:11:19Z)
- LLM-Augmented Chemical Synthesis and Design Decision Programs [18.41721617026997]
We introduce an efficient scheme for encoding reaction pathways and present a new route-level search strategy. We show that our LLM-augmented approach excels at retrosynthesis planning and extends naturally to the broader challenge of synthesizable molecular design.
arXiv Detail & Related papers (2025-05-11T15:43:00Z)
- A User-Tunable Machine Learning Framework for Step-Wise Synthesis Planning [9.502407569651321]
MHNpath is a machine learning-driven retrosynthetic tool for computer-aided synthesis planning. We demonstrate its effectiveness through case studies involving complex molecules from ChemByDesign. Our case studies reveal that the tool can generate shorter, cheaper, moderate-temperature routes employing green solvents.
arXiv Detail & Related papers (2025-04-03T00:23:21Z)
- Automated Retrosynthesis Planning of Macromolecules Using Large Language Models and Knowledge Graphs [11.191853171170516]
We propose an agent system that integrates large language models (LLMs) and knowledge graphs. Our system fully automates the retrieval of relevant literature, extraction of reaction data, database querying, and construction of retrosynthetic pathway trees. This work represents the first attempt to develop a fully automated retrosynthesis planning agent tailored specifically for macromolecules and powered by LLMs.
arXiv Detail & Related papers (2025-01-15T16:06:10Z)
- Tango*: Constrained synthesis planning using chemically informed value functions [1.6787839854263589]
We introduce a simple guided search which allows solving the starting material-constrained synthesis planning problem. We find the Tango* cost function catalyses strong improvements for the bidirectional DESP methods. Our method achieves lower wall clock times while proposing synthetic routes of similar length, a common metric for route quality.
arXiv Detail & Related papers (2024-12-04T16:14:02Z)
- BatGPT-Chem: A Foundation Large Model For Retrosynthesis Prediction [65.93303145891628]
BatGPT-Chem is a large language model with 15 billion parameters, tailored for enhanced retrosynthesis prediction. Our model captures a broad spectrum of chemical knowledge, enabling precise prediction of reaction conditions. This development empowers chemists to adeptly address novel compounds, potentially expediting the innovation cycle in drug manufacturing and materials science.
arXiv Detail & Related papers (2024-08-19T05:17:40Z)
- ChemMiner: A Large Language Model Agent System for Chemical Literature Data Mining [56.15126714863963]
ChemMiner is an end-to-end framework for extracting chemical data from literature. ChemMiner incorporates three specialized agents: a text analysis agent for coreference mapping, a multimodal agent for non-textual information extraction, and a synthesis analysis agent for data generation. Experimental results demonstrate reaction identification rates comparable to human chemists while significantly reducing processing time, with high accuracy, recall, and F1 scores.
arXiv Detail & Related papers (2024-02-20T13:21:46Z)
- FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning [58.47265392465442]
Retrosynthetic planning aims to devise a complete multi-step synthetic route from starting materials to a target molecule. Current strategies use a decoupled approach of single-step retrosynthesis models and search algorithms. We propose a novel framework that utilizes context information for improved retrosynthetic planning.
arXiv Detail & Related papers (2022-09-30T08:44:58Z)
- Retrosynthetic Planning with Experience-Guided Monte Carlo Tree Search [10.67810457039541]
In retrosynthetic planning, the huge number of possible routes to synthesize a complex molecule leads to an explosion of possibilities. Current approaches rely on human-defined or machine-trained score functions which have limited chemical knowledge. We build an experience guidance network to learn knowledge from synthetic experiences during the search.
arXiv Detail & Related papers (2021-12-11T17:14:15Z)
- Retro*: Learning Retrosynthetic Planning with Neural Guided A* Search [83.22850633478302]
Retrosynthetic planning identifies a series of reactions that can lead to the synthesis of a target product. Existing methods either require expensive return estimation by rollout with high variance, or optimize for search speed rather than the quality. We propose Retro*, a neural-based A*-like algorithm that finds high-quality synthetic routes efficiently.
arXiv Detail & Related papers (2020-06-29T05:53:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.