DeepRetro: Retrosynthetic Pathway Discovery using Iterative LLM Reasoning
- URL: http://arxiv.org/abs/2507.07060v1
- Date: Mon, 07 Jul 2025 19:41:39 GMT
- Title: DeepRetro: Retrosynthetic Pathway Discovery using Iterative LLM Reasoning
- Authors: Shreyas Vinaya Sathyanarayana, Rahil Shah, Sharanabasava D. Hiremath, Rishikesh Panda, Rahul Jana, Riya Singh, Rida Irfan, Ashwin Murali, Bharath Ramsundar
- Abstract summary: DeepRetro is an open-source, iterative, hybrid LLM-based retrosynthetic framework. Our approach integrates the strengths of conventional template-based/Monte Carlo tree search tools with the generative power of LLMs in a step-wise, feedback-driven loop. This approach successfully generates novel pathways for complex natural product compounds.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrosynthesis, the identification of precursor molecules for a target compound, is pivotal for synthesizing complex molecules, but faces challenges in discovering novel pathways beyond predefined templates. Recent large language model (LLM) approaches to retrosynthesis have shown promise, but how to harness LLM reasoning capabilities for effective multi-step planning remains an open question. To address this challenge, we introduce DeepRetro, an open-source, iterative, hybrid LLM-based retrosynthetic framework. Our approach integrates the strengths of conventional template-based/Monte Carlo tree search tools with the generative power of LLMs in a step-wise, feedback-driven loop. Initially, synthesis planning is attempted with a template-based engine. If this fails, the LLM proposes single-step retrosynthetic disconnections. Crucially, these suggestions undergo rigorous validity, stability, and hallucination checks before the resulting precursors are recursively fed back into the pipeline for further evaluation. This iterative refinement allows for dynamic pathway exploration and correction. We demonstrate the potential of this pipeline through benchmark evaluations and case studies, showcasing its ability to identify viable and potentially novel retrosynthetic routes. In particular, we develop an interactive graphical user interface that allows expert human chemists to provide human-in-the-loop feedback to the reasoning algorithm. This approach successfully generates novel pathways for complex natural product compounds, demonstrating the potential for iterative LLM reasoning to advance the state of the art in complex chemical syntheses.
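The loop described in the abstract (template-based planning first, LLM single-step proposals as a fallback, checks on each proposal, then recursion on the precursors) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not DeepRetro's actual code: `plan_route`, `template_search`, `llm_propose`, `passes_checks`, and the dictionary route format are all hypothetical names, and the toy callables stand in for a real template engine, a real LLM call, and real validity, stability, and hallucination checks.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical route format: {"target": str, "via": str, "children": [routes]}
Route = Dict[str, object]


def plan_route(
    target: str,
    template_search: Callable[[str], Optional[Route]],
    llm_propose: Callable[[str], List[List[str]]],
    passes_checks: Callable[[str, List[str]], bool],
    max_depth: int = 5,
) -> Optional[Route]:
    """Return a route tree for `target`, or None if none is found."""
    # Step 1: attempt planning with the template-based engine.
    route = template_search(target)
    if route is not None:
        return route
    if max_depth == 0:
        return None

    # Step 2: fall back to LLM-proposed single-step disconnections.
    for precursors in llm_propose(target):
        # Step 3: discard proposals that fail the checks.
        if not passes_checks(target, precursors):
            continue
        # Step 4: feed each precursor back into the same pipeline.
        children = []
        for precursor in precursors:
            sub_route = plan_route(
                precursor, template_search, llm_propose,
                passes_checks, max_depth - 1,
            )
            if sub_route is None:
                break  # this disconnection is a dead end
            children.append(sub_route)
        else:
            return {"target": target, "via": "llm", "children": children}
    return None


# Toy stand-ins (assumptions for illustration only).
SOLVABLE_BY_TEMPLATES = {"B", "C"}


def toy_template_search(molecule: str) -> Optional[Route]:
    if molecule in SOLVABLE_BY_TEMPLATES:
        return {"target": molecule, "via": "template", "children": []}
    return None


def toy_llm_propose(molecule: str) -> List[List[str]]:
    # "X" plays the role of a hallucinated precursor.
    return [["X", "B"], ["B", "C"]] if molecule == "A" else []


def toy_passes_checks(molecule: str, precursors: List[str]) -> bool:
    return "X" not in precursors


route = plan_route("A", toy_template_search, toy_llm_propose, toy_passes_checks)
```

In this toy run the template engine cannot plan "A", the first LLM proposal is rejected by the check, and the second is accepted because both of its precursors are then solved by the template engine. The depth limit is an added safeguard against unbounded recursion and is not something the abstract specifies.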
Related papers
- Large Language Models Transform Organic Synthesis From Reaction Prediction to Automation [3.904238958136483]
Large language models (LLMs) are beginning to reshape how chemists plan and run reactions in organic synthesis. LLMs can propose synthetic routes, forecast reaction outcomes and instruct robots that execute experiments without human supervision. We show how coupling LLMs with graph neural networks, quantum calculations and real-time spectroscopy shrinks discovery cycles and supports greener, data-driven chemistry.
arXiv Detail & Related papers (2025-08-07T14:17:23Z)
- ChemActor: Enhancing Automated Extraction of Chemical Synthesis Actions with LLM-Generated Data [53.78763789036172]
We present ChemActor, a fully fine-tuned large language model (LLM) as a chemical executor to convert between unstructured experimental procedures and structured action sequences. This framework integrates a data selection module that selects data based on distribution divergence, with a general-purpose LLM, to generate machine-executable actions from a single molecule input. Experiments on reaction-to-description (R2D) and description-to-action (D2A) tasks demonstrate that ChemActor achieves state-of-the-art performance, outperforming the baseline model by 10%.
arXiv Detail & Related papers (2025-06-30T05:11:19Z)
- LLM-Augmented Chemical Synthesis and Design Decision Programs [18.41721617026997]
We introduce an efficient scheme for encoding reaction pathways and present a new route-level search strategy. We show that our LLM-augmented approach excels at retrosynthesis planning and extends naturally to the broader challenge of synthesizable molecular design.
arXiv Detail & Related papers (2025-05-11T15:43:00Z)
- Automated Retrosynthesis Planning of Macromolecules Using Large Language Models and Knowledge Graphs [11.191853171170516]
We propose an agent system that integrates large language models (LLMs) and knowledge graphs. Our system fully automates the retrieval of relevant literature, extraction of reaction data, database querying, and construction of retrosynthetic pathway trees. This work represents the first attempt to develop a fully automated, LLM-powered retrosynthesis planning agent tailored specifically for macromolecules.
arXiv Detail & Related papers (2025-01-15T16:06:10Z)
- BatGPT-Chem: A Foundation Large Model For Retrosynthesis Prediction [65.93303145891628]
BatGPT-Chem is a large language model with 15 billion parameters, tailored for enhanced retrosynthesis prediction.
Our model captures a broad spectrum of chemical knowledge, enabling precise prediction of reaction conditions.
This development empowers chemists to adeptly address novel compounds, potentially expediting the innovation cycle in drug manufacturing and materials science.
arXiv Detail & Related papers (2024-08-19T05:17:40Z)
- Retro-prob: Retrosynthetic Planning Based on a Probabilistic Model [5.044138778500218]
Retrosynthesis is a fundamental but challenging task in organic chemistry.
Given a target molecule, the goal of retrosynthesis is to find a series of reactions that can be assembled into a synthetic route.
We propose a new retrosynthetic planning algorithm called retro-prob to maximize the successful synthesis probability of target molecules.
arXiv Detail & Related papers (2024-05-25T08:23:40Z)
- Mind the Retrosynthesis Gap: Bridging the divide between Single-step and Multi-step Retrosynthesis Prediction [0.9134244356393664]
Multi-step approaches repeatedly apply the chemical information stored in single-step retrosynthesis models.
We show that models designed for single-step retrosynthesis, when extended to multi-step, can have a tremendous impact on the route finding capabilities of current multi-step methods.
arXiv Detail & Related papers (2022-12-12T18:06:24Z)
- FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning [58.47265392465442]
Retrosynthetic planning aims to devise a complete multi-step synthetic route from starting materials to a target molecule.
Current strategies use a decoupled approach of single-step retrosynthesis models and search algorithms.
We propose a novel framework that utilizes context information for improved retrosynthetic planning.
arXiv Detail & Related papers (2022-09-30T08:44:58Z)
- Retrosynthetic Planning with Experience-Guided Monte Carlo Tree Search [10.67810457039541]
In retrosynthetic planning, the huge number of possible routes to synthesize a complex molecule leads to an explosion of possibilities.
Current approaches rely on human-defined or machine-trained score functions which have limited chemical knowledge.
We build an experience guidance network to learn knowledge from synthetic experiences during the search.
arXiv Detail & Related papers (2021-12-11T17:14:15Z)
- Self-Improved Retrosynthetic Planning [66.5397931294144]
Retrosynthetic planning is a fundamental problem in chemistry for finding a pathway of reactions to synthesize a target molecule.
Recent search algorithms have shown promising results for solving this problem by using deep neural networks (DNNs).
We propose an end-to-end framework for directly training the DNNs towards generating reaction pathways with the desirable properties.
arXiv Detail & Related papers (2021-06-09T08:03:57Z)
- RetroXpert: Decompose Retrosynthesis Prediction like a Chemist [60.463900712314754]
We devise a novel template-free algorithm for automatic retrosynthetic expansion.
Our method disassembles retrosynthesis into two steps.
While outperforming the state-of-the-art baselines, our model also provides chemically reasonable interpretation.
arXiv Detail & Related papers (2020-11-04T04:35:34Z)
- Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning [75.95376096628135]
We propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design.
In this setup, the agent learns to navigate through the immense synthetically accessible chemical space.
We describe how the end-to-end training in this study represents an important paradigm in radically expanding the synthesizable chemical space.
arXiv Detail & Related papers (2020-04-26T21:40:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.