Synthesizable by Design: A Retrosynthesis-Guided Framework for Molecular Analog Generation
- URL: http://arxiv.org/abs/2507.02752v1
- Date: Thu, 03 Jul 2025 16:14:57 GMT
- Title: Synthesizable by Design: A Retrosynthesis-Guided Framework for Molecular Analog Generation
- Authors: Shuan Chen, Gunwook Nam, Yousung Jung
- Abstract summary: We introduce SynTwins, a novel retrosynthesis-guided molecular analog design framework. In comparative evaluations, SynTwins demonstrates superior performance in generating synthetically accessible analogs. Our benchmarking across diverse molecular datasets demonstrates that SynTwins effectively bridges the gap between computational design and experimental synthesis.
- Score: 0.5852077003870417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The disconnect between AI-generated molecules with desirable properties and their synthetic feasibility remains a critical bottleneck in computational drug and material discovery. While generative AI has accelerated the proposal of candidate molecules, many of these structures prove challenging or impossible to synthesize using established chemical reactions. Here, we introduce SynTwins, a novel retrosynthesis-guided molecular analog design framework that designs synthetically accessible molecular analogs by emulating expert chemist strategies through a three-step process: retrosynthesis, similar building block searching, and virtual synthesis. In comparative evaluations, SynTwins demonstrates superior performance in generating synthetically accessible analogs compared to state-of-the-art machine learning models while maintaining high structural similarity to original target molecules. Furthermore, when integrated with existing molecule optimization frameworks, our hybrid approach produces synthetically feasible molecules with property profiles comparable to those of unconstrained molecule generators, yet with synthesizability ensured. Our comprehensive benchmarking across diverse molecular datasets demonstrates that SynTwins effectively bridges the gap between computational design and experimental synthesis, providing a practical solution for accelerating the discovery of synthesizable molecules with desired properties for a wide range of applications.
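The three-step process named in the abstract (retrosynthesis, similar building block searching, virtual synthesis) can be illustrated with a toy sketch. Everything in it is a hypothetical stand-in chosen for illustration and is not the SynTwins implementation: molecules are plain strings joined by "-", the catalog is invented, the functions `retro`, `nearest_block`, and `make_analog` are made-up names, and a character-bigram Tanimoto score replaces a real chemical fingerprint similarity.

```python
# Toy sketch of a retrosynthesis -> block search -> virtual synthesis loop.
# All names and data are hypothetical; strings stand in for molecules.

def bigrams(s: str) -> set[str]:
    """Character bigrams, used here as a stand-in for a molecular fingerprint."""
    return {s[i:i + 2] for i in range(len(s) - 1)} or {s}

def tanimoto(a: str, b: str) -> float:
    """Tanimoto (Jaccard) similarity between the bigram sets of two strings."""
    fa, fb = bigrams(a), bigrams(b)
    return len(fa & fb) / len(fa | fb)

def retro(target: str) -> list[str]:
    """Step 1 (toy retrosynthesis): cut the target at its '-' separators."""
    return target.split("-")

def nearest_block(fragment: str, catalog: list[str]) -> str:
    """Step 2: choose the catalog entry most similar to the fragment."""
    return max(catalog, key=lambda block: tanimoto(fragment, block))

def make_analog(target: str, catalog: list[str]) -> str:
    """Step 3 (toy virtual synthesis): rejoin the substituted blocks."""
    return "-".join(nearest_block(f, catalog) for f in retro(target))

# Invented catalog of "purchasable" blocks and an invented target.
CATALOG = ["benzene", "pyridine", "ethanol", "acetamide"]
analog = make_analog("benzyne-ethanal", CATALOG)
print(analog)
```

With this invented catalog, each fragment of the target is swapped for its closest catalog entry, so the result is built only from available blocks while staying close to the original string.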
Related papers
- SynCoGen: Synthesizable 3D Molecule Generation via Joint Reaction and Coordinate Modeling [29.856853267388924]
We present SynCoGen, a framework that combines masked graph diffusion and flow matching for synthesizable 3D molecule generation. To train the model, we curated SynSpace, a dataset containing over 600K synthesis-aware building block graphs and 3.3M conformers.
arXiv Detail & Related papers (2025-07-16T00:36:35Z)
- SynLlama: Generating Synthesizable Molecules and Their Analogs with Large Language Models [2.4120602995529317]
We present a novel approach by fine-tuning Meta's Llama3 Large Language Models to create SynLlama. SynLlama generates full synthetic pathways made of commonly accessible building blocks and robust organic reaction templates. We find that SynLlama, even without training on external building blocks, can effectively generalize to unseen yet purchasable building blocks.
arXiv Detail & Related papers (2025-03-16T18:30:56Z)
- Generative Artificial Intelligence for Navigating Synthesizable Chemical Space [25.65907958071386]
We introduce SynFormer, a generative modeling framework designed to efficiently explore and navigate synthesizable chemical space.
By incorporating a scalable transformer architecture and a diffusion module for building block selection, SynFormer surpasses existing models in synthesizable molecular design.
arXiv Detail & Related papers (2024-10-04T15:09:05Z)
- SynthFormer: Equivariant Pharmacophore-based Generation of Synthesizable Molecules for Ligand-Based Drug Design [19.578382119811238]
We introduce SynthFormer, a novel machine learning model that generates fully synthesizable molecules, structured as synthetic trees, by introducing both 3D information and pharmacophores as input. It is a first-of-its-kind approach that could provide capabilities for designing active molecules based on pharmacophores.
arXiv Detail & Related papers (2024-10-03T17:38:46Z)
- Procedural Synthesis of Synthesizable Molecules [22.905205379063148]
Design of synthetically accessible molecules and recommending analogs to unsynthesizable molecules are important problems for accelerating molecular discovery. We reconceptualize both problems using ideas from program synthesis. We create a bilevel framework for reasoning about the space of synthesis pathways.
arXiv Detail & Related papers (2024-08-24T04:32:36Z)
- BatGPT-Chem: A Foundation Large Model For Retrosynthesis Prediction [65.93303145891628]
BatGPT-Chem is a large language model with 15 billion parameters, tailored for enhanced retrosynthesis prediction.
Our model captures a broad spectrum of chemical knowledge, enabling precise prediction of reaction conditions.
This development empowers chemists to adeptly address novel compounds, potentially expediting the innovation cycle in drug manufacturing and materials science.
arXiv Detail & Related papers (2024-08-19T05:17:40Z)
- RGFN: Synthesizable Molecular Generation Using GFlowNets [51.33672611338754]
We propose Reaction-GFlowNet, an extension of the GFlowNet framework that operates directly in the space of chemical reactions.
RGFN allows out-of-the-box synthesizability while maintaining comparable quality of generated candidates.
We demonstrate the effectiveness of the proposed approach across a range of oracle models, including pretrained proxy models and GPU-accelerated docking.
arXiv Detail & Related papers (2024-06-01T13:11:11Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
- RetroXpert: Decompose Retrosynthesis Prediction like a Chemist [60.463900712314754]
We devise a novel template-free algorithm for automatic retrosynthetic expansion.
Our method disassembles retrosynthesis into two steps.
While outperforming the state-of-the-art baselines, our model also provides chemically reasonable interpretation.
arXiv Detail & Related papers (2020-11-04T04:35:34Z)
- Learning Graph Models for Retrosynthesis Prediction [90.15523831087269]
Retrosynthesis prediction is a fundamental problem in organic synthesis.
This paper introduces a graph-based approach that capitalizes on the idea that the graph topology of precursor molecules is largely unaltered during a chemical reaction.
Our model achieves a top-1 accuracy of 53.7%, outperforming previous template-free and semi-template-based methods.
arXiv Detail & Related papers (2020-06-12T09:40:42Z)
- Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning [75.95376096628135]
We propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design.
In this setup, the agent learns to navigate through the immense synthetically accessible chemical space.
We describe how the end-to-end training in this study represents an important paradigm in radically expanding the synthesizable chemical space.
arXiv Detail & Related papers (2020-04-26T21:40:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.