RETRO SYNFLOW: Discrete Flow Matching for Accurate and Diverse Single-Step Retrosynthesis
- URL: http://arxiv.org/abs/2506.04439v1
- Date: Wed, 04 Jun 2025 20:46:05 GMT
- Title: RETRO SYNFLOW: Discrete Flow Matching for Accurate and Diverse Single-Step Retrosynthesis
- Authors: Robin Yadav, Qi Yan, Guy Wolf, Avishek Joey Bose, Renjie Liao
- Abstract summary: We model single-step retrosynthesis planning and introduce RETRO SYNFLOW (RSF), a discrete flow-matching framework. We employ Feynman-Kac steering with Sequential Monte Carlo based resampling to steer promising generations at inference.
- Score: 23.422202032748924
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: A fundamental problem in organic chemistry is identifying and predicting the series of reactions that synthesize a desired target product molecule. Due to the combinatorial nature of the chemical search space, single-step reactant prediction -- i.e. single-step retrosynthesis -- remains challenging even for existing state-of-the-art template-free generative approaches to produce an accurate yet diverse set of feasible reactions. In this paper, we model single-step retrosynthesis planning and introduce RETRO SYNFLOW (RSF), a discrete flow-matching framework that builds a Markov bridge between the prescribed target product molecule and the reactant molecule. In contrast to past approaches, RSF employs a reaction center identification step to produce intermediate structures known as synthons as a more informative source distribution for the discrete flow. To further enhance diversity and feasibility of generated samples, we employ Feynman-Kac steering with Sequential Monte Carlo based resampling to steer promising generations at inference using a new reward oracle that relies on a forward-synthesis model. Empirically, we demonstrate RSF achieves 60.0% top-1 accuracy, which outperforms the previous SOTA by 20%. We also substantiate the benefits of steering at inference and demonstrate that FK-steering improves top-5 round-trip accuracy by 19% over prior template-free SOTA methods, all while preserving competitive top-k accuracy results.
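The resampling idea mentioned in the abstract can be illustrated with a minimal sketch of one generic reward-weighted resampling step, as used in Sequential Monte Carlo. This is not the authors' implementation: the function name `resample_by_reward`, the `temperature` parameter, and the toy reward are all assumptions made for illustration, whereas the paper describes a reward oracle based on a forward-synthesis model.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def resample_by_reward(
    particles: Sequence[T],
    reward_fn: Callable[[T], float],
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[T]:
    """One multinomial resampling step (hypothetical helper, not from the paper).

    Each particle gets a weight proportional to exp(reward / temperature);
    the same number of particles is then drawn with replacement, so
    high-reward candidates tend to be duplicated and low-reward ones dropped.
    """
    rng = rng or random.Random()
    rewards = [reward_fn(p) for p in particles]
    # Subtract the maximum reward before exponentiating for numerical stability.
    max_reward = max(rewards)
    weights = [math.exp((r - max_reward) / temperature) for r in rewards]
    return rng.choices(list(particles), weights=weights, k=len(particles))


# Toy usage: the candidate strings and the length-based reward are placeholders.
candidates = ["CCO", "CC", "C"]
resampled = resample_by_reward(candidates, reward_fn=len, rng=random.Random(0))
```

In a steered sampler, a step like this would typically be interleaved with the generative model's own updates, so that compute concentrates on candidates the reward considers promising.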
Related papers
- DiffER: Categorical Diffusion for Chemical Retrosynthesis [4.8757706070066265]
We propose DiffER, an alternative template-free method for retrosynthesis prediction in the form of categorical diffusion. We construct an ensemble of diffusion models which achieves state-of-the-art performance for top-1 accuracy and competitive performance for top-3, top-5, and top-10 accuracy.
arXiv Detail & Related papers (2025-05-29T17:53:37Z)
- Chimera: Accurate retrosynthesis prediction by ensembling models with diverse inductive biases [3.885174353072695]
Planning and conducting chemical syntheses remains a major bottleneck in the discovery of functional small molecules. Inspired by how chemists use different strategies to ideate reactions, we propose Chimera: a framework for building highly accurate reaction models.
arXiv Detail & Related papers (2024-12-06T18:55:19Z)
- BatGPT-Chem: A Foundation Large Model For Retrosynthesis Prediction [65.93303145891628]
BatGPT-Chem is a large language model with 15 billion parameters, tailored for enhanced retrosynthesis prediction.
Our model captures a broad spectrum of chemical knowledge, enabling precise prediction of reaction conditions.
This development empowers chemists to adeptly address novel compounds, potentially expediting the innovation cycle in drug manufacturing and materials science.
arXiv Detail & Related papers (2024-08-19T05:17:40Z)
- RetroGFN: Diverse and Feasible Retrosynthesis using GFlowNets [8.308430428140413]
Single-step retrosynthesis aims to predict a set of reactions that lead to the creation of a target molecule. We propose a novel model, RetroGFN, that can explore outside the limited dataset and return a diverse set of feasible reactions. We show that RetroGFN achieves competitive results on standard top-k accuracy while outperforming existing methods on round-trip accuracy.
arXiv Detail & Related papers (2024-06-26T20:10:03Z)
- UAlign: Pushing the Limit of Template-free Retrosynthesis Prediction with Unsupervised SMILES Alignment [51.49238426241974]
This paper introduces UAlign, a template-free graph-to-sequence pipeline for retrosynthesis prediction.
By combining graph neural networks and Transformers, our method can more effectively leverage the inherent graph structure of molecules.
arXiv Detail & Related papers (2024-03-25T03:23:03Z)
- RetroBridge: Modeling Retrosynthesis with Markov Bridges [2.256703675017117]
Retrosynthesis planning aims at designing reaction pathways from commercially available starting materials to a target molecule.
We introduce the Markov Bridge Model, a generative framework aimed to approximate the dependency between two discrete distributions.
We then address the retrosynthesis planning problem with our novel framework and introduce RetroBridge, a template-free retrosynthesis modeling approach.
arXiv Detail & Related papers (2023-08-30T15:09:22Z)
- MARS: A Motif-based Autoregressive Model for Retrosynthesis Prediction [54.75583184356392]
We propose a novel end-to-end graph generation model for retrosynthesis prediction.
It sequentially identifies the reaction center, generates the synthons, and adds motifs to the synthons to generate reactants.
Experiments on a benchmark dataset show that the proposed model significantly outperforms previous state-of-the-art algorithms.
arXiv Detail & Related papers (2022-09-27T06:29:35Z)
- Root-aligned SMILES for Molecular Retrosynthesis Prediction [31.818364437526885]
Retrosynthesis prediction is a fundamental problem in organic synthesis, where the task is to discover precursor molecules that can be used to synthesize a target molecule.
A popular paradigm of existing computational retrosynthesis methods formulates retrosynthesis prediction as a sequence-to-sequence translation problem.
We propose the root-aligned SMILES (R-SMILES), which specifies a tightly aligned one-to-one mapping between the product and the reactant SMILES.
arXiv Detail & Related papers (2022-03-22T03:50:04Z)
- RetCL: A Selection-based Approach for Retrosynthesis via Contrastive Learning [107.64562550844146]
Retrosynthesis is an emerging research area of deep learning.
We propose a new approach that reformulates retrosynthesis as a selection problem of reactants from a candidate set of commercially available molecules.
For learning the score functions, we also propose a novel contrastive training scheme with hard negative mining.
arXiv Detail & Related papers (2021-05-03T12:47:57Z)
- RetroXpert: Decompose Retrosynthesis Prediction like a Chemist [60.463900712314754]
We devise a novel template-free algorithm for automatic retrosynthetic expansion.
Our method disassembles retrosynthesis into two steps.
While outperforming the state-of-the-art baselines, our model also provides chemically reasonable interpretation.
arXiv Detail & Related papers (2020-11-04T04:35:34Z)
- Retrosynthesis Prediction with Conditional Graph Logic Network [118.70437805407728]
Computer-aided retrosynthesis is finding renewed interest from both chemistry and computer science communities.
We propose a new approach to this task using the Conditional Graph Logic Network, a conditional graphical model built upon graph neural networks.
arXiv Detail & Related papers (2020-01-06T05:36:57Z)