DiffER: Categorical Diffusion for Chemical Retrosynthesis
- URL: http://arxiv.org/abs/2505.23721v2
- Date: Tue, 03 Jun 2025 04:40:52 GMT
- Title: DiffER: Categorical Diffusion for Chemical Retrosynthesis
- Authors: Sean Current, Ziqi Chen, Daniel Adu-Ampratwum, Xia Ning, Srinivasan Parthasarathy
- Abstract summary: We propose DiffER, an alternative template-free method for retrosynthesis prediction in the form of categorical diffusion. We construct an ensemble of diffusion models which achieves state-of-the-art performance for top-1 accuracy and competitive performance for top-3, top-5, and top-10 accuracy.
- Score: 4.8757706070066265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Methods for automatic chemical retrosynthesis have found recent success through the application of models traditionally built for natural language processing, primarily through transformer neural networks. These models have demonstrated significant ability to translate between the SMILES encodings of chemical products and reactants, but are constrained as a result of their autoregressive nature. We propose DiffER, an alternative template-free method for retrosynthesis prediction in the form of categorical diffusion, which allows the entire output SMILES sequence to be predicted in unison. We construct an ensemble of diffusion models which achieves state-of-the-art performance for top-1 accuracy and competitive performance for top-3, top-5, and top-10 accuracy among template-free methods. We prove that DiffER is a strong baseline for a new class of template-free model, capable of learning a variety of synthetic techniques used in laboratory settings and outperforming a variety of other template-free methods on top-k accuracy metrics. By constructing an ensemble of categorical diffusion models with a novel length prediction component with variance, our method is able to approximately sample from the posterior distribution of reactants, producing results with strong metrics of confidence and likelihood. Furthermore, our analyses demonstrate that accurate prediction of the SMILES sequence length is key to further boosting the performance of categorical diffusion models.
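The core mechanic the abstract describes, predicting an entire output sequence in unison rather than left to right, can be illustrated with a toy masked-categorical diffusion sampler. Everything below is a hedged sketch: the vocabulary, the `toy_denoiser` stand-in, the unmasking schedule, and the Gaussian length sampler are illustrative assumptions, not the paper's actual model, tokenization, or length-prediction component.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = ["<mask>", "C", "O", "N", "(", ")", "=", "1"]  # toy SMILES-token vocabulary
MASK = 0

def sample_length(mean, std):
    # Length prediction with variance: sampling lengths around a predicted
    # mean lets an ensemble propose candidate sequences of varying size.
    return max(1, int(round(rng.normal(mean, std))))

def toy_denoiser(tokens, t):
    # Stand-in for a trained network: returns a probability distribution
    # over the vocabulary for every position, given the noisy sequence.
    logits = rng.normal(size=(len(tokens), len(VOCAB)))
    logits[:, MASK] = -np.inf  # never predict the mask token itself
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)

def sample(mean_len=6, len_std=1.0, steps=8):
    # Start from an all-masked sequence of sampled length and iteratively
    # unmask a fraction of positions each step -- every position is
    # predicted in parallel, unlike autoregressive decoding.
    n = sample_length(mean_len, len_std)
    tokens = np.full(n, MASK)
    for t in range(steps, 0, -1):
        probs = toy_denoiser(tokens, t)
        masked = np.flatnonzero(tokens == MASK)
        k = max(1, int(np.ceil(len(masked) / t)))  # unmask ~1/t of what's left
        for i in rng.choice(masked, size=min(k, len(masked)), replace=False):
            tokens[i] = rng.choice(len(VOCAB), p=probs[i])
    return "".join(VOCAB[i] for i in tokens)

print(sample())
```

With a real denoiser in place of the random stand-in, repeating `sample()` across an ensemble of models and length draws is what approximates sampling from the posterior over reactant sequences.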
Related papers
- Amortized In-Context Mixed Effect Transformer Models: A Zero-Shot Approach for Pharmacokinetics [0.0]
We present the Amortized In-Context Mixed-Effect Transformer (AICMET) model. It unifies mechanistic compartmental priors with amortized in-context Bayesian inference. Experiments show that AICMET attains state-of-the-art predictive accuracy and faithfully quantifies inter-patient variability.
arXiv Detail & Related papers (2025-08-21T15:45:17Z)
- S$^2$-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models [26.255679321570014]
S$^2$-Guidance is a novel method that leverages block-dropping during the forward process to construct sub-networks. Experiments on text-to-image and text-to-video generation tasks demonstrate that S$^2$-Guidance delivers superior performance.
arXiv Detail & Related papers (2025-08-18T12:31:20Z)
- Flow matching for reaction pathway generation [1.8420084274819617]
MolGEN is a conditional flow-matching framework that learns an optimal transport path from Gaussian priors to target chemical distributions. On benchmarks used by TSDiff and OA-ReactDiff, MolGEN surpasses them in TS geometry accuracy and barrier-height prediction while reducing sampling time to sub-second. MolGEN also supports open-ended product generation with competitive top-k accuracy and avoids the mass/electron-balance violations common to sequence models.
arXiv Detail & Related papers (2025-07-14T17:54:47Z)
- Divergence Minimization Preference Optimization for Diffusion Model Alignment [66.31417479052774]
Divergence Minimization Preference Optimization (DMPO) is a principled method for aligning diffusion models by minimizing reverse KL divergence. DMPO can consistently outperform or match existing techniques across different base models and test sets.
arXiv Detail & Related papers (2025-07-10T07:57:30Z)
- RETRO SYNFLOW: Discrete Flow Matching for Accurate and Diverse Single-Step Retrosynthesis [23.422202032748924]
We model single-step retrosynthesis planning and introduce RETRO SYNFLOW (RSF), a discrete flow-matching framework. We employ Feynman-Kac steering with Sequential Monte Carlo-based resampling to steer promising generations at inference.
arXiv Detail & Related papers (2025-06-04T20:46:05Z)
- Self-Refining Training for Amortized Density Functional Theory [5.5541132320126945]
We propose a novel method that reduces the dependency of amortized DFT solvers on large pre-collected datasets by introducing a self-refining training strategy. We derive our method as a minimization of the variational upper bound on the KL divergence measuring the discrepancy between the generated samples and the target Boltzmann distribution defined by the ground-state energy.
arXiv Detail & Related papers (2025-06-02T00:32:32Z)
- Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts [64.34482582690927]
We provide an efficient and principled method for sampling from a sequence of annealed, geometric-averaged, or product distributions derived from pretrained score-based models. We propose Sequential Monte Carlo (SMC) resampling algorithms that leverage inference-time scaling to improve sampling quality.
arXiv Detail & Related papers (2025-03-04T17:46:51Z)
- Chimera: Accurate retrosynthesis prediction by ensembling models with diverse inductive biases [3.885174353072695]
Planning and conducting chemical syntheses remain a major bottleneck in the discovery of functional small molecules. Inspired by how chemists use different strategies to ideate reactions, we propose Chimera: a framework for building highly accurate reaction models.
arXiv Detail & Related papers (2024-12-06T18:55:19Z)
- Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
The Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step. Our framework offers a 1.3$\times$ sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z)
- Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding [84.3224556294803]
Diffusion models excel at capturing the natural design spaces of images, molecules, DNA, RNA, and protein sequences.
We aim to optimize downstream reward functions while preserving the naturalness of these design spaces.
Our algorithm integrates soft value functions, which look ahead to how intermediate noisy states lead to high rewards in the future.
arXiv Detail & Related papers (2024-08-15T16:47:59Z)
- UAlign: Pushing the Limit of Template-free Retrosynthesis Prediction with Unsupervised SMILES Alignment [51.49238426241974]
This paper introduces UAlign, a template-free graph-to-sequence pipeline for retrosynthesis prediction.
By combining graph neural networks and Transformers, our method can more effectively leverage the inherent graph structure of molecules.
arXiv Detail & Related papers (2024-03-25T03:23:03Z)
- Gluformer: Transformer-Based Personalized Glucose Forecasting with Uncertainty Quantification [7.451722745955049]
We propose to model the future glucose trajectory conditioned on the past as an infinite mixture of basis distributions.
This change allows us to learn the uncertainty and predict more accurately in the cases when the trajectory has a heterogeneous or multi-modal distribution.
We empirically demonstrate the superiority of our method over existing state-of-the-art techniques both in terms of accuracy and uncertainty on the synthetic and benchmark glucose data sets.
arXiv Detail & Related papers (2022-09-09T21:03:43Z)
- How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Data Transfer Approaches to Improve Seq-to-Seq Retrosynthesis [1.6449390849183363]
Retrosynthesis is the problem of inferring the reactant compounds needed to synthesize a given product compound through chemical reactions.
Recent studies on retrosynthesis focus on proposing more sophisticated prediction models.
The dataset used to train the models also plays an essential role in achieving the best-generalizing models.
arXiv Detail & Related papers (2020-10-02T05:27:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.