Assessing the Extrapolation Capability of Template-Free Retrosynthesis Models
- URL: http://arxiv.org/abs/2403.03960v1
- Date: Thu, 29 Feb 2024 00:48:17 GMT
- Title: Assessing the Extrapolation Capability of Template-Free Retrosynthesis Models
- Authors: Shuan Chen and Yousung Jung
- Abstract summary: We empirically assess the extrapolation capability of state-of-the-art template-free models by meticulously assembling an extensive set of out-of-distribution (OOD) reactions.
Our findings demonstrate that while template-free models exhibit potential in predicting precursors with novel synthesis rules, their top-10 exact-match accuracy on OOD reactions is strikingly modest (< 1%).
- Score: 0.7770029179741429
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the acknowledged capability of template-free models in exploring
unseen reaction spaces compared to template-based models for retrosynthesis
prediction, their ability to venture beyond established boundaries remains
relatively uncharted. In this study, we empirically assess the extrapolation
capability of state-of-the-art template-free models by meticulously assembling
an extensive set of out-of-distribution (OOD) reactions. Our findings
demonstrate that while template-free models exhibit potential in predicting
precursors with novel synthesis rules, their top-10 exact-match accuracy in OOD
reactions is strikingly modest (< 1%). Furthermore, despite the capability of
generating novel reactions, our investigation highlights a recurring issue
where more than half of the novel reactions predicted by template-free models
are chemically implausible. Consequently, we advocate for the future
development of template-free models that integrate considerations of chemical
feasibility when navigating unexplored regions of reaction space.
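The abstract reports a top-10 exact-match accuracy without restating how the metric is computed. The following is a minimal sketch of the usual definition, not the authors' evaluation code: a reaction counts as a hit if any of the first k ranked predictions matches the reference precursor set. The function names are hypothetical, and the sketch assumes every SMILES fragment has already been canonicalized by a cheminformatics toolkit, so only the order of the "."-separated precursors is normalized here.

```python
from typing import Sequence


def normalize_precursors(smiles: str) -> str:
    """Return an order-independent key for a '.'-separated precursor set.

    Assumption: each fragment is already a canonical SMILES string, so
    sorting the fragments is enough to compare two precursor sets.
    """
    fragments = [part for part in smiles.strip().split(".") if part]
    return ".".join(sorted(fragments))


def top_k_exact_match(
    predictions: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int = 10,
) -> float:
    """Fraction of reactions whose reference appears in the top-k predictions.

    predictions[i] is the ranked candidate list for reaction i (best first);
    references[i] is the recorded precursor set for that reaction.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0

    hits = 0
    for ranked, truth in zip(predictions, references):
        target = normalize_precursors(truth)
        # Only the first k candidates are allowed to score a hit.
        if any(normalize_precursors(p) == target for p in ranked[:k]):
            hits += 1
    return hits / len(references)


if __name__ == "__main__":
    # Toy data for illustration only; these are not results from the paper.
    preds = [
        ["CCO.CC(=O)O", "CC(=O)OCC"],
        ["c1ccccc1Br", "OB(O)c1ccccc1.Brc1ccccc1"],
    ]
    refs = ["CC(=O)O.CCO", "Brc1ccccc1.OB(O)c1ccccc1"]
    print(top_k_exact_match(preds, refs, k=1))
    print(top_k_exact_match(preds, refs, k=2))
```

Exact match is a strict criterion: a chemically valid alternative route that differs from the recorded precursors scores as a miss, which is one reason the related papers below also report round-trip accuracy.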
Related papers
- RetroGFN: Diverse and Feasible Retrosynthesis using GFlowNets [8.308430428140413]
Single-step retrosynthesis aims to predict a set of reactions that lead to the creation of a target molecule.
We propose a novel model, RetroGFN, that can explore outside the limited dataset and return a diverse set of feasible reactions.
We show that RetroGFN achieves competitive results on standard top-k accuracy while outperforming existing methods on round-trip accuracy.
arXiv Detail & Related papers (2024-06-26T20:10:03Z)
- UAlign: Pushing the Limit of Template-free Retrosynthesis Prediction with Unsupervised SMILES Alignment [51.49238426241974]
This paper introduces UAlign, a template-free graph-to-sequence pipeline for retrosynthesis prediction.
By combining graph neural networks and Transformers, our method can more effectively leverage the inherent graph structure of molecules.
arXiv Detail & Related papers (2024-03-25T03:23:03Z)
- Beyond Major Product Prediction: Reproducing Reaction Mechanisms with Machine Learning Models Trained on a Large-Scale Mechanistic Dataset [10.968137261042715]
Mechanistic understanding of organic reactions can facilitate reaction development, impurity prediction, and in principle, reaction discovery.
While several machine learning models have sought to address the task of predicting reaction products, their extension to predicting reaction mechanisms has been impeded by the lack of a corresponding mechanistic dataset.
We construct such a dataset by imputing intermediates between experimentally reported reactants and products using expert reaction templates and train several machine learning models on the resulting dataset of 5,184,184 elementary steps.
arXiv Detail & Related papers (2024-03-07T15:26:23Z)
- Feedback Efficient Online Fine-Tuning of Diffusion Models [52.170384048274364]
We propose a novel reinforcement learning procedure that efficiently explores on the manifold of feasible samples.
We present a theoretical analysis providing a regret guarantee, as well as empirical validation across three domains.
arXiv Detail & Related papers (2024-02-26T07:24:32Z)
- Molecule-Edit Templates for Efficient and Accurate Retrosynthesis Prediction [0.16070833439280313]
We introduce METRO, a machine-learning model that predicts reactions using minimal templates.
We achieve state-of-the-art results on standard benchmarks.
arXiv Detail & Related papers (2023-10-11T09:00:02Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
- RetroComposer: Discovering Novel Reactions by Composing Templates for Retrosynthesis Prediction [63.14937611038264]
We propose an innovative retrosynthesis prediction framework that can compose novel templates beyond training templates.
Experimental results show that our method can produce novel templates for 328 test reactions in the USPTO-50K dataset.
arXiv Detail & Related papers (2021-12-20T05:57:07Z)
- Robustness of Model Predictions under Extension [3.766702945560518]
A caveat to using models for analysis is that predicted causal effects and conditional independences may not be robust under model extensions.
We show how to use the technique of causal ordering to efficiently assess the robustness of qualitative model predictions.
For dynamical systems at equilibrium, we demonstrate how novel insights help to select appropriate model extensions.
arXiv Detail & Related papers (2020-12-08T20:21:03Z)
- RetroXpert: Decompose Retrosynthesis Prediction like a Chemist [60.463900712314754]
We devise a novel template-free algorithm for automatic retrosynthetic expansion.
Our method disassembles retrosynthesis into two steps.
While outperforming the state-of-the-art baselines, our model also provides chemically reasonable interpretation.
arXiv Detail & Related papers (2020-11-04T04:35:34Z)
- Learning Graph Models for Retrosynthesis Prediction [90.15523831087269]
Retrosynthesis prediction is a fundamental problem in organic synthesis.
This paper introduces a graph-based approach that capitalizes on the idea that the graph topology of precursor molecules is largely unaltered during a chemical reaction.
Our model achieves a top-1 accuracy of 53.7%, outperforming previous template-free and semi-template-based methods.
arXiv Detail & Related papers (2020-06-12T09:40:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.