Root-aligned SMILES for Molecular Retrosynthesis Prediction
- URL: http://arxiv.org/abs/2203.11444v2
- Date: Wed, 23 Mar 2022 04:35:56 GMT
- Title: Root-aligned SMILES for Molecular Retrosynthesis Prediction
- Authors: Zipeng Zhong, Jie Song, Zunlei Feng, Tiantao Liu, Lingxiang Jia,
Shaolun Yao, Min Wu, Tingjun Hou and Mingli Song
- Abstract summary: Retrosynthesis prediction is a fundamental problem in organic synthesis, where the task is to discover precursor molecules that can be used to synthesize a target molecule.
A popular paradigm of existing computational retrosynthesis methods formulates retrosynthesis prediction as a sequence-to-sequence translation problem.
We propose the root-aligned SMILES (R-SMILES), which specifies a tightly aligned one-to-one mapping between the product and the reactant SMILES.
- Score: 31.818364437526885
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrosynthesis prediction is a fundamental problem in organic synthesis,
where the task is to discover precursor molecules that can be used to
synthesize a target molecule. A popular paradigm of existing computational
retrosynthesis methods formulates retrosynthesis prediction as a
sequence-to-sequence translation problem, where the typical SMILES
representations are adopted for both reactants and products. However,
general-purpose SMILES neglects two characteristics of retrosynthesis: 1)
the search space of the reactants is very large, and 2) the molecular graph
topology is largely unaltered from products to reactants. As a result, SMILES
performs suboptimally when applied directly. In this article,
we propose the root-aligned SMILES (R-SMILES), which specifies a tightly
aligned one-to-one mapping between the product and the reactant SMILES, to
narrow the string representation discrepancy for more efficient retrosynthesis.
As the minimum edit distance between the input and the output is significantly
decreased with the proposed R-SMILES, the computational model is largely
relieved from learning the complex syntax and dedicated to learning the
chemical knowledge for retrosynthesis. We compare the proposed R-SMILES with
various state-of-the-art baselines on different benchmarks and show that it
significantly outperforms them all, demonstrating the superiority of the
proposed method.
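The edit-distance argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the SMILES strings below are hand-written for one esterification example (an assumption, not taken from the paper), the distance is computed per character rather than per SMILES token, and the function name `edit_distance` is hypothetical. The sketch only shows that writing the reactants from the same root atom as the product can bring the two strings much closer together.

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# Hand-written illustration (assumption): ethyl acetate from ethanol + acetic acid.
product = "CCOC(C)=O"                 # product, written from the ethyl CH3
reactants_unaligned = "CC(=O)O.CCO"   # reactants in an unrelated atom order
reactants_aligned = "CCO.OC(C)=O"     # reactants written from the same root atom

d_unaligned = edit_distance(product, reactants_unaligned)
d_aligned = edit_distance(product, reactants_aligned)
print("unaligned:", d_unaligned, "aligned:", d_aligned)
```

In the aligned case the reactant string differs from the product only by the inserted ".O", so a translation model mostly has to copy the input and decide where the bond is broken.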
Related papers
- BatGPT-Chem: A Foundation Large Model For Retrosynthesis Prediction [65.93303145891628]
BatGPT-Chem is a large language model with 15 billion parameters, tailored for enhanced retrosynthesis prediction.
Our model captures a broad spectrum of chemical knowledge, enabling precise prediction of reaction conditions.
This development empowers chemists to adeptly address novel compounds, potentially expediting the innovation cycle in drug manufacturing and materials science.
arXiv Detail & Related papers (2024-08-19T05:17:40Z)
- Retrosynthesis prediction enhanced by in-silico reaction data augmentation [66.5643280109899]
We present RetroWISE, a framework that employs a base model inferred from real paired data to perform in-silico reaction generation and augmentation.
On three benchmark datasets, RetroWISE achieves the best overall performance against state-of-the-art models.
arXiv Detail & Related papers (2024-01-31T07:40:37Z)
- MARS: A Motif-based Autoregressive Model for Retrosynthesis Prediction [54.75583184356392]
We propose a novel end-to-end graph generation model for retrosynthesis prediction.
It sequentially identifies the reaction center, generates the synthons, and adds motifs to the synthons to generate reactants.
Experiments on a benchmark dataset show that the proposed model significantly outperforms previous state-of-the-art algorithms.
arXiv Detail & Related papers (2022-09-27T06:29:35Z)
- Modeling Diverse Chemical Reactions for Single-step Retrosynthesis via Discrete Latent Variables [43.900173434781905]
The goal of single-step retrosynthesis is to identify the possible reactants that lead to the synthesis of the target product in one reaction.
Existing sequence-based retrosynthetic methods treat the product-to-reactant retrosynthesis as a sequence-to-sequence translation problem.
We propose RetroDVCAE, which incorporates conditional variational autoencoders into single-step retrosynthesis and associates discrete latent variables with the generation process.
arXiv Detail & Related papers (2022-08-10T14:50:32Z)
- Retroformer: Pushing the Limits of Interpretable End-to-end Retrosynthesis Transformer [15.722719721123054]
Retrosynthesis prediction is one of the fundamental challenges in organic synthesis.
We propose Retroformer, a novel Transformer-based architecture for retrosynthesis prediction.
Retroformer reaches the new state-of-the-art accuracy for the end-to-end template-free retrosynthesis.
arXiv Detail & Related papers (2022-01-29T02:03:55Z)
- Permutation invariant graph-to-sequence model for template-free retrosynthesis and reaction prediction [2.5655440962401617]
We describe a novel Graph2SMILES model that combines the power of Transformer models for text generation with the permutation invariance of molecular graph encoders.
As an end-to-end architecture, Graph2SMILES can be used as a drop-in replacement for the Transformer in any task involving molecule(s)-to-molecule(s) transformations.
arXiv Detail & Related papers (2021-10-19T01:23:15Z)
- RetCL: A Selection-based Approach for Retrosynthesis via Contrastive Learning [107.64562550844146]
Retrosynthesis is an emerging research area of deep learning.
We propose a new approach that reformulates retrosynthesis as the problem of selecting reactants from a candidate set of commercially available molecules.
For learning the score functions, we also propose a novel contrastive training scheme with hard negative mining.
arXiv Detail & Related papers (2021-05-03T12:47:57Z)
- RetroXpert: Decompose Retrosynthesis Prediction like a Chemist [60.463900712314754]
We devise a novel template-free algorithm for automatic retrosynthetic expansion.
Our method disassembles retrosynthesis into two steps.
While outperforming the state-of-the-art baselines, our model also provides chemically reasonable interpretation.
arXiv Detail & Related papers (2020-11-04T04:35:34Z)
- Retrosynthesis Prediction with Conditional Graph Logic Network [118.70437805407728]
Computer-aided retrosynthesis is finding renewed interest from both chemistry and computer science communities.
We propose a new approach to this task using the Conditional Graph Logic Network, a conditional graphical model built upon graph neural networks.
arXiv Detail & Related papers (2020-01-06T05:36:57Z)