G2GT: Retrosynthesis Prediction with Graph to Graph Attention Neural
Network and Self-Training
- URL: http://arxiv.org/abs/2204.08608v1
- Date: Tue, 19 Apr 2022 01:55:52 GMT
- Title: G2GT: Retrosynthesis Prediction with Graph to Graph Attention Neural
Network and Self-Training
- Authors: Zaiyun Lin (Beijing Stonewise Technology) and Shiqiu Yin (Beijing
Stonewise Technology) and Lei Shi (Beijing Stonewise Technology) and Wenbiao
Zhou (Beijing Stonewise Technology) and YingSheng Zhang (Beijing Stonewise
Technology)
- Abstract summary: Retrosynthesis prediction is one of the fundamental challenges in organic chemistry and related fields.
We propose a new graph-to-graph transformation model, G2GT, in which the graph encoder and graph decoder are built upon the standard transformer structure.
We show that self-training, a powerful data augmentation method, can significantly improve the model's performance.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrosynthesis prediction is one of the fundamental challenges in
organic chemistry and related fields. The goal is to find reactant molecules
that can synthesize product molecules. To solve this task, we propose a new
graph-to-graph transformation model, G2GT, in which the graph encoder and
graph decoder are built upon the standard transformer structure. We also show
that self-training, a powerful data augmentation method that utilizes
unlabeled molecule data, can significantly improve the model's performance.
Inspired by reaction-type labels and ensemble learning, we propose a novel
weak-ensemble method to enhance diversity. We combine beam search, nucleus
sampling, and top-k sampling to further improve inference diversity, and we
propose a simple ranking algorithm to retrieve the final top-10 results. We
achieve new state-of-the-art results on both the USPTO-50K dataset, with a
top-1 accuracy of 54%, and the larger USPTO-full dataset, with a top-1
accuracy of 50%, along with competitive top-10 results.
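As an illustration of the decoding recipe described in the abstract, the
sketch below shows generic top-k and nucleus (top-p) logit filters together
with a simple frequency-based ranking that merges candidates pooled from
several decoding runs. This is a minimal sketch, not the authors' code: the
function names are hypothetical, and the frequency-count ranking is an
assumed stand-in for the paper's unspecified ranking algorithm.

```python
# Hedged sketch: generic top-k and nucleus (top-p) filters for a
# next-token distribution (logits: 1-D float array), plus a
# frequency-based ranking that merges candidates from several
# decoding runs. Illustrative only.
from collections import Counter
import numpy as np

def top_k_filter(logits, k):
    """Keep the k highest logits; mask everything else to -inf."""
    masked = np.full_like(logits, -np.inf)
    keep = np.argsort(logits)[-k:]
    masked[keep] = logits[keep]
    return masked

def nucleus_filter(logits, p):
    """Keep the smallest token set whose cumulative probability >= p."""
    order = np.argsort(logits)[::-1]
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    n_keep = int(np.searchsorted(np.cumsum(probs[order]), p)) + 1
    masked = np.full_like(logits, -np.inf)
    masked[order[:n_keep]] = logits[order[:n_keep]]
    return masked

def rank_candidates(all_samples, n=10):
    """Rank reactant candidates by how often independent runs agree."""
    return [cand for cand, _ in Counter(all_samples).most_common(n)]

# e.g. rank_candidates(beam_outputs + nucleus_outputs + top_k_outputs)
```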
Related papers
- GSTAM: Efficient Graph Distillation with Structural Attention-Matching [13.673737442696154]
We introduce Graph Distillation with Structural Attention Matching (GSTAM), a novel method for condensing graph classification datasets.
GSTAM leverages the attention maps of GNNs to distill structural information from the original dataset into synthetic graphs.
Comprehensive experiments demonstrate GSTAM's superiority over existing methods, achieving 0.45% to 6.5% better performance at extreme condensation ratios.
arXiv Detail & Related papers (2024-08-29T19:40:04Z)
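As a rough illustration of the attention matching described in the GSTAM
summary above, the sketch below penalizes the mismatch between per-layer GNN
attention summaries computed on real graphs and on the synthetic graphs. It
is an assumption-laden sketch, not GSTAM's actual objective: the pooled 1-D
per-layer summaries and the MSE criterion are stand-ins.

```python
# Hedged sketch of structural attention matching: penalize the gap
# between normalized per-layer GNN attention summaries from real
# graphs and from the synthetic (condensed) graphs. The 1-D pooled
# summary per layer and the MSE criterion are assumptions.
import torch
import torch.nn.functional as F

def attention_matching_loss(real_maps, syn_maps):
    """real_maps, syn_maps: lists of 1-D tensors, one per GNN layer."""
    loss = torch.zeros(())
    for a_real, a_syn in zip(real_maps, syn_maps):
        loss = loss + F.mse_loss(F.normalize(a_real, dim=0),
                                 F.normalize(a_syn, dim=0))
    return loss
```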
- MolGrapher: Graph-based Visual Recognition of Chemical Structures [50.13749978547401]
We introduce MolGrapher to recognize chemical structures visually.
We treat all candidate atoms and bonds as nodes and put them in a graph.
We classify atom and bond nodes in the graph with a Graph Neural Network.
arXiv Detail & Related papers (2023-08-23T16:16:11Z)
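The MolGrapher summary above describes a concrete data structure: candidate
atoms and candidate bonds both become nodes of a single graph that a GNN then
classifies. A minimal sketch of such a graph, with hypothetical field names,
might look like this:

```python
# Hedged sketch: a graph in which candidate atoms AND candidate bonds
# are both nodes, so one GNN can classify every node. Field names and
# feature handling are hypothetical.
from dataclasses import dataclass, field

@dataclass
class SuperGraph:
    kinds: list = field(default_factory=list)   # "atom" or "bond" per node
    feats: list = field(default_factory=list)   # one feature vector per node
    edges: list = field(default_factory=list)   # (src, dst) node-index pairs

    def add_atom(self, features):
        self.kinds.append("atom")
        self.feats.append(features)
        return len(self.kinds) - 1

    def add_bond(self, atom_i, atom_j, features):
        """A candidate bond is its own node, linked to its two atoms."""
        b = len(self.kinds)
        self.kinds.append("bond")
        self.feats.append(features)
        self.edges += [(atom_i, b), (b, atom_j)]
        return b
```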
- Permutation Equivariant Graph Framelets for Heterophilous Graph Learning [6.679929638714752]
We develop a new way to implement multi-scale extraction by constructing Haar-type graph framelets.
We show that our model can achieve the best performance on certain datasets of heterophilous graphs.
arXiv Detail & Related papers (2023-06-07T09:05:56Z)
- Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules.
By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures.
When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, with an average ROC-AUC improvement of 13.8% on classification tasks and an average RMSE/MAE improvement of 35.1% on regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z)
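Fusing a molecule's two views (its molecular graph and its knowledge-graph
substructure), as the GODE summary above describes, is commonly done with an
InfoNCE-style contrastive objective. The sketch below shows that generic
objective, not GODE's exact loss, and the embedding shapes are assumptions.

```python
# Hedged sketch of a bi-level contrastive objective: pull each
# molecule's graph embedding toward its knowledge-graph substructure
# embedding, push it from other molecules in the batch (InfoNCE-style).
import torch
import torch.nn.functional as F

def contrastive_loss(z_mol, z_kg, tau=0.1):
    z_mol = F.normalize(z_mol, dim=1)      # [B, d] molecule-level GNN view
    z_kg = F.normalize(z_kg, dim=1)        # [B, d] KG-substructure GNN view
    logits = z_mol @ z_kg.t() / tau        # pairwise similarities
    targets = torch.arange(z_mol.size(0))  # positives on the diagonal
    return F.cross_entropy(logits, targets)
```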
- Condensing Graphs via One-Step Gradient Matching [50.07587238142548]
We propose a one-step gradient matching scheme, which performs gradient matching for only a single step, without training the network weights.
Our theoretical analysis shows this strategy can generate synthetic graphs that lead to lower classification loss on real graphs.
In particular, we are able to reduce the dataset size by 90% while approximating up to 98% of the original performance.
arXiv Detail & Related papers (2022-06-15T18:20:01Z)
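The one-step scheme summarized above compares gradients on real and synthetic
data at a freshly initialized network and never trains the network itself. A
minimal sketch, with the loss function and the cosine-style gradient distance
assumed rather than taken from the paper:

```python
# Hedged sketch of one-step gradient matching: at a freshly
# initialized network, make the loss gradient on synthetic graphs
# mimic the gradient on real graphs. Only the synthetic data is
# optimized; the network weights are never trained.
import torch
import torch.nn.functional as F

def one_step_matching_loss(net, loss_fn, real_batch, syn_batch):
    g_real = torch.autograd.grad(loss_fn(net, real_batch),
                                 net.parameters())
    g_syn = torch.autograd.grad(loss_fn(net, syn_batch),
                                net.parameters(),
                                create_graph=True)  # grads flow to syn data
    # sum a cosine-style distance over matching parameter tensors
    return sum(1 - F.cosine_similarity(a.flatten(), b.flatten(), dim=0)
               for a, b in zip(g_real, g_syn))
```

Minimizing this loss with respect to the synthetic graph parameters is what
drives the condensation; the network is re-initialized rather than trained.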
- Permutation invariant graph-to-sequence model for template-free retrosynthesis and reaction prediction [2.5655440962401617]
We describe a novel Graph2SMILES model that combines the power of Transformer models for text generation with the permutation invariance of molecular graph encoders.
As an end-to-end architecture, Graph2SMILES can be used as a drop-in replacement for the Transformer in any task involving molecule(s)-to-molecule(s) transformations.
arXiv Detail & Related papers (2021-10-19T01:23:15Z)
- Robust Optimization as Data Augmentation for Large-scale Graphs [117.2376815614148]
We propose FLAG (Free Large-scale Adversarial Augmentation on Graphs), which iteratively augments node features with gradient-based adversarial perturbations during training.
FLAG is a general-purpose approach for graph data, which universally works in node classification, link prediction, and graph classification tasks.
arXiv Detail & Related papers (2020-10-19T21:51:47Z)
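FLAG's core idea, as summarized above, is to take a few gradient-ascent steps
on a feature perturbation while the model's parameter gradients accumulate
across those steps. The sketch below is a simplified rendering; the
sign-based ascent step and the hyperparameters are assumptions, not FLAG's
published update rule.

```python
# Hedged sketch in the spirit of FLAG: perturb node features x with a
# few gradient-ascent steps while the model's parameter gradients
# accumulate "for free" across those steps, then apply one optimizer
# update. Step rule and hyperparameters are simplifications.
import torch

def flag_style_step(model, x, y, criterion, optimizer,
                    ascent_steps=3, step_size=1e-3):
    delta = torch.empty_like(x).uniform_(-step_size, step_size)
    delta.requires_grad_()
    optimizer.zero_grad()
    for _ in range(ascent_steps):
        loss = criterion(model(x + delta), y) / ascent_steps
        loss.backward()                      # accumulates param grads too
        grad = delta.grad.detach()
        # ascend on the perturbation, keep the accumulated model grads
        delta = (delta.detach() + step_size * grad.sign()).requires_grad_()
    optimizer.step()                         # one update from pooled grads
    return loss.item()
```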
- Self-Supervised Graph Transformer on Large-Scale Molecular Data [73.3448373618865]
We propose a novel framework, GROVER, for molecular representation learning.
GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data.
We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules -- the biggest GNN and the largest training dataset in molecular representation learning.
arXiv Detail & Related papers (2020-06-18T08:37:04Z)
- Uncovering the Folding Landscape of RNA Secondary Structure with Deep Graph Embeddings [71.20283285671461]
We propose a geometric scattering autoencoder (GSAE) network for learning such graph embeddings.
Our embedding network first extracts rich graph features using the recently proposed geometric scattering transform.
We show that GSAE organizes RNA graphs both by structure and energy, accurately reflecting bistable RNA structures.
arXiv Detail & Related papers (2020-06-12T00:17:59Z)
- Graph-Aware Transformer: Is Attention All Graphs Need? [5.240000443825077]
GRaph-Aware Transformer (GRAT) is the first Transformer-based model that can encode and decode whole graphs in an end-to-end fashion.
GRAT has shown very promising results, including state-of-the-art performance on 4 regression tasks in the QM9 benchmark.
arXiv Detail & Related papers (2020-06-09T12:13:56Z)
- A Graph to Graphs Framework for Retrosynthesis Prediction [42.99048270311063]
A fundamental problem in computational chemistry is to find a set of reactants to synthesize a target molecule.
We propose a novel template-free approach called G2Gs by transforming a target molecular graph into a set of reactant molecular graphs.
G2Gs significantly outperforms existing template-free approaches by up to 63% in terms of top-1 accuracy.
arXiv Detail & Related papers (2020-03-28T06:16:56Z)