Hierarchical Generation of Molecular Graphs using Structural Motifs
- URL: http://arxiv.org/abs/2002.03230v2
- Date: Sat, 18 Apr 2020 15:14:46 GMT
- Title: Hierarchical Generation of Molecular Graphs using Structural Motifs
- Authors: Wengong Jin, Regina Barzilay, Tommi Jaakkola
- Abstract summary: We propose a new hierarchical graph encoder-decoder that employs significantly larger and more flexible graph motifs as basic building blocks.
Our encoder produces a multi-resolution representation for each molecule in a fine-to-coarse fashion, from atoms to connected motifs.
We evaluate our model on multiple molecule generation tasks, including polymers, and show that our model significantly outperforms previous state-of-the-art baselines.
- Score: 38.637412590671865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph generation techniques are increasingly being adopted for drug
discovery. Previous graph generation approaches have utilized relatively small
molecular building blocks such as atoms or simple cycles, limiting their
effectiveness to smaller molecules. Indeed, as we demonstrate, their
performance degrades significantly for larger molecules. In this paper, we
propose a new hierarchical graph encoder-decoder that employs significantly
larger and more flexible graph motifs as basic building blocks. Our encoder
produces a multi-resolution representation for each molecule in a
fine-to-coarse fashion, from atoms to connected motifs. Each level integrates
the encoding of constituents below with the graph at that level. Our
autoregressive coarse-to-fine decoder adds one motif at a time, interleaving
the decision of selecting a new motif with the process of resolving its
attachments to the emerging molecule. We evaluate our model on multiple
molecule generation tasks, including polymers, and show that our model
significantly outperforms previous state-of-the-art baselines.
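The coarse-to-fine decoding loop described in the abstract can be illustrated with a minimal sketch. This shows only the control flow (pick a motif, then resolve its attachment to the partial molecule), not the paper's model. The motif vocabulary, the `PartialMolecule` class, and the functions `choose_motif`, `choose_attachment`, and `decode` are all hypothetical names introduced here. Seeded random choices stand in for the learned predictions, and no chemical validity is checked.

```python
import random
from dataclasses import dataclass, field

# Hypothetical motif vocabulary: motif name -> number of atoms in the motif.
MOTIF_VOCAB = {"benzene": 6, "carbonyl": 2, "amine": 1, "methyl": 1}


@dataclass
class PartialMolecule:
    # Motif names in the order they were generated.
    motifs: list = field(default_factory=list)
    # Each entry: (parent_motif_idx, parent_atom_idx, child_motif_idx, child_atom_idx).
    attachments: list = field(default_factory=list)


def choose_motif(mol, rng):
    """Stand-in for a learned motif prediction (assumption: uniform random)."""
    return rng.choice(sorted(MOTIF_VOCAB))


def choose_attachment(mol, new_motif, rng):
    """Stand-in for a learned attachment prediction (assumption: uniform random)."""
    parent_idx = rng.randrange(len(mol.motifs))
    parent_atom = rng.randrange(MOTIF_VOCAB[mol.motifs[parent_idx]])
    child_atom = rng.randrange(MOTIF_VOCAB[new_motif])
    return parent_idx, parent_atom, child_atom


def decode(num_motifs, seed=0):
    """Add one motif per step, interleaving motif choice and attachment choice."""
    rng = random.Random(seed)
    mol = PartialMolecule()
    for _ in range(num_motifs):
        motif = choose_motif(mol, rng)
        if mol.motifs:  # the first motif has nothing to attach to
            parent_idx, parent_atom, child_atom = choose_attachment(mol, motif, rng)
            child_idx = len(mol.motifs)  # index the new motif will receive
            mol.attachments.append((parent_idx, parent_atom, child_idx, child_atom))
        mol.motifs.append(motif)
    return mol


if __name__ == "__main__":
    result = decode(5, seed=0)
    print(result.motifs)
    print(result.attachments)
```

Because every new motif attaches to exactly one earlier motif, the sketch always yields a tree over motifs with one attachment fewer than the number of motifs. A real decoder would replace the two stand-in functions with learned predictors and enforce chemical constraints.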
Related papers
- GraphXForm: Graph transformer for computer-aided molecular design with application to extraction [73.1842164721868]
We present GraphXForm, a decoder-only graph transformer architecture, which is pretrained on existing compounds and then fine-tuned.
We evaluate it on two solvent design tasks for liquid-liquid extraction, showing that it outperforms four state-of-the-art molecular design techniques.
arXiv Detail & Related papers (2024-11-03T19:45:15Z)
- MolGrapher: Graph-based Visual Recognition of Chemical Structures [50.13749978547401]
We introduce MolGrapher to recognize chemical structures visually.
We treat all candidate atoms and bonds as nodes and put them in a graph.
We classify atom and bond nodes in the graph with a Graph Neural Network.
arXiv Detail & Related papers (2023-08-23T16:16:11Z)
- Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules.
By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures.
When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, registering an average ROC-AUC uplift of 13.8% for classification tasks and an average RMSE/MAE enhancement of 35.1% for regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z)
- MolHF: A Hierarchical Normalizing Flow for Molecular Graph Generation [4.517805235253331]
MolHF is a new hierarchical flow-based model that generates molecular graphs in a coarse-to-fine manner.
MolHF is the first flow-based model that can be applied to larger molecules (polymers) with more than 100 heavy atoms.
arXiv Detail & Related papers (2023-05-15T08:59:35Z)
- Molecular Graph Representation Learning via Heterogeneous Motif Graph Construction [19.64574177805823]
We propose a novel molecular graph representation learning method by constructing a heterogeneous motif graph.
In particular, we build a heterogeneous motif graph that contains motif nodes and molecular nodes.
We show that our model achieves similar performance with significantly fewer computational resources by using our edge sampler.
arXiv Detail & Related papers (2022-02-01T16:21:01Z)
- Molecular Graph Generation via Geometric Scattering [7.796917261490019]
Graph neural networks (GNNs) have been used extensively for addressing problems in drug design and discovery.
We propose a representation-first approach to molecular graph generation.
We show that our architecture learns meaningful representations of drug datasets and provides a platform for goal-directed drug synthesis.
arXiv Detail & Related papers (2021-10-12T18:00:23Z)
- GraphPiece: Efficiently Generating High-Quality Molecular Graph with Substructures [7.021635649909492]
We propose a method to automatically discover common substructures, which we call graph pieces, from given molecular graphs.
Based on graph pieces, we leverage a variational autoencoder to generate molecules in two phases: piece-level graph generation followed by bond completion.
arXiv Detail & Related papers (2021-06-29T05:26:18Z)
- Molecular graph generation with Graph Neural Networks [2.7393821783237184]
We introduce a sequential molecular graph generator based on a set of graph neural network modules, which we call MG2N2.
Our model is capable of generalizing molecular patterns seen during the training phase, without overfitting.
arXiv Detail & Related papers (2020-12-14T10:32:57Z)
- Conditional Constrained Graph Variational Autoencoders for Molecule Design [70.59828655929194]
We present the Conditional Constrained Graph Variational Autoencoder (CCGVAE), a model that implements this key idea in a state-of-the-art model.
We show improved results on several evaluation metrics on two commonly adopted datasets for molecule generation.
arXiv Detail & Related papers (2020-09-01T21:58:07Z)
- Self-Supervised Graph Transformer on Large-Scale Molecular Data [73.3448373618865]
We propose a novel framework, GROVER, for molecular representation learning.
GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data.
We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules -- the biggest GNN and the largest training dataset in molecular representation learning.
arXiv Detail & Related papers (2020-06-18T08:37:04Z)