MoFlow: An Invertible Flow Model for Generating Molecular Graphs
- URL: http://arxiv.org/abs/2006.10137v1
- Date: Wed, 17 Jun 2020 20:14:19 GMT
- Title: MoFlow: An Invertible Flow Model for Generating Molecular Graphs
- Authors: Chengxi Zang and Fei Wang
- Abstract summary: MoFlow is a flow-based graph generative model to learn invertible mappings between molecular graphs and latent representations.
Our model has merits including exact and tractable likelihood training, efficient one-pass embedding and generation, chemical validity guarantees, 100% reconstruction of training data, and good generalization ability.
- Score: 19.829612234339578
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating molecular graphs with desired chemical properties driven by deep
graph generative models provides a very promising way to accelerate the drug
discovery process. Such graph generative models usually consist of two steps:
learning latent representations and generation of molecular graphs. However, to
generate novel and chemically-valid molecular graphs from latent
representations is very challenging because of the chemical constraints and
combinatorial complexity of molecular graphs. In this paper, we propose MoFlow,
a flow-based graph generative model to learn invertible mappings between
molecular graphs and their latent representations. To generate molecular
graphs, our MoFlow first generates bonds (edges) through a Glow-based model,
then generates atoms (nodes) given bonds by a novel graph conditional flow, and
finally assembles them into a chemically valid molecular graph with a post hoc
validity correction. Our MoFlow has merits including exact and tractable
likelihood training, efficient one-pass embedding and generation, chemical
validity guarantees, 100% reconstruction of training data, and good
generalization ability. We validate our model by four tasks: molecular graph
generation and reconstruction, visualization of the continuous latent space,
property optimization, and constrained property optimization. Our MoFlow
achieves state-of-the-art performance, which implies its potential efficiency
and effectiveness to explore large chemical space for drug discovery.
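The abstract's claims of exact likelihood and 100% reconstruction both follow from invertibility. The block below is a minimal sketch of one affine coupling layer, the generic building block of Glow-style flows, written in plain Python. It is not MoFlow's architecture: `toy_scale_shift` is a hypothetical stand-in for a learned network, the input is a short list of floats rather than a bond or atom tensor, and an even-length input is assumed.

```python
import math

def toy_scale_shift(x_a):
    # Hypothetical stand-in for a learned network (assumption, not from the paper).
    # It maps the untouched half to a log-scale and a shift for the other half.
    log_s = [math.tanh(v) for v in x_a]
    t = [0.5 * v for v in x_a]
    return log_s, t

def coupling_forward(x, scale_shift):
    """Map data x to latent z; also return log|det Jacobian| of the map."""
    half = len(x) // 2  # assumes len(x) is even
    x_a, x_b = x[:half], x[half:]
    log_s, t = scale_shift(x_a)
    # The first half passes through unchanged; the second half is scaled and shifted.
    z_b = [xb * math.exp(ls) + ti for xb, ls, ti in zip(x_b, log_s, t)]
    # The Jacobian is triangular, so its log-determinant is the sum of log-scales.
    return x_a + z_b, sum(log_s)

def coupling_inverse(z, scale_shift):
    """Recover x from z by undoing the scale and shift."""
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    log_s, t = scale_shift(z_a)  # same inputs as in the forward pass
    x_b = [(zb - ti) * math.exp(-ls) for zb, ls, ti in zip(z_b, log_s, t)]
    return z_a + x_b

def log_likelihood(x, scale_shift):
    """Exact log p(x) under a standard normal prior on z (change of variables)."""
    z, log_det = coupling_forward(x, scale_shift)
    log_pz = sum(-0.5 * v * v - 0.5 * math.log(2 * math.pi) for v in z)
    return log_pz + log_det

x = [0.3, -1.2, 0.8, 2.0]
z, log_det = coupling_forward(x, toy_scale_shift)
x_rec = coupling_inverse(z, toy_scale_shift)
ll = log_likelihood(x, toy_scale_shift)
```

Because the inverse recomputes the same scale and shift from the unchanged half, `x_rec` matches `x` up to floating-point error, and `ll` is an exact log-density rather than a bound.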
Related papers
- Improving Molecular Graph Generation with Flow Matching and Optimal Transport [8.2504828891983]
GGFlow is a discrete flow matching generative model incorporating optimal transport for molecular graphs.
It incorporates an edge-augmented graph transformer to enable direct communication among chemical bonds.
GGFlow demonstrates superior performance on both unconditional and conditional molecule generation tasks.
arXiv Detail & Related papers (2024-11-08T16:27:27Z)
- GraphXForm: Graph transformer for computer-aided molecular design with application to extraction [73.1842164721868]
We present GraphXForm, a decoder-only graph transformer architecture, which is pretrained on existing compounds and then fine-tuned.
We evaluate it on two solvent design tasks for liquid-liquid extraction, showing that it outperforms four state-of-the-art molecular design techniques.
arXiv Detail & Related papers (2024-11-03T19:45:15Z)
- MolGrapher: Graph-based Visual Recognition of Chemical Structures [50.13749978547401]
We introduce MolGrapher to recognize chemical structures visually.
We treat all candidate atoms and bonds as nodes and put them in a graph.
We classify atom and bond nodes in the graph with a Graph Neural Network.
arXiv Detail & Related papers (2023-08-23T16:16:11Z)
- Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules.
By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures.
When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, registering an average ROC-AUC uplift of 13.8% for classification tasks and an average RMSE/MAE enhancement of 35.1% for regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z)
- Graph Generation with Diffusion Mixture [57.78958552860948]
Generation of graphs is a major challenge for real-world tasks that require understanding the complex nature of their non-Euclidean structures.
We propose a generative framework that models the topology of graphs by explicitly learning the final graph structures of the diffusion process.
arXiv Detail & Related papers (2023-02-07T17:07:46Z)
- Molecular Graph Generation via Geometric Scattering [7.796917261490019]
Graph neural networks (GNNs) have been used extensively for addressing problems in drug design and discovery.
We propose a representation-first approach to molecular graph generation.
We show that our architecture learns meaningful representations of drug datasets and provides a platform for goal-directed drug synthesis.
arXiv Detail & Related papers (2021-10-12T18:00:23Z)
- GraphPiece: Efficiently Generating High-Quality Molecular Graph with Substructures [7.021635649909492]
We propose a method to automatically discover common substructures, which we call graph pieces, from given molecular graphs.
Based on graph pieces, we leverage a variational autoencoder to generate molecules in two phases: piece-level graph generation followed by bond completion.
arXiv Detail & Related papers (2021-06-29T05:26:18Z)
- MolCLR: Molecular Contrastive Learning of Representations via Graph Neural Networks [11.994553575596228]
MolCLR is a self-supervised learning framework for large unlabeled molecule datasets.
We propose three novel molecule graph augmentations: atom masking, bond deletion, and subgraph removal.
Our method achieves state-of-the-art performance on many challenging datasets.
arXiv Detail & Related papers (2021-02-19T17:35:18Z)
- Self-Supervised Graph Transformer on Large-Scale Molecular Data [73.3448373618865]
We propose a novel framework, GROVER, for molecular representation learning.
GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data.
We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules -- the biggest GNN and the largest training dataset in molecular representation learning.
arXiv Detail & Related papers (2020-06-18T08:37:04Z)
- GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation [45.360695120154]
We propose a flow-based autoregressive model for graph generation called GraphAF.
GraphAF combines the advantages of both autoregressive and flow-based approaches and enjoys: (1) high model flexibility for data density estimation; (2) efficient parallel computation for training; (3) an iterative sampling process, which allows leveraging chemical domain knowledge for valency checking.
arXiv Detail & Related papers (2020-01-26T01:12:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.