GraphAF: a Flow-based Autoregressive Model for Molecular Graph
Generation
- URL: http://arxiv.org/abs/2001.09382v2
- Date: Thu, 27 Feb 2020 05:34:03 GMT
- Title: GraphAF: a Flow-based Autoregressive Model for Molecular Graph
Generation
- Authors: Chence Shi, Minkai Xu, Zhaocheng Zhu, Weinan Zhang, Ming Zhang, Jian
Tang
- Abstract summary: We propose a flow-based autoregressive model for graph generation called GraphAF.
GraphAF combines the advantages of both autoregressive and flow-based approaches and enjoys: (1) high model flexibility for data density estimation; (2) efficient parallel computation for training; (3) an iterative sampling process, which allows leveraging chemical domain knowledge for valency checking.
- Score: 45.360695120154
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Molecular graph generation is a fundamental problem for drug discovery and
has been attracting growing attention. The problem is challenging since it
requires not only generating chemically valid molecular structures but also
optimizing their chemical properties in the meantime. Inspired by the recent
progress in deep generative models, in this paper we propose a flow-based
autoregressive model for graph generation called GraphAF. GraphAF combines the
advantages of both autoregressive and flow-based approaches and enjoys: (1)
high model flexibility for data density estimation; (2) efficient parallel
computation for training; (3) an iterative sampling process, which allows
leveraging chemical domain knowledge for valency checking. Experimental results
show that GraphAF is able to generate 68% chemically valid molecules even
without chemical knowledge rules and 100% valid molecules with chemical rules.
The training process of GraphAF is two times faster than the existing
state-of-the-art approach GCPN. After fine-tuning the model for goal-directed
property optimization with reinforcement learning, GraphAF achieves
state-of-the-art performance on both chemical property optimization and
constrained property optimization.
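The abstract notes that sampling proceeds step by step, which lets a valency check reject chemically invalid choices as the graph is built. The following is a minimal Python sketch of that idea only, not the authors' implementation: the valence table `MAX_VALENCE`, the helper names, and the "try the next proposed bond order" fallback are all assumptions made for illustration.

```python
# Toy valency check for step-by-step graph generation (illustrative only).
# MAX_VALENCE, used_valence, can_add_bond and add_bond_with_check are
# hypothetical names, not taken from the GraphAF paper or code.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}


def used_valence(bonds, atom_index):
    """Sum the bond orders of all existing bonds that touch atom_index."""
    return sum(order for (a, b), order in bonds.items() if atom_index in (a, b))


def can_add_bond(atoms, bonds, i, j, order):
    """True if a new i-j bond of this order keeps both atoms within MAX_VALENCE."""
    return all(
        used_valence(bonds, k) + order <= MAX_VALENCE[atoms[k]] for k in (i, j)
    )


def add_bond_with_check(atoms, bonds, i, j, proposals):
    """Keep the first proposed bond order that passes the valency check.

    An order of 0 means "no bond" and is always accepted.
    Returns the bond order that was finally chosen.
    """
    for order in proposals:
        if order == 0:
            return 0
        if can_add_bond(atoms, bonds, i, j, order):
            bonds[(i, j)] = order
            return order
    return 0


atoms = ["C", "O", "F"]
bonds = {}
# A triple bond would exceed oxygen's valence of 2, so the double bond is kept.
first = add_bond_with_check(atoms, bonds, 0, 1, [3, 2, 1])
# Oxygen is now saturated, so the O-F single bond is rejected and no bond is added.
second = add_bond_with_check(atoms, bonds, 1, 2, [1, 0])
print(first, second, bonds)
```

In a learned model the `proposals` list would come from the sampler rather than being hard-coded; the point of the sketch is that a per-step check of this kind is only possible because generation is sequential.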
Related papers
- Dynamic and Textual Graph Generation Via Large-Scale LLM-based Agent Simulation [70.60461609393779]
GraphAgent-Generator (GAG) is a novel simulation-based framework for dynamic graph generation.
Our framework effectively replicates seven macro-level structural characteristics in established network science theories.
It supports generating graphs with up to nearly 100,000 nodes or 10 million edges, with a minimum speed-up of 90.4%.
arXiv Detail & Related papers (2024-10-13T12:57:08Z)
- Through the Dual-Prism: A Spectral Perspective on Graph Data Augmentation for Graph Classification [71.36575018271405]
We introduce the Dual-Prism (DP) augmentation method, comprising DP-Noise and DP-Mask.
We find that keeping the low-frequency eigenvalues unchanged can preserve the critical properties at a large scale when generating augmented graphs.
arXiv Detail & Related papers (2024-01-18T12:58:53Z)
- MolGrapher: Graph-based Visual Recognition of Chemical Structures [50.13749978547401]
We introduce MolGrapher to recognize chemical structures visually.
We treat all candidate atoms and bonds as nodes and put them in a graph.
We classify atom and bond nodes in the graph with a Graph Neural Network.
arXiv Detail & Related papers (2023-08-23T16:16:11Z)
- Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules.
By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures.
When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, registering an average ROC-AUC uplift of 13.8% for classification tasks and an average RMSE/MAE enhancement of 35.1% for regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z)
- Graph Generation with Diffusion Mixture [57.78958552860948]
Generation of graphs is a major challenge for real-world tasks that require understanding the complex nature of their non-Euclidean structures.
We propose a generative framework that models the topology of graphs by explicitly learning the final graph structures of the diffusion process.
arXiv Detail & Related papers (2023-02-07T17:07:46Z)
- Conditional Diffusion Based on Discrete Graph Structures for Molecular Graph Generation [32.66694406638287]
We propose a Conditional Diffusion model based on discrete Graph Structures (CDGS) for molecular graph generation.
Specifically, we construct a forward graph diffusion process on both graph structures and inherent features through stochastic differential equations (SDE).
We present a specialized hybrid graph noise prediction model that extracts the global context and the local node-edge dependency from intermediate graph states.
arXiv Detail & Related papers (2023-01-01T15:24:15Z)
- Molecular Graph Generation via Geometric Scattering [7.796917261490019]
Graph neural networks (GNNs) have been used extensively for addressing problems in drug design and discovery.
We propose a representation-first approach to molecular graph generation.
We show that our architecture learns meaningful representations of drug datasets and provides a platform for goal-directed drug synthesis.
arXiv Detail & Related papers (2021-10-12T18:00:23Z)
- GraphPiece: Efficiently Generating High-Quality Molecular Graph with Substructures [7.021635649909492]
We propose a method to automatically discover common substructures, which we call graph pieces, from given molecular graphs.
Based on graph pieces, we leverage a variational autoencoder to generate molecules in two phases: piece-level graph generation followed by bond completion.
arXiv Detail & Related papers (2021-06-29T05:26:18Z)
- MoFlow: An Invertible Flow Model for Generating Molecular Graphs [19.829612234339578]
MoFlow is a flow-based graph generative model to learn invertible mappings between molecular graphs and latent representations.
Our model has merits including exact and tractable likelihood training, efficient one-pass embedding and generation, chemical validity guarantees, 100% reconstruction of training data, and good generalization ability.
arXiv Detail & Related papers (2020-06-17T20:14:19Z)