MolHF: A Hierarchical Normalizing Flow for Molecular Graph Generation
- URL: http://arxiv.org/abs/2305.08457v1
- Date: Mon, 15 May 2023 08:59:35 GMT
- Title: MolHF: A Hierarchical Normalizing Flow for Molecular Graph Generation
- Authors: Yiheng Zhu, Zhenqiu Ouyang, Ben Liao, Jialu Wu, Yixuan Wu, Chang-Yu Hsieh, Tingjun Hou, Jian Wu
- Abstract summary: MolHF is a new hierarchical flow-based model that generates molecular graphs in a coarse-to-fine manner.
MolHF is the first flow-based model that can be applied to larger molecules (polymers) with more than 100 heavy atoms.
- Score: 4.517805235253331
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Molecular de novo design is a critical yet challenging task in scientific
fields, aiming to design novel molecular structures with desired property
profiles. Significant progress has been made by resorting to generative models
for graphs. However, limited attention is paid to hierarchical generative
models, which can exploit the inherent hierarchical structure (with rich
semantic information) of the molecular graphs and generate complex molecules of
larger size that we shall demonstrate to be difficult for most existing models.
The primary challenge in hierarchical generation is the non-differentiability
caused by the generation of intermediate discrete coarsened graph
structures. To sidestep this issue, we cast the tricky hierarchical generation
problem over discrete spaces as the reverse process of hierarchical
representation learning and propose MolHF, a new hierarchical flow-based model
that generates molecular graphs in a coarse-to-fine manner. Specifically, MolHF
first generates bonds through a multi-scale architecture, then generates atoms
based on the coarsened graph structure at each scale. We demonstrate that MolHF
achieves state-of-the-art performance in random generation and property
optimization, implying its high capacity to model data distribution.
Furthermore, MolHF is the first flow-based model that can be applied to model
larger molecules (polymers) with more than 100 heavy atoms. The code and models
are available at https://github.com/violet-sto/MolHF.
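The abstract frames generation as the reverse of representation learning, which is the defining property of a flow-based model: an invertible map encodes data into a latent code, and running the same map backwards generates data. The following is a minimal sketch of that idea using a generic affine coupling layer on a flat feature vector. It is not MolHF's architecture; the function names, the toy linear conditioner, and the weights `w_s` and `w_t` are all hypothetical and chosen only for illustration.

```python
import math


def _scale_and_shift(x_a, w_s, w_t):
    """Toy conditioner (an assumption): linear maps of the untouched half."""
    s = [math.tanh(sum(w * v for w, v in zip(row, x_a))) for row in w_s]
    t = [sum(w * v for w, v in zip(row, x_a)) for row in w_t]
    return s, t


def coupling_forward(x, w_s, w_t):
    """Data -> latent. Returns the latent vector and log|det Jacobian|.

    Assumes len(x) is even; the first half passes through unchanged.
    """
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    s, t = _scale_and_shift(x_a, w_s, w_t)
    z_b = [xb * math.exp(si) + ti for xb, si, ti in zip(x_b, s, t)]
    return x_a + z_b, sum(s)


def coupling_inverse(z, w_s, w_t):
    """Latent -> data. Exactly undoes coupling_forward."""
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    # The conditioner sees the same unchanged half, so s and t are identical.
    s, t = _scale_and_shift(z_a, w_s, w_t)
    x_b = [(zb - ti) * math.exp(-si) for zb, si, ti in zip(z_b, s, t)]
    return z_a + x_b


# Hypothetical weights and input, for illustration only.
w_s = [[0.1, -0.2], [0.3, 0.05]]
w_t = [[0.4, 0.0], [-0.1, 0.2]]
x = [0.5, -1.0, 2.0, 0.25]

z, log_det = coupling_forward(x, w_s, w_t)
x_rec = coupling_inverse(z, w_s, w_t)
```

Because the inverse reuses the unchanged half to recompute the scale and shift, `x_rec` matches `x` up to floating-point error, and the log-determinant is simply the sum of the scales. A hierarchical flow stacks invertible steps like this across several scales.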
Related papers
- GraphXForm: Graph transformer for computer-aided molecular design with application to extraction [73.1842164721868]
We present GraphXForm, a decoder-only graph transformer architecture, which is pretrained on existing compounds and then fine-tuned.
We evaluate it on two solvent design tasks for liquid-liquid extraction, showing that it outperforms four state-of-the-art molecular design techniques.
arXiv Detail & Related papers (2024-11-03T19:45:15Z)
- MAGNet: Motif-Agnostic Generation of Molecules from Shapes [16.188301768974]
MAGNet is a graph-based model that generates abstract shapes before allocating atom and bond types.
We demonstrate that MAGNet's improved expressivity leads to molecules with more topologically distinct structures.
arXiv Detail & Related papers (2023-05-30T15:29:34Z)
- An Equivariant Generative Framework for Molecular Graph-Structure Co-Design [54.92529253182004]
We present MolCode, a machine learning-based generative framework for Molecular graph-structure Co-design.
In MolCode, 3D geometric information empowers the molecular 2D graph generation, which in turn helps guide the prediction of molecular 3D structure.
Our investigation reveals that the 2D topology and 3D geometry contain intrinsically complementary information in molecule design.
arXiv Detail & Related papers (2023-04-12T13:34:22Z)
- Graph Generation with Diffusion Mixture [57.78958552860948]
Generation of graphs is a major challenge for real-world tasks that require understanding the complex nature of their non-Euclidean structures.
We propose a generative framework that models the topology of graphs by explicitly learning the final graph structures of the diffusion process.
arXiv Detail & Related papers (2023-02-07T17:07:46Z)
- MolCPT: Molecule Continuous Prompt Tuning to Generalize Molecular Representation Learning [77.31492888819935]
We propose a novel "pre-train, prompt, fine-tune" paradigm for molecular representation learning, named molecule continuous prompt tuning (MolCPT).
MolCPT defines a motif prompting function that uses the pre-trained model to project the standalone input into an expressive prompt.
Experiments on several benchmark datasets show that MolCPT efficiently generalizes pre-trained GNNs for molecular property prediction.
arXiv Detail & Related papers (2022-12-20T19:32:30Z)
- Molecular Graph Generation via Geometric Scattering [7.796917261490019]
Graph neural networks (GNNs) have been used extensively for addressing problems in drug design and discovery.
We propose a representation-first approach to molecular graph generation.
We show that our architecture learns meaningful representations of drug datasets and provides a platform for goal-directed drug synthesis.
arXiv Detail & Related papers (2021-10-12T18:00:23Z)
- GraphPiece: Efficiently Generating High-Quality Molecular Graph with Substructures [7.021635649909492]
We propose a method to automatically discover common substructures, which we call graph pieces, from given molecular graphs.
Based on graph pieces, we leverage a variational autoencoder to generate molecules in two phases: piece-level graph generation followed by bond completion.
arXiv Detail & Related papers (2021-06-29T05:26:18Z)
- MolGrow: A Graph Normalizing Flow for Hierarchical Molecular Generation [9.594432031144716]
We propose a hierarchical normalizing flow model for generating molecular graphs.
The model produces new molecular structures from a single-node graph by splitting every node into two.
We show successful experiments on global and constrained optimization of chemical properties using latent codes of the model.
arXiv Detail & Related papers (2021-02-03T17:48:52Z)
- Conditional Constrained Graph Variational Autoencoders for Molecule Design [70.59828655929194]
We present Conditional Constrained Graph Variational Autoencoder (CCGVAE), a model that implements this key idea in a state-of-the-art model.
We show improved results on several evaluation metrics on two commonly adopted datasets for molecule generation.
arXiv Detail & Related papers (2020-09-01T21:58:07Z)
- Self-Supervised Graph Transformer on Large-Scale Molecular Data [73.3448373618865]
We propose a novel framework, GROVER, for molecular representation learning.
GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data.
We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules -- the biggest GNN and the largest training dataset in molecular representation learning.
arXiv Detail & Related papers (2020-06-18T08:37:04Z)
- Hierarchical Generation of Molecular Graphs using Structural Motifs [38.637412590671865]
We propose a new hierarchical graph encoder-decoder that employs significantly larger and more flexible graph motifs as basic building blocks.
Our encoder produces a multi-resolution representation for each molecule in a fine-to-coarse fashion, from atoms to connected motifs.
We evaluate our model on multiple molecule generation tasks, including polymers, and show that our model significantly outperforms previous state-of-the-art baselines.
arXiv Detail & Related papers (2020-02-08T21:21:04Z)