Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular
Property Prediction
- URL: http://arxiv.org/abs/2309.01788v1
- Date: Mon, 4 Sep 2023 19:59:51 GMT
- Title: Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular
Property Prediction
- Authors: Minghao Guo, Veronika Thost, Samuel W Song, Adithya Balachandran,
Payel Das, Jie Chen, Wojciech Matusik
- Abstract summary: We propose a data-efficient property predictor by utilizing a learnable hierarchical molecular grammar.
The property prediction is performed using graph neural diffusion over the grammar-induced geometry.
We include a detailed ablation study and further analysis of our solution, showing its effectiveness in cases with extremely limited data.
- Score: 37.443491843178315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The prediction of molecular properties is a crucial task in the field of
material and drug discovery. The potential benefits of using deep learning
techniques are reflected in the wealth of recent literature. Still, these
techniques are faced with a common challenge in practice: Labeled data are
limited by the cost of manual extraction from literature and laborious
experimentation. In this work, we propose a data-efficient property predictor
by utilizing a learnable hierarchical molecular grammar that can generate
molecules from grammar production rules. Such a grammar induces an explicit
geometry of the space of molecular graphs, which provides an informative prior
on molecular structural similarity. The property prediction is performed using
graph neural diffusion over the grammar-induced geometry. On both small and
large datasets, our evaluation shows that this approach outperforms a wide
spectrum of baselines, including supervised and pre-trained graph neural
networks. We include a detailed ablation study and further analysis of our
solution, showing its effectiveness in cases with extremely limited data. Code
is available at https://github.com/gmh14/Geo-DEG.
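
The idea of predicting properties by diffusion over a geometry of molecules can be illustrated with a small sketch. This is not the Geo-DEG implementation: the paper uses a learned grammar and graph neural diffusion, whereas the code below runs plain, non-learned heat diffusion on a hand-made toy graph. The function name `diffuse_labels`, its parameters, and the five-node path graph are all assumptions made for illustration. Each node stands for a molecule, each edge links structurally similar molecules, and the few known property values are held fixed while they spread to unlabeled neighbours.

```python
import numpy as np


def diffuse_labels(adjacency, labels, labeled_mask, step=0.2, n_steps=500):
    """Spread known property values over a similarity graph.

    Hypothetical helper: explicit-Euler heat diffusion with the labeled
    nodes clamped to their known values. Assumes every node has at least
    one neighbour (no zero-degree rows).
    """
    adjacency = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels, dtype=float)
    labeled_mask = np.asarray(labeled_mask, dtype=bool)

    # Random-walk normalisation: each row of `transition` sums to 1.
    degree = adjacency.sum(axis=1)
    transition = adjacency / degree[:, None]

    # Start unlabeled nodes at the mean of the known labels.
    x = np.where(labeled_mask, labels, labels[labeled_mask].mean())
    for _ in range(n_steps):
        # One Euler step of dx/dt = (P - I) x.
        x = x + step * (transition @ x - x)
        # Keep the measured values fixed.
        x[labeled_mask] = labels[labeled_mask]
    return x


# Toy geometry: a path 0 - 1 - 2 - 3 - 4 with only the two ends labeled.
adjacency = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
labels = np.array([0.0, np.nan, np.nan, np.nan, 1.0])  # nan = unknown
labeled_mask = np.array([True, False, False, False, True])

pred = diffuse_labels(adjacency, labels, labeled_mask)
print(np.round(pred, 3))  # approximately [0, 0.25, 0.5, 0.75, 1]
```

With only two labels, the unlabeled molecules receive values interpolated along the graph, which shows in miniature why a good geometry can act as a prior when labeled data are scarce.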
Related papers
- Graph Residual based Method for Molecular Property Prediction [0.7499722271664147]
This manuscript gives a detailed description of ECRGNN, a novel GRU-based methodology, and of how its inputs are mapped.
It also describes the Variational Autoencoder (VAE) and the end-to-end learning method used for multi-class multi-label property prediction.
arXiv Detail & Related papers (2024-07-27T09:01:36Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Extreme Acceleration of Graph Neural Network-based Prediction Models for Quantum Chemistry [7.592530794455257]
We present a novel hardware-software co-design approach to scale up the training of graph neural networks for molecular property prediction.
We introduce an algorithm that coalesces batches of molecular graphs into fixed-size packs to eliminate redundant computation and memory.
We demonstrate that such a co-design approach can reduce the training time of such molecular property prediction models from days to less than two hours.
arXiv Detail & Related papers (2022-11-25T01:30:18Z)
- Graph neural networks for the prediction of molecular structure-property relationships [59.11160990637615]
Graph neural networks (GNNs) are a novel machine learning method that works directly on the molecular graph.
GNNs allow properties to be learned in an end-to-end fashion, thereby avoiding the need for informative descriptors.
We describe the fundamentals of GNNs and demonstrate the application of GNNs via two examples for molecular property prediction.
arXiv Detail & Related papers (2022-07-25T11:30:44Z)
- Graph-in-Graph (GiG): Learning interpretable latent graphs in non-Euclidean domain for biological and healthcare applications [52.65389473899139]
Graphs are a powerful tool for representing and analyzing unstructured, non-Euclidean data ubiquitous in the healthcare domain.
Recent works have shown that considering relationships between input data samples has a positive regularizing effect on the downstream task.
We propose Graph-in-Graph (GiG), a neural network architecture for protein classification and brain imaging applications.
arXiv Detail & Related papers (2022-04-01T10:01:37Z)
- Data-Efficient Graph Grammar Learning for Molecular Generation [41.936515793383]
We propose a data-efficient generative model that can be learned from datasets orders of magnitude smaller than common benchmarks.
Our learned graph grammar yields state-of-the-art results on generating high-quality molecules for three monomer datasets.
Our approach also achieves remarkable performance in a challenging polymer generation task with only 117 training samples.
arXiv Detail & Related papers (2022-03-15T16:14:30Z)
- Molecular Graph Generation via Geometric Scattering [7.796917261490019]
Graph neural networks (GNNs) have been used extensively for addressing problems in drug design and discovery.
We propose a representation-first approach to molecular graph generation.
We show that our architecture learns meaningful representations of drug datasets and provides a platform for goal-directed drug synthesis.
arXiv Detail & Related papers (2021-10-12T18:00:23Z)
- GeomGCL: Geometric Graph Contrastive Learning for Molecular Property Prediction [47.70253904390288]
We propose a novel graph contrastive learning method utilizing the geometry of a molecule across 2D and 3D views.
Specifically, we first devise a dual-view geometric message passing network (GeomMPNN) to adaptively leverage the rich information of both 2D and 3D graphs of a molecule.
arXiv Detail & Related papers (2021-09-24T03:55:27Z)
- Reinforced Molecular Optimization with Neighborhood-Controlled Grammars [63.84003497770347]
We propose MNCE-RL, a graph convolutional policy network for molecular optimization.
We extend the original neighborhood-controlled embedding grammars to make them applicable to molecular graph generation.
We show that our approach achieves state-of-the-art performance in a diverse range of molecular optimization tasks.
arXiv Detail & Related papers (2020-11-14T05:42:15Z)
- Self-Supervised Graph Transformer on Large-Scale Molecular Data [73.3448373618865]
We propose a novel framework, GROVER, for molecular representation learning.
GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data.
We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules -- the biggest GNN and the largest training dataset in molecular representation learning.
arXiv Detail & Related papers (2020-06-18T08:37:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.