Motif-based Graph Self-Supervised Learning for Molecular Property Prediction
- URL: http://arxiv.org/abs/2110.00987v1
- Date: Sun, 3 Oct 2021 11:45:51 GMT
- Title: Motif-based Graph Self-Supervised Learning for Molecular Property Prediction
- Authors: Zaixi Zhang, Qi Liu, Hao Wang, Chengqiang Lu, Chee-Kong Lee
- Abstract summary: Graph Neural Networks (GNNs) have demonstrated remarkable success in various molecular generation and prediction tasks.
Most existing self-supervised pre-training frameworks for GNNs only focus on node-level or graph-level tasks.
We propose a novel self-supervised motif generation framework for GNNs.
- Score: 12.789013658551454
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Predicting molecular properties with data-driven methods has drawn much
attention in recent years. Particularly, Graph Neural Networks (GNNs) have
demonstrated remarkable success in various molecular generation and prediction
tasks. In cases where labeled data is scarce, GNNs can be pre-trained on
unlabeled molecular data to first learn the general semantic and structural
information before being fine-tuned for specific tasks. However, most existing
self-supervised pre-training frameworks for GNNs only focus on node-level or
graph-level tasks. These approaches cannot capture the rich information in
subgraphs or graph motifs. For example, functional groups (frequently occurring
subgraphs in molecular graphs) often carry indicative information about the
molecular properties. To bridge this gap, we propose Motif-based Graph
Self-supervised Learning (MGSSL) by introducing a novel self-supervised motif
generation framework for GNNs. First, for motif extraction from molecular
graphs, we design a molecule fragmentation method that leverages a
retrosynthesis-based algorithm BRICS and additional rules for controlling the
size of motif vocabulary. Second, we design a general motif-based generative
pre-training framework in which GNNs are asked to make topological and label
predictions. This generative framework can be implemented in two different
ways, i.e., breadth-first or depth-first. Finally, to take the multi-scale
information in molecular graphs into consideration, we introduce a multi-level
self-supervised pre-training. Extensive experiments on various downstream
benchmark tasks show that our methods outperform all state-of-the-art
baselines.
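The abstract states that the generative pre-training framework can order motif predictions in two ways, breadth-first or depth-first, over the motif structure extracted from a molecule. The sketch below illustrates the difference between the two generation orders on a toy motif tree; the motif names and tree topology are illustrative assumptions, not taken from the paper.

```python
from collections import deque

# Hypothetical motif tree for a small molecule: each key is a motif
# (e.g. a BRICS fragment) and its list records attached child motifs.
MOTIF_TREE = {
    "benzene": ["amide", "carboxyl"],
    "amide": ["methyl"],
    "methyl": [],
    "carboxyl": [],
}

def bfs_order(tree, root):
    """Breadth-first generation: predict all motifs attached to the
    current one before moving deeper into the tree."""
    order, queue, seen = [], deque([root]), {root}
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in tree[node]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order

def dfs_order(tree, root):
    """Depth-first generation: follow one branch of the motif tree to
    its end before backtracking to predict sibling motifs."""
    order, stack, seen = [], [root], {root}
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children in reverse so the first child is expanded first.
        for child in reversed(tree[node]):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return order

print(bfs_order(MOTIF_TREE, "benzene"))  # ['benzene', 'amide', 'carboxyl', 'methyl']
print(dfs_order(MOTIF_TREE, "benzene"))  # ['benzene', 'amide', 'methyl', 'carboxyl']
```

Note how the two orders diverge after the root: breadth-first emits both of benzene's neighbors before methyl, while depth-first finishes the amide branch first. In MGSSL, the GNN would make a topological prediction (is there another motif attached?) and a label prediction (which motif?) at each step of such a traversal.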
Related papers
- Neural Graph Pattern Machine [50.78679002846741]
We propose the Neural Graph Pattern Machine (GPM), a framework designed to learn directly from graph patterns.
GPM efficiently extracts and encodes substructures while identifying the most relevant ones for downstream tasks.
arXiv Detail & Related papers (2025-01-30T20:37:47Z)
- Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements [54.006506479865344]
We propose a unified evaluation framework for graph-level Graph Neural Networks (GNNs)
This framework provides a standardized setting to evaluate GNNs across diverse datasets.
We also propose a novel GNN model with enhanced expressivity and generalization capabilities.
arXiv Detail & Related papers (2025-01-01T08:48:53Z)
- Pre-training Graph Neural Networks on Molecules by Using Subgraph-Conditioned Graph Information Bottleneck [2.137573128343838]
This study aims to build a pre-trained Graph Neural Network (GNN) model on molecules without human annotations or prior knowledge.
We propose a novel Subgraph-conditioned Graph Information Bottleneck, named S-CGIB, for pre-training GNNs to recognize core subgraphs (graph cores) and significant subgraphs.
arXiv Detail & Related papers (2024-12-20T05:52:30Z)
- MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation [16.129359492539095]
Graph Neural Networks (GNNs) have shown remarkable success in molecular tasks, yet their interpretability remains challenging.
Traditional model-level explanation methods like XGNN and GNNInterpreter often fail to identify valid substructures like rings, leading to questionable interpretability.
We introduce an innovative Motif-bAsed GNN Explainer (MAGE) that uses motifs as fundamental units for generating explanations.
arXiv Detail & Related papers (2024-05-21T06:12:24Z)
- Fragment-based Pretraining and Finetuning on Molecular Graphs [0.0]
Unlabeled molecular data has become abundant, which facilitates the rapid development of self-supervised learning for GNNs in the chemical domain.
We propose pretraining GNNs at the fragment level, a promising middle ground to overcome the limitations of node-level and graph-level pretraining.
Our graph fragment-based pretraining (GraphFP) advances the performances on 5 out of 8 common molecular benchmarks and improves the performances on long-range biological benchmarks by at least 11.5%.
arXiv Detail & Related papers (2023-10-05T03:01:09Z)
- MolCPT: Molecule Continuous Prompt Tuning to Generalize Molecular Representation Learning [77.31492888819935]
We propose a novel paradigm of "pre-train, prompt, fine-tune" for molecular representation learning, named molecule continuous prompt tuning (MolCPT).
MolCPT defines a motif prompting function that uses the pre-trained model to project the standalone input into an expressive prompt.
Experiments on several benchmark datasets show that MolCPT efficiently generalizes pre-trained GNNs for molecular property prediction.
arXiv Detail & Related papers (2022-12-20T19:32:30Z)
- HiGNN: Hierarchical Informative Graph Neural Networks for Molecular Property Prediction Equipped with Feature-Wise Attention [5.735627221409312]
We propose a well-designed hierarchical informative graph neural network framework (termed HiGNN) for molecular property prediction.
Experiments demonstrate that HiGNN achieves state-of-the-art predictive performance on many challenging drug discovery-associated benchmark datasets.
arXiv Detail & Related papers (2022-08-30T05:16:15Z)
- MolGraph: a Python package for the implementation of molecular graphs and graph neural networks with TensorFlow and Keras [51.92255321684027]
MolGraph is a graph neural network (GNN) package for molecular machine learning (ML).
MolGraph implements a chemistry module to accommodate the generation of small molecular graphs, which can be passed to a GNN algorithm to solve a molecular ML problem.
GNNs proved useful for molecular identification and improved interpretability of chromatographic retention time data.
arXiv Detail & Related papers (2022-08-21T18:37:41Z)
- Graph neural networks for the prediction of molecular structure-property relationships [59.11160990637615]
Graph neural networks (GNNs) are a machine learning method that works directly on the molecular graph.
GNNs learn properties in an end-to-end fashion, thereby avoiding the need for informative descriptors.
We describe the fundamentals of GNNs and demonstrate the application of GNNs via two examples for molecular property prediction.
arXiv Detail & Related papers (2022-07-25T11:30:44Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown a powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- Molecular Graph Generation via Geometric Scattering [7.796917261490019]
Graph neural networks (GNNs) have been used extensively for addressing problems in drug design and discovery.
We propose a representation-first approach to molecular graph generation.
We show that our architecture learns meaningful representations of drug datasets and provides a platform for goal-directed drug synthesis.
arXiv Detail & Related papers (2021-10-12T18:00:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.