Motif-based Graph Self-Supervised Learning for Molecular Property
Prediction
- URL: http://arxiv.org/abs/2110.00987v1
- Date: Sun, 3 Oct 2021 11:45:51 GMT
- Title: Motif-based Graph Self-Supervised Learning for Molecular Property
Prediction
- Authors: Zaixi Zhang, Qi Liu, Hao Wang, Chengqiang Lu, Chee-Kong Lee
- Abstract summary: Graph Neural Networks (GNNs) have demonstrated remarkable success in various molecular generation and prediction tasks.
Most existing self-supervised pre-training frameworks for GNNs only focus on node-level or graph-level tasks.
We propose a novel self-supervised motif generation framework for GNNs.
- Score: 12.789013658551454
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Predicting molecular properties with data-driven methods has drawn much
attention in recent years. Particularly, Graph Neural Networks (GNNs) have
demonstrated remarkable success in various molecular generation and prediction
tasks. In cases where labeled data is scarce, GNNs can be pre-trained on
unlabeled molecular data to first learn the general semantic and structural
information before being fine-tuned for specific tasks. However, most existing
self-supervised pre-training frameworks for GNNs only focus on node-level or
graph-level tasks. These approaches cannot capture the rich information in
subgraphs or graph motifs. For example, functional groups (frequently occurring
subgraphs in molecular graphs) often carry indicative information about
molecular properties. To bridge this gap, we propose Motif-based Graph
Self-supervised Learning (MGSSL) by introducing a novel self-supervised motif
generation framework for GNNs. First, for motif extraction from molecular
graphs, we design a molecule fragmentation method that leverages the
retrosynthesis-based algorithm BRICS and additional rules that control the
size of the motif vocabulary. Second, we design a general motif-based generative
pre-training framework in which GNNs are asked to make topological and label
predictions. This generative framework can be implemented in two different
ways, i.e., breadth-first or depth-first. Finally, to take the multi-scale
information in molecular graphs into consideration, we introduce a multi-level
self-supervised pre-training scheme. Extensive experiments on various downstream
benchmark tasks show that our methods outperform all state-of-the-art
baselines.
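The abstract notes that the generative pre-training framework can produce motifs in either breadth-first or depth-first order. The difference between the two traversal orders can be sketched in plain Python over a hypothetical motif tree; the motif names and tree structure below are illustrative only, not taken from the paper:

```python
from collections import deque

# Hypothetical motif tree for one molecule: each key is a motif label and
# its value lists the motifs attached to it. Names are illustrative.
motif_tree = {
    "benzene": ["carboxyl", "amine"],
    "carboxyl": [],
    "amine": ["methyl"],
    "methyl": [],
}

def generation_order(tree, root, mode="bfs"):
    """Return the order in which motifs would be generated from the root.

    mode="bfs" expands all neighbors of a motif before moving deeper;
    mode="dfs" follows one branch to its end before backtracking.
    """
    order, frontier, seen = [], deque([root]), {root}
    while frontier:
        node = frontier.popleft() if mode == "bfs" else frontier.pop()
        order.append(node)
        for child in tree[node]:
            if child not in seen:
                seen.add(child)
                frontier.append(child)
    return order

print(generation_order(motif_tree, "benzene", "bfs"))
# ['benzene', 'carboxyl', 'amine', 'methyl']
print(generation_order(motif_tree, "benzene", "dfs"))
# ['benzene', 'amine', 'methyl', 'carboxyl']
```

In the paper's setting the GNN would make a topology prediction (is there a next motif?) and a label prediction (which motif?) at each step of such an ordering; the sketch above only shows the two visit orders themselves.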
Related papers
- MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation [16.129359492539095]
Graph Neural Networks (GNNs) have shown remarkable success in molecular tasks, yet their interpretability remains challenging.
Traditional model-level explanation methods like XGNN and GNNInterpreter often fail to identify valid substructures like rings, leading to questionable interpretability.
We introduce an innovative Motif-bAsed GNN Explainer (MAGE) that uses motifs as fundamental units for generating explanations.
arXiv Detail & Related papers (2024-05-21T06:12:24Z)
- Fragment-based Pretraining and Finetuning on Molecular Graphs [0.0]
Unlabeled molecular data has become abundant, which facilitates the rapid development of self-supervised learning for GNNs in the chemical domain.
We propose pretraining GNNs at the fragment level, a promising middle ground to overcome the limitations of node-level and graph-level pretraining.
Our graph fragment-based pretraining (GraphFP) improves performance on 5 out of 8 common molecular benchmarks and on long-range biological benchmarks by at least 11.5%.
arXiv Detail & Related papers (2023-10-05T03:01:09Z)
- Will More Expressive Graph Neural Networks do Better on Generative Tasks? [27.412913421460388]
Graph Neural Network (GNN) architectures are often underexplored.
We replace the underlying GNNs of graph generative models with more expressive GNNs.
We find that advanced GNNs can achieve state-of-the-art results against 17 non-GNN-based graph generative approaches.
arXiv Detail & Related papers (2023-08-23T07:57:45Z)
- GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning [71.89623260998934]
This study investigates the feasibility of employing natural language instructions to accomplish molecule-related tasks in a zero-shot setting.
Existing molecule-text models perform poorly in this setting due to inadequate treatment of instructions and limited capacity for graphs.
We propose GIMLET, which unifies language models for both graph and text data.
arXiv Detail & Related papers (2023-05-28T18:27:59Z)
- MolCPT: Molecule Continuous Prompt Tuning to Generalize Molecular Representation Learning [77.31492888819935]
We propose a novel paradigm of "pre-train, prompt, fine-tune" for molecular representation learning, named molecule continuous prompt tuning (MolCPT).
MolCPT defines a motif prompting function that uses the pre-trained model to project the standalone input into an expressive prompt.
Experiments on several benchmark datasets show that MolCPT efficiently generalizes pre-trained GNNs for molecular property prediction.
arXiv Detail & Related papers (2022-12-20T19:32:30Z)
- HiGNN: Hierarchical Informative Graph Neural Networks for Molecular Property Prediction Equipped with Feature-Wise Attention [5.735627221409312]
We propose a well-designed hierarchical informative graph neural network framework (termed HiGNN) for molecular property prediction.
Experiments demonstrate that HiGNN achieves state-of-the-art predictive performance on many challenging drug discovery-associated benchmark datasets.
arXiv Detail & Related papers (2022-08-30T05:16:15Z)
- MolGraph: a Python package for the implementation of molecular graphs and graph neural networks with TensorFlow and Keras [51.92255321684027]
MolGraph is a graph neural network (GNN) package for molecular machine learning (ML).
MolGraph implements a chemistry module to accommodate the generation of small molecular graphs, which can be passed to a GNN algorithm to solve a molecular ML problem.
GNNs proved useful for molecular identification and improved interpretability of chromatographic retention time data.
arXiv Detail & Related papers (2022-08-21T18:37:41Z)
- Graph neural networks for the prediction of molecular structure-property relationships [59.11160990637615]
Graph neural networks (GNNs) are a machine learning method that works directly on the molecular graph.
GNNs allow properties to be learned in an end-to-end fashion, thereby avoiding the need for hand-crafted descriptors.
We describe the fundamentals of GNNs and demonstrate the application of GNNs via two examples for molecular property prediction.
arXiv Detail & Related papers (2022-07-25T11:30:44Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structured data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- Molecular Graph Generation via Geometric Scattering [7.796917261490019]
Graph neural networks (GNNs) have been used extensively for addressing problems in drug design and discovery.
We propose a representation-first approach to molecular graph generation.
We show that our architecture learns meaningful representations of drug datasets and provides a platform for goal-directed drug synthesis.
arXiv Detail & Related papers (2021-10-12T18:00:23Z)
- Self-Supervised Graph Transformer on Large-Scale Molecular Data [73.3448373618865]
We propose a novel framework, GROVER, for molecular representation learning.
GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data.
We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules -- the biggest GNN and the largest training dataset in molecular representation learning.
arXiv Detail & Related papers (2020-06-18T08:37:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.