Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations
- URL: http://arxiv.org/abs/2306.01631v6
- Date: Sun, 16 Feb 2025 05:22:45 GMT
- Title: Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations
- Authors: Pengcheng Jiang, Cao Xiao, Tianfan Fu, Parminder Bhatia, Taha Kass-Hout, Jimeng Sun, Jiawei Han
- Abstract summary: We introduce GODE, which accounts for the dual-level structure inherent in molecules.
Molecules possess an intrinsic graph structure and simultaneously function as nodes within a broader molecular knowledge graph.
By pre-training two GNNs on different graph structures, GODE effectively fuses molecular structures with their corresponding knowledge graph substructures.
- Score: 68.32093648671496
- Abstract: Molecular representation learning is vital for various downstream applications, including the analysis and prediction of molecular properties and side effects. While Graph Neural Networks (GNNs) have been a popular framework for modeling molecular data, they often struggle to capture the full complexity of molecular representations. In this paper, we introduce a novel method called GODE, which accounts for the dual-level structure inherent in molecules. Molecules possess an intrinsic graph structure and simultaneously function as nodes within a broader molecular knowledge graph. GODE integrates individual molecular graph representations with multi-domain biochemical data from knowledge graphs. By pre-training two GNNs on different graph structures and employing contrastive learning, GODE effectively fuses molecular structures with their corresponding knowledge graph substructures. This fusion yields a more robust and informative representation, enhancing molecular property predictions by leveraging both chemical and biological information. When fine-tuned across 11 chemical property tasks, our model significantly outperforms existing benchmarks, achieving an average ROC-AUC improvement of 12.7% for classification tasks and an average RMSE/MAE improvement of 34.4% for regression tasks. Notably, GODE surpasses the current leading model in property prediction, with advancements of 2.2% in classification and 7.2% in regression tasks.
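The abstract describes the contrastive fusion only at a high level. As a rough illustration, here is a minimal sketch, assuming a standard symmetric cross-view InfoNCE objective, of how embeddings from a molecule-graph GNN and a knowledge-graph-substructure GNN could be contrasted in PyTorch. The function name, the temperature value, and the stand-in encoders are illustrative assumptions, not the paper's released code.

```python
# Minimal sketch (not the authors' implementation): a cross-view InfoNCE loss
# that pulls a molecule's graph embedding toward the embedding of its
# knowledge-graph neighborhood and pushes it away from other molecules in the
# batch. Assumes two encoders already map each view to a d-dimensional vector.
import torch
import torch.nn.functional as F


def cross_view_info_nce(mol_emb: torch.Tensor,
                        kg_emb: torch.Tensor,
                        temperature: float = 0.1) -> torch.Tensor:
    """mol_emb, kg_emb: [batch, d] embeddings of the same molecules in two views."""
    mol_emb = F.normalize(mol_emb, dim=-1)
    kg_emb = F.normalize(kg_emb, dim=-1)
    logits = mol_emb @ kg_emb.t() / temperature  # pairwise cosine similarities
    targets = torch.arange(mol_emb.size(0), device=mol_emb.device)
    # Symmetric loss: molecule-to-KG retrieval and KG-to-molecule retrieval.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    # Random stand-ins; real inputs would be the outputs of the two pre-trained
    # GNNs (molecular graph encoder and KG-substructure encoder) in the paper.
    mol_emb = torch.randn(32, 128)
    kg_emb = torch.randn(32, 128)
    print(cross_view_info_nce(mol_emb, kg_emb).item())
```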
Related papers
- Knowledge-aware contrastive heterogeneous molecular graph learning [77.94721384862699]
We propose Knowledge-aware Contrastive Heterogeneous Molecular graph Learning (KCHML), a paradigm shift in encoding molecular graphs.
KCHML conceptualizes molecules through three distinct graph views (molecular, elemental, and pharmacological), enhanced by heterogeneous molecular graphs and a dual message-passing mechanism.
This design offers a comprehensive representation for property prediction, as well as for downstream tasks such as drug-drug interaction (DDI) prediction.
arXiv Detail & Related papers (2025-02-17T11:53:58Z)
- MolGraph-xLSTM: A graph-based dual-level xLSTM framework with multi-head mixture-of-experts for enhanced molecular representation and interpretability [9.858315463084084]
MolGraph-xLSTM is a graph-based xLSTM model that enhances feature extraction and effectively models long-range interactions within molecules.
Our approach processes molecular graphs at two scales: atom-level and motif-level.
We validate MolGraph-xLSTM on 10 molecular property prediction datasets, covering both classification and regression tasks.
arXiv Detail & Related papers (2025-01-30T15:47:59Z)
- Molecular Graph Representation Learning via Structural Similarity Information [11.38130169319915]
We introduce the Molecular Structural Similarity Motif GNN (MSSM-GNN), a novel molecular graph representation learning method.
In particular, we propose a specially designed graph that leverages graph kernel algorithms to represent the similarity between molecules quantitatively.
We employ GNNs to learn feature representations from molecular graphs, aiming to enhance the accuracy of property prediction by incorporating additional molecular representation information.
arXiv Detail & Related papers (2024-09-13T06:59:10Z)
- Molecular Property Prediction Based on Graph Structure Learning [29.516479802217205]
We propose a graph structure learning (GSL) based MPP approach, called GSL-MPP.
Specifically, we first apply graph neural network (GNN) over molecular graphs to extract molecular representations.
With molecular fingerprints, we construct a molecular similarity graph (MSG)
arXiv Detail & Related papers (2023-12-28T06:45:13Z)
- Extracting Molecular Properties from Natural Language with Multimodal Contrastive Learning [1.3717673827807508]
We study how molecular property information can be transferred from natural language to graph representations.
We implement neural relevance scoring strategies to improve text retrieval and introduce a novel chemically valid molecular graph augmentation strategy.
We achieve a +4.26% AUROC gain over models pre-trained on the graph modality alone, and a +1.54% gain over the recently proposed MoMu model, which is contrastively trained on molecular graphs and text.
arXiv Detail & Related papers (2023-07-22T10:32:58Z)
- Atomic and Subgraph-aware Bilateral Aggregation for Molecular Representation Learning [57.670845619155195]
We introduce a new model for molecular representation learning called the Atomic and Subgraph-aware Bilateral Aggregation (ASBA)
ASBA addresses the limitations of previous atom-wise and subgraph-wise models by incorporating both types of information.
Our method offers a more comprehensive way to learn representations for molecular property prediction and has broad potential in drug and material discovery applications.
arXiv Detail & Related papers (2023-05-22T00:56:00Z)
- Graph neural networks for the prediction of molecular structure-property relationships [59.11160990637615]
Graph neural networks (GNNs) are a novel machine learning method that works directly on the molecular graph.
GNNs make it possible to learn properties in an end-to-end fashion, thereby avoiding the need for informative descriptors.
We describe the fundamentals of GNNs and demonstrate the application of GNNs via two examples for molecular property prediction.
arXiv Detail & Related papers (2022-07-25T11:30:44Z)
- ASGN: An Active Semi-supervised Graph Neural Network for Molecular Property Prediction [61.33144688400446]
We propose a novel framework called Active Semi-supervised Graph Neural Network (ASGN) by incorporating both labeled and unlabeled molecules.
In the teacher model, we propose a novel semi-supervised learning method to learn general representation that jointly exploits information from molecular structure and molecular distribution.
Finally, we propose a novel active learning strategy based on molecular diversity to select informative data throughout the training of the framework.
arXiv Detail & Related papers (2020-07-07T04:22:39Z)
- Self-Supervised Graph Transformer on Large-Scale Molecular Data [73.3448373618865]
We propose a novel framework, GROVER, for molecular representation learning.
GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data.
We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules -- the biggest GNN and the largest training dataset in molecular representation learning.
arXiv Detail & Related papers (2020-06-18T08:37:04Z)