MolCLR: Molecular Contrastive Learning of Representations via Graph
Neural Networks
- URL: http://arxiv.org/abs/2102.10056v1
- Date: Fri, 19 Feb 2021 17:35:18 GMT
- Title: MolCLR: Molecular Contrastive Learning of Representations via Graph
Neural Networks
- Authors: Yuyang Wang, Jianren Wang, Zhonglin Cao, Amir Barati Farimani
- Abstract summary: MolCLR is a self-supervised learning framework for large unlabeled molecule datasets.
We propose three novel molecule graph augmentations: atom masking, bond deletion, and subgraph removal.
Our method achieves state-of-the-art performance on many challenging datasets.
- Score: 11.994553575596228
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Molecular machine learning bears promise for efficient molecule property
prediction and drug discovery. However, because labeled data are limited and the
chemical space is vast, machine learning models trained via supervised learning
generalize poorly. This greatly limits the applications of
machine learning methods for molecular design and discovery. In this work, we
present MolCLR: Molecular Contrastive Learning of Representations via Graph
Neural Networks (GNNs), a self-supervised learning framework for large
unlabeled molecule datasets. Specifically, we first build a molecular graph,
where each node represents an atom and each edge represents a chemical bond. A
GNN is then used to encode the molecule graph. We propose three novel molecule
graph augmentations: atom masking, bond deletion, and subgraph removal. A
contrastive estimator is utilized to maximize the agreement of different graph
augmentations from the same molecule. Experiments show that molecule
representations learned by MolCLR can be transferred to multiple downstream
molecular property prediction tasks. Our method thus achieves state-of-the-art
performance on many challenging datasets. We also demonstrate the effectiveness
of our proposed molecule graph augmentations on supervised molecular
classification tasks.
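The three augmentations named in the abstract can be sketched in plain Python. This is a minimal illustration on a hypothetical atom-list/bond-list graph representation; the function names, parameters (e.g. the 25% `ratio`), and toy molecule are assumptions for exposition, not the authors' implementation.

```python
import random

def atom_masking(atoms, bonds, ratio=0.25, mask_token="[MASK]", rng=None):
    """Replace a random fraction of atom labels with a mask token."""
    rng = rng or random.Random(0)
    n_mask = max(1, int(len(atoms) * ratio))
    idx = set(rng.sample(range(len(atoms)), n_mask))
    masked = [mask_token if i in idx else a for i, a in enumerate(atoms)]
    return masked, list(bonds)

def bond_deletion(atoms, bonds, ratio=0.25, rng=None):
    """Delete a random fraction of bonds (edges)."""
    rng = rng or random.Random(0)
    n_del = max(1, int(len(bonds) * ratio))
    drop = set(rng.sample(range(len(bonds)), n_del))
    return list(atoms), [b for i, b in enumerate(bonds) if i not in drop]

def subgraph_removal(atoms, bonds, ratio=0.25, rng=None):
    """Remove a connected subgraph grown outward from a random seed atom."""
    rng = rng or random.Random(0)
    n_remove = max(1, int(len(atoms) * ratio))
    removed = {rng.randrange(len(atoms))}
    while len(removed) < n_remove:
        # Frontier: neighbors of the removed set, found via incident bonds.
        frontier = [v for u, v in bonds if u in removed and v not in removed]
        frontier += [u for u, v in bonds if v in removed and u not in removed]
        if not frontier:
            break
        removed.add(rng.choice(frontier))
    kept = [i for i in range(len(atoms)) if i not in removed]
    remap = {old: new for new, old in enumerate(kept)}
    new_atoms = [atoms[i] for i in kept]
    new_bonds = [(remap[u], remap[v]) for u, v in bonds
                 if u not in removed and v not in removed]
    return new_atoms, new_bonds

# Ethanol as a toy heavy-atom graph: C-C-O
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]
```

In MolCLR proper, two such augmented views of the same molecule form a positive pair for the contrastive estimator, while views of different molecules form negatives.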
Related papers
- Molecular Property Prediction Based on Graph Structure Learning [29.516479802217205]
We propose a graph structure learning (GSL) based MPP approach, called GSL-MPP.
Specifically, we first apply graph neural network (GNN) over molecular graphs to extract molecular representations.
With molecular fingerprints, we construct a molecular similarity graph (MSG).
arXiv Detail & Related papers (2023-12-28T06:45:13Z) - MultiModal-Learning for Predicting Molecular Properties: A Framework Based on Image and Graph Structures [2.5563339057415218]
MolIG is a novel MultiModaL molecular pre-training framework for predicting molecular properties based on Image and Graph structures.
It amalgamates the strengths of both molecular representation forms.
It achieves improved performance on downstream molecular property prediction tasks across benchmark groups.
arXiv Detail & Related papers (2023-11-28T10:28:35Z) - MolGrapher: Graph-based Visual Recognition of Chemical Structures [50.13749978547401]
We introduce MolGrapher to recognize chemical structures visually.
We treat all candidate atoms and bonds as nodes and put them in a graph.
We classify atom and bond nodes in the graph with a Graph Neural Network.
arXiv Detail & Related papers (2023-08-23T16:16:11Z) - Bi-level Contrastive Learning for Knowledge-Enhanced Molecule
Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules.
By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures.
When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, with an average improvement of 13.8% in ROC-AUC on classification tasks and an average improvement of 35.1% in RMSE/MAE on regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z) - Conditional Graph Information Bottleneck for Molecular Relational
Learning [9.56625683182106]
We propose a novel relational learning framework, CGIB, that predicts the interaction behavior between a pair of graphs by detecting core subgraphs therein.
Our proposed method mimics the nature of chemical reactions, i.e., the core substructure of a molecule varies depending on which other molecule it interacts with.
arXiv Detail & Related papers (2023-04-29T01:17:43Z) - Molecular Contrastive Learning with Chemical Element Knowledge Graph [16.136921143416927]
Molecular representation learning contributes to multiple downstream tasks such as molecular property prediction and drug design.
We construct a Chemical Element Knowledge Graph (KG) to summarize microscopic associations between elements.
The first module, knowledge-guided graph augmentation, augments the original molecular graph based on the Chemical Element KG.
The second module, knowledge-aware graph representation, extracts molecular representations with a common graph encoder for the original molecular graph and a Knowledge-aware Message Passing Neural Network (KMPNN) to encode complex information in the augmented molecular graph.
arXiv Detail & Related papers (2021-12-01T15:04:39Z) - Chemical-Reaction-Aware Molecule Representation Learning [88.79052749877334]
We propose using chemical reactions to assist learning molecule representation.
Our approach is shown to be effective at 1) keeping the embedding space well organized and 2) improving the generalization ability of molecule embeddings.
Experimental results demonstrate that our method achieves state-of-the-art performance in a variety of downstream tasks.
arXiv Detail & Related papers (2021-09-21T00:08:43Z) - Advanced Graph and Sequence Neural Networks for Molecular Property
Prediction and Drug Discovery [53.00288162642151]
We develop MoleculeKit, a suite of comprehensive machine learning tools spanning different computational models and molecular representations.
Built on these representations, MoleculeKit includes both deep learning and traditional machine learning methods for graph and sequence data.
Results on both online and offline antibiotics discovery and molecular property prediction tasks show that MoleculeKit achieves consistent improvements over prior methods.
arXiv Detail & Related papers (2020-12-02T02:09:31Z) - Heterogeneous Molecular Graph Neural Networks for Predicting Molecule
Properties [12.897488702184306]
We introduce a novel graph representation of molecules, the heterogeneous molecular graph (HMG).
HMGNN incorporates global molecule representations and an attention mechanism into the prediction process.
Our model achieves state-of-the-art performance in 9 out of 12 tasks on the QM9 dataset.
arXiv Detail & Related papers (2020-09-26T23:29:41Z) - ASGN: An Active Semi-supervised Graph Neural Network for Molecular
Property Prediction [61.33144688400446]
We propose a novel framework called Active Semi-supervised Graph Neural Network (ASGN) by incorporating both labeled and unlabeled molecules.
In the teacher model, we propose a novel semi-supervised learning method to learn general representation that jointly exploits information from molecular structure and molecular distribution.
Finally, we propose a novel active learning strategy based on molecular diversity to select informative data throughout framework learning.
arXiv Detail & Related papers (2020-07-07T04:22:39Z) - Self-Supervised Graph Transformer on Large-Scale Molecular Data [73.3448373618865]
We propose a novel framework, GROVER, for molecular representation learning.
GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data.
We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules -- the biggest GNN and the largest training dataset in molecular representation learning.
arXiv Detail & Related papers (2020-06-18T08:37:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.