Related papers: Graph-based Molecular Representation Learning

Graph-based Molecular Representation Learning

URL: http://arxiv.org/abs/2207.04869v3
Date: Wed, 29 Nov 2023 03:16:59 GMT
Title: Graph-based Molecular Representation Learning
Authors: Zhichun Guo, Kehan Guo, Bozhao Nan, Yijun Tian, Roshni G. Iyer, Yihong Ma, Olaf Wiest, Xiangliang Zhang, Wei Wang, Chuxu Zhang, Nitesh V. Chawla
Abstract summary: Molecular representation learning (MRL) is a key step to build the connection between machine learning and chemical science. Recently, MRL has achieved considerable progress, especially in methods based on deep molecular graph learning.
Score: 59.06193431883431
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Molecular representation learning (MRL) is a key step to build the connection between machine learning and chemical science. In particular, it encodes molecules as numerical vectors preserving the molecular structures and features, on top of which the downstream tasks (e.g., property prediction) can be performed. Recently, MRL has achieved considerable progress, especially in methods based on deep molecular graph learning. In this survey, we systematically review these graph-based molecular representation techniques, especially the methods incorporating chemical domain knowledge. Specifically, we first introduce the features of 2D and 3D molecular graphs. Then we summarize and categorize MRL methods into three groups based on their input. Furthermore, we discuss some typical chemical applications supported by MRL. To facilitate studies in this fast-developing area, we also list the benchmarks and commonly used datasets in the paper. Finally, we share our thoughts on future research directions.

Related papers

Data-Efficient Molecular Generation with Hierarchical Textual Inversion [48.816943690420224]
We introduce Hierarchical textual Inversion for Molecular generation (HI-Mol), a novel data-efficient molecular generation method. HI-Mol is inspired by the importance of hierarchical information, e.g., both coarse- and fine-grained features, in understanding the molecule distribution. Compared to the conventional textual inversion method in the image domain using a single-level token embedding, our multi-level token embeddings allow the model to effectively learn the underlying low-shot molecule distribution.
arXiv Detail & Related papers (2024-05-05T08:35:23Z)
Molecular Property Prediction Based on Graph Structure Learning [29.516479802217205]
We propose a graph structure learning (GSL) based MPP approach, called GSL-MPP. Specifically, we first apply graph neural network (GNN) over molecular graphs to extract molecular representations. With molecular fingerprints, we construct a molecular similarity graph (MSG)
arXiv Detail & Related papers (2023-12-28T06:45:13Z)
Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks [44.934084652800976]
We introduce the first MoleculAR Conformer Ensemble Learning benchmark to thoroughly evaluate the potential of learning on conformer ensembles. Our findings reveal that direct learning from an conformer space can improve performance on a variety of tasks and models.
arXiv Detail & Related papers (2023-09-29T20:06:46Z)
MolGrapher: Graph-based Visual Recognition of Chemical Structures [50.13749978547401]
We introduce MolGrapher to recognize chemical structures visually. We treat all candidate atoms and bonds as nodes and put them in a graph. We classify atom and bond nodes in the graph with a Graph Neural Network.
arXiv Detail & Related papers (2023-08-23T16:16:11Z)
Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules. By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures. When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, registering an average ROC-AUC uplift of 13.8% for classification tasks and an average RMSE/MAE enhancement of 35.1% for regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z)
Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction. Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations. On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
Improving Molecular Pretraining with Complementary Featurizations [20.86159731100242]
Molecular pretraining is a paradigm to solve a variety of tasks in computational chemistry and drug discovery. We show that different featurization techniques convey chemical information differently. We propose a simple and effective MOlecular pretraining framework with COmplementary featurizations (MOCO)
arXiv Detail & Related papers (2022-09-29T21:11:09Z)
Chemical-Reaction-Aware Molecule Representation Learning [88.79052749877334]
We propose using chemical reactions to assist learning molecule representation. Our approach is proven effective to 1) keep the embedding space well-organized and 2) improve the generalization ability of molecule embeddings. Experimental results demonstrate that our method achieves state-of-the-art performance in a variety of downstream tasks.
arXiv Detail & Related papers (2021-09-21T00:08:43Z)
ChemRL-GEM: Geometry Enhanced Molecular Representation Learning for Property Prediction [25.49976851499949]
We propose a novel Geometry Enhanced Molecular representation learning method (GEM) for Chemical Representation Learning (ChemRL) At first, we design a geometry-based GNN architecture that simultaneously models atoms, bonds, and bond angles in a molecule. On top of the devised GNN architecture, we propose several novel geometry-level self-supervised learning strategies to learn spatial knowledge.
arXiv Detail & Related papers (2021-06-11T02:35:53Z)
Advanced Graph and Sequence Neural Networks for Molecular Property Prediction and Drug Discovery [53.00288162642151]
We develop MoleculeKit, a suite of comprehensive machine learning tools spanning different computational models and molecular representations. Built on these representations, MoleculeKit includes both deep learning and traditional machine learning methods for graph and sequence data. Results on both online and offline antibiotics discovery and molecular property prediction tasks show that MoleculeKit achieves consistent improvements over prior methods.
arXiv Detail & Related papers (2020-12-02T02:09:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.