CTAGE: Curvature-Based Topology-Aware Graph Embedding for Learning
Molecular Representations
- URL: http://arxiv.org/abs/2307.13275v2
- Date: Thu, 18 Jan 2024 15:14:42 GMT
- Title: CTAGE: Curvature-Based Topology-Aware Graph Embedding for Learning
Molecular Representations
- Authors: Yili Chen, Zhengyu Li, Zheng Wan, Hui Yu, Xian Wei
- Abstract summary: We propose an embedding approach CTAGE, utilizing $k$-hop discrete Ricci curvature to extract structural insights from molecular graph data.
Results indicate that introducing node curvature significantly improves the performance of current graph neural network frameworks.
- Score: 11.12640831521393
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI-driven drug design relies significantly on predicting molecular
properties, which is a complex task. In current approaches, the most commonly
used feature representations for training deep neural network models are based
on SMILES and molecular graphs. While these methods are concise and efficient,
they have limitations in capturing complex spatial information. Recently,
researchers have recognized the importance of incorporating three-dimensional
information of molecular structures into models. However, capturing spatial
information requires the introduction of additional units in the generator,
bringing additional design and computational costs. Therefore, it is necessary
to develop a method for predicting molecular properties that effectively
combines spatial structural information while maintaining the simplicity and
efficiency of graph neural networks. In this work, we propose an embedding
approach CTAGE, utilizing $k$-hop discrete Ricci curvature to extract
structural insights from molecular graph data. This effectively integrates
spatial structural information while preserving the training complexity of the
network. Experimental results indicate that introducing node curvature
significantly improves the performance of current graph neural network
frameworks, validating that the information from $k$-hop node curvature
effectively reflects the relationship between molecular structure and function.
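
To make the idea of a $k$-hop node curvature feature concrete, here is a minimal sketch. It is not the authors' method: the abstract does not say which discrete Ricci curvature CTAGE uses or how the $k$-hop aggregation is defined. As assumptions, the sketch uses the simplest unweighted Forman-Ricci edge curvature, 4 - deg(u) - deg(v), defines a node's curvature as the mean over its incident edges, and averages that value over all nodes within k hops. All function names are hypothetical.

```python
from collections import deque


def forman_edge_curvature(adj, u, v):
    """Unweighted Forman-Ricci curvature of edge (u, v), ignoring triangles.

    Assumption: a stand-in curvature, not necessarily the one used by CTAGE.
    """
    return 4 - len(adj[u]) - len(adj[v])


def node_curvature(adj):
    """Mean curvature of the edges incident to each node (0.0 if isolated)."""
    curv = {}
    for u, nbrs in adj.items():
        if nbrs:
            total = sum(forman_edge_curvature(adj, u, v) for v in nbrs)
            curv[u] = total / len(nbrs)
        else:
            curv[u] = 0.0
    return curv


def k_hop_neighbourhood(adj, source, k):
    """Nodes within k hops of source (source included), found by BFS."""
    depth = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if depth[u] == k:
            continue  # do not expand beyond k hops
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return set(depth)


def k_hop_curvature(adj, k):
    """Average node curvature over each node's k-hop neighbourhood.

    Assumption: mean aggregation is one plausible choice among many.
    """
    base = node_curvature(adj)
    features = {}
    for u in adj:
        hood = k_hop_neighbourhood(adj, u, k)
        features[u] = sum(base[w] for w in hood) / len(hood)
    return features


# Toy graph: a 6-membered ring, e.g. the carbon skeleton of benzene.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
ring_features = k_hop_curvature(ring, k=2)

# Toy graph: a 4-atom chain 0-1-2-3.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
chain_features = k_hop_curvature(chain, k=1)
```

The resulting per-node scalar could be concatenated with the usual atom features before they are passed to a GNN, which would leave the network architecture itself unchanged. In practice the adjacency dictionary would be built from a molecular graph rather than written by hand.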
Related papers
- Molecular Graph Representation Learning via Structural Similarity Information [11.38130169319915]
We introduce the Molecular Structural Similarity Motif GNN (MSSM-GNN), a novel molecular graph representation learning method.
In particular, we propose a specially designed graph that leverages graph kernel algorithms to represent the similarity between molecules quantitatively.
We employ GNNs to learn feature representations from molecular graphs, aiming to enhance the accuracy of property prediction by incorporating additional molecular representation information.
arXiv Detail & Related papers (2024-09-13T06:59:10Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- MolCPT: Molecule Continuous Prompt Tuning to Generalize Molecular Representation Learning [77.31492888819935]
We propose a novel paradigm of "pre-train, prompt, fine-tune" for molecular representation learning, named molecule continuous prompt tuning (MolCPT).
MolCPT defines a motif prompting function that uses the pre-trained model to project the standalone input into an expressive prompt.
Experiments on several benchmark datasets show that MolCPT efficiently generalizes pre-trained GNNs for molecular property prediction.
arXiv Detail & Related papers (2022-12-20T19:32:30Z)
- Extreme Acceleration of Graph Neural Network-based Prediction Models for Quantum Chemistry [7.592530794455257]
We present a novel hardware-software co-design approach to scale up the training of graph neural networks for molecular property prediction.
We introduce an algorithm to coalesce the batches of molecular graphs into fixed size packs to eliminate redundant computation and memory.
We demonstrate that such a co-design approach can reduce the training time of such molecular property prediction models from days to less than two hours.
arXiv Detail & Related papers (2022-11-25T01:30:18Z)
- HiGNN: Hierarchical Informative Graph Neural Networks for Molecular Property Prediction Equipped with Feature-Wise Attention [5.735627221409312]
We propose a well-designed hierarchical informative graph neural networks framework (termed HiGNN) for predicting molecular property.
Experiments demonstrate that HiGNN achieves state-of-the-art predictive performance on many challenging drug discovery-associated benchmark datasets.
arXiv Detail & Related papers (2022-08-30T05:16:15Z)
- Graph neural networks for the prediction of molecular structure-property relationships [59.11160990637615]
Graph neural networks (GNNs) are a novel machine learning method that works directly on the molecular graph.
GNNs allow properties to be learned in an end-to-end fashion, thereby avoiding the need for informative descriptors.
We describe the fundamentals of GNNs and demonstrate the application of GNNs via two examples for molecular property prediction.
arXiv Detail & Related papers (2022-07-25T11:30:44Z)
- Simple and Efficient Heterogeneous Graph Neural Network [55.56564522532328]
Heterogeneous graph neural networks (HGNNs) have powerful capability to embed rich structural and semantic information of a heterogeneous graph into node representations.
Existing HGNNs inherit many mechanisms from graph neural networks (GNNs) over homogeneous graphs, especially the attention mechanism and the multi-layer structure.
This paper conducts an in-depth and detailed study of these mechanisms and proposes the Simple and Efficient Heterogeneous Graph Neural Network (SeHGNN).
arXiv Detail & Related papers (2022-07-06T10:01:46Z)
- Equivariant Graph Attention Networks for Molecular Property Prediction [0.34376560669160383]
Learning about 3D molecular structures with varying size is an emerging challenge in machine learning and especially in drug discovery.
We propose an equivariant Graph Neural Network (GNN) that operates with Cartesian coordinates to incorporate directionality.
We demonstrate the efficacy of our architecture on predicting quantum mechanical properties of small molecules and its benefit on problems that concern macromolecular structures such as protein complexes.
arXiv Detail & Related papers (2022-02-20T19:07:29Z)
- Molecular Graph Generation via Geometric Scattering [7.796917261490019]
Graph neural networks (GNNs) have been used extensively for addressing problems in drug design and discovery.
We propose a representation-first approach to molecular graph generation.
We show that our architecture learns meaningful representations of drug datasets and provides a platform for goal-directed drug synthesis.
arXiv Detail & Related papers (2021-10-12T18:00:23Z)
- Distance-aware Molecule Graph Attention Network for Drug-Target Binding Affinity Prediction [54.93890176891602]
We propose a diStance-aware Molecule graph Attention Network (S-MAN) tailored to drug-target binding affinity prediction.
As a dedicated solution, we first propose a position encoding mechanism to integrate the topological structure and spatial position information into the constructed pocket-ligand graph.
We also propose a novel edge-node hierarchical attentive aggregation structure which has edge-level aggregation and node-level aggregation.
arXiv Detail & Related papers (2020-12-17T17:44:01Z)
- Self-Supervised Graph Transformer on Large-Scale Molecular Data [73.3448373618865]
We propose a novel framework, GROVER, for molecular representation learning.
GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data.
We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules -- the biggest GNN and the largest training dataset in molecular representation learning.
arXiv Detail & Related papers (2020-06-18T08:37:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.