Geometry-aware Line Graph Transformer Pre-training for Molecular
Property Prediction
- URL: http://arxiv.org/abs/2309.00483v1
- Date: Fri, 1 Sep 2023 14:20:48 GMT
- Authors: Peizhen Bai, Xianyuan Liu, Haiping Lu
- Abstract summary: Geometry-aware line graph transformer (Galformer) pre-training is a novel self-supervised learning framework.
Galformer consistently outperforms all baselines on both classification and regression tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Molecular property prediction with deep learning has gained much attention
over the past years. Owing to the scarcity of labeled molecules, there has been
growing interest in self-supervised learning methods that learn generalizable
molecular representations from unlabeled data. Molecules are typically treated
as 2D topological graphs in modeling, but it has been discovered that their 3D
geometry is of great importance in determining molecular functionalities. In
this paper, we propose the Geometry-aware line graph transformer (Galformer)
pre-training, a novel self-supervised learning framework that aims to enhance
molecular representation learning with 2D and 3D modalities. Specifically, we
first design a dual-modality line graph transformer backbone to encode the
topological and geometric information of a molecule. The designed backbone
incorporates effective structural encodings to capture graph structures from
both modalities. Then we devise two complementary pre-training tasks at the
inter and intra-modality levels. These tasks provide properly supervised
information and extract discriminative 2D and 3D knowledge from unlabeled
molecules. Finally, we evaluate Galformer against six state-of-the-art
baselines on twelve property prediction benchmarks via downstream fine-tuning.
Experimental results show that Galformer consistently outperforms all baselines
on both classification and regression tasks, demonstrating its effectiveness.
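The "line graph" view the abstract builds on has a standard graph-theoretic meaning: every edge (bond) of the molecular graph becomes a node, and two line-graph nodes are connected whenever their underlying bonds share an atom. A minimal plain-Python sketch of that construction (a toy illustration, not the paper's implementation):

```python
from itertools import combinations

def line_graph(edges):
    """Build the line graph of an undirected graph.

    `edges` is a list of (u, v) pairs, e.g. bonds between atom indices.
    Returns adjacency as {edge: set of adjacent edges}: two edges are
    adjacent in the line graph when they share an endpoint (an atom).
    """
    adj = {e: set() for e in edges}
    for e1, e2 in combinations(edges, 2):
        if set(e1) & set(e2):  # the two bonds share an atom
            adj[e1].add(e2)
            adj[e2].add(e1)
    return adj

# Ethanol's heavy-atom skeleton C-C-O as a toy example:
# atoms 0 (C), 1 (C), 2 (O); bonds (0,1) and (1,2) share atom 1,
# so they become adjacent nodes in the line graph.
bonds = [(0, 1), (1, 2)]
lg = line_graph(bonds)
```

In a transformer backbone like the one described, each line-graph node (bond) would then carry 2D topological and 3D geometric features.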
Related papers
- Pre-trained Molecular Language Models with Random Functional Group Masking [54.900360309677794]
We propose a SMILES-based Molecular Language Model that randomly masks SMILES subsequences corresponding to specific molecular atoms.
This technique aims to compel the model to better infer molecular structures and properties, thus enhancing its predictive capabilities.
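The summary above does not specify how the masked spans are chosen; a generic character-level stand-in for subsequence masking can be sketched as follows (the function name and parameters are hypothetical, and real implementations would mask chemically meaningful token spans rather than raw characters):

```python
import random

def mask_smiles(smiles, span_len=2, n_spans=1, mask_token="[MASK]", seed=0):
    """Randomly replace contiguous SMILES subsequences with a mask token.

    A simplified stand-in for functional-group masking: pick a random
    start position and replace `span_len` characters with `mask_token`.
    """
    rng = random.Random(seed)
    chars = list(smiles)
    for _ in range(n_spans):
        start = rng.randrange(0, max(1, len(chars) - span_len))
        chars[start:start + span_len] = [mask_token]
    return "".join(chars)

masked = mask_smiles("CCO")  # ethanol -> "[MASK]O" with this seed
```

The model is then trained to reconstruct the masked span from the surrounding SMILES context, analogous to masked language modeling in NLP.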
arXiv Detail & Related papers (2024-11-03T01:56:15Z)
- Automated 3D Pre-Training for Molecular Property Prediction [54.15788181794094]
We propose a novel 3D pre-training framework (dubbed 3D PGT), which pre-trains a model on 3D molecular graphs and then fine-tunes it on molecular graphs without 3D structures.
Extensive experiments on 2D molecular graphs are conducted to demonstrate the accuracy, efficiency and generalization ability of the proposed 3D PGT.
arXiv Detail & Related papers (2023-06-13T14:43:13Z)
- Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules.
By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures.
When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, with an average ROC-AUC improvement of 13.8% on classification tasks and an average 35.1% improvement in RMSE/MAE on regression tasks.
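Contrastive objectives like GODE's (and GeomGCL's below) typically score matching views of the same molecule against in-batch negatives. A common choice for this is the InfoNCE loss; a minimal pure-Python sketch of that idea (the papers' exact losses may differ):

```python
import math

def info_nce(anchors, positives, temperature=0.1):
    """Minimal InfoNCE loss over paired embeddings.

    anchors[i] and positives[i] are matching views of molecule i
    (e.g. a 2D-graph embedding and its knowledge-graph or 3D
    counterpart); all other pairs in the batch act as negatives.
    """
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    loss = 0.0
    for i, a in enumerate(anchors):
        logits = [cos(a, p) / temperature for p in positives]
        log_denom = math.log(sum(math.exp(l) for l in logits))
        loss += log_denom - logits[i]  # -log softmax of the true pair
    return loss / len(anchors)

# Correctly paired views give a lower loss than mismatched ones.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing this loss pulls the two views of the same molecule together in embedding space while pushing apart views of different molecules.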
arXiv Detail & Related papers (2023-06-02T15:49:45Z)
- An Equivariant Generative Framework for Molecular Graph-Structure Co-Design [54.92529253182004]
We present MolCode, a machine learning-based generative framework for Molecular graph-structure Co-design.
In MolCode, 3D geometric information empowers the molecular 2D graph generation, which in turn helps guide the prediction of molecular 3D structure.
Our investigation reveals that the 2D topology and 3D geometry contain intrinsically complementary information in molecule design.
arXiv Detail & Related papers (2023-04-12T13:34:22Z)
- 3D Infomax improves GNNs for Molecular Property Prediction [1.9703625025720701]
We propose pre-training a model to reason about the geometry of molecules given only their 2D molecular graphs.
We show that 3D pre-training provides significant improvements for a wide range of properties.
arXiv Detail & Related papers (2021-10-08T13:30:49Z)
- GeomGCL: Geometric Graph Contrastive Learning for Molecular Property Prediction [47.70253904390288]
We propose a novel graph contrastive learning method utilizing the geometry of a molecule across 2D and 3D views.
Specifically, we first devise a dual-view geometric message passing network (GeomMPNN) to adaptively leverage the rich information of both 2D and 3D graphs of a molecule.
arXiv Detail & Related papers (2021-09-24T03:55:27Z)
- ChemRL-GEM: Geometry Enhanced Molecular Representation Learning for Property Prediction [25.49976851499949]
We propose a novel Geometry Enhanced Molecular representation learning method (GEM) for Chemical Representation Learning (ChemRL).
At first, we design a geometry-based GNN architecture that simultaneously models atoms, bonds, and bond angles in a molecule.
On top of the devised GNN architecture, we propose several novel geometry-level self-supervised learning strategies to learn spatial knowledge.
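Of the three geometric quantities GEM models, the bond angle is the one derived rather than given: the angle at a central atom between two bonds follows from the 3D coordinates via the dot product. A hedged sketch of that computation (plain Python, not GEM's actual code):

```python
import math

def bond_angle(center, a, b):
    """Angle in radians at `center` between bonds center-a and center-b,
    computed from 3D coordinates via the normalized dot product."""
    u = [ai - ci for ai, ci in zip(a, center)]
    v = [bi - ci for bi, ci in zip(b, center)]
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))

# Right angle: two bonds along the x- and y-axes from the origin.
theta = bond_angle([0, 0, 0], [1, 0, 0], [0, 1, 0])  # = math.pi / 2
```

Angles computed this way can serve as edge features, or as regression targets in geometry-level self-supervised tasks like those the entry describes.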
arXiv Detail & Related papers (2021-06-11T02:35:53Z)
- GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles [60.12186997181117]
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.
Existing generative models have several drawbacks, including a failure to model important elements of molecular geometry.
We propose GeoMol, an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate 3D conformers.
arXiv Detail & Related papers (2021-06-08T14:17:59Z)
- Molecular machine learning with conformer ensembles [0.0]
We introduce multiple deep learning models that expand upon key architectures such as ChemProp and Schnet.
We then benchmark the performance trade-offs of these models on 2D, 3D and 4D representations in the prediction of drug activity.
The new architectures perform significantly better than 2D models, but their performance is often just as strong with a single conformer as with many.
arXiv Detail & Related papers (2020-12-15T17:44:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.