3D Infomax improves GNNs for Molecular Property Prediction
- URL: http://arxiv.org/abs/2110.04126v1
- Date: Fri, 8 Oct 2021 13:30:49 GMT
- Title: 3D Infomax improves GNNs for Molecular Property Prediction
- Authors: Hannes Stärk, Dominique Beaini, Gabriele Corso, Prudencio Tossou,
Christian Dallago, Stephan Günnemann, Pietro Liò
- Abstract summary: We propose pre-training a model to reason about the geometry of molecules given only their 2D molecular graphs.
We show that 3D pre-training provides significant improvements for a wide range of properties.
- Score: 1.9703625025720701
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Molecular property prediction is one of the fastest-growing applications of
deep learning with critical real-world impacts. Including 3D molecular
structure as input to learned models improves their performance for many molecular
tasks. However, this information is infeasible to compute at the scale required
by several real-world applications. We propose pre-training a model to reason
about the geometry of molecules given only their 2D molecular graphs. Using
methods from self-supervised learning, we maximize the mutual information
between 3D summary vectors and the representations of a Graph Neural Network
(GNN) such that they contain latent 3D information. During fine-tuning on
molecules with unknown geometry, the GNN still generates implicit 3D
information and can use it to improve downstream tasks. We show that 3D
pre-training provides significant improvements for a wide range of properties,
such as a 22% average MAE reduction on eight quantum mechanical properties.
Moreover, the learned representations can be effectively transferred between
datasets in different molecular spaces.
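The abstract says the method maximizes mutual information between 3D summary vectors and GNN representations, but it does not name the estimator. As a minimal sketch only, the block below assumes an InfoNCE-style contrastive objective, a commonly used lower bound on mutual information. The names `cosine`, `info_nce`, `z_2d`, `z_3d`, and `temperature` are hypothetical and not taken from the paper. Each molecule's 2D-graph embedding is pulled toward its own 3D embedding and pushed away from the 3D embeddings of the other molecules in the batch.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # epsilon guards against zero vectors


def info_nce(z_2d, z_3d, temperature=0.1):
    """Mean InfoNCE loss over a batch of paired embeddings.

    Assumption: z_2d[i] (from the 2D-graph GNN) and z_3d[i] (a 3D summary
    vector) describe the same molecule and form the positive pair; every
    z_3d[j] with j != i serves as a negative for molecule i.
    """
    n = len(z_2d)
    total = 0.0
    for i in range(n):
        logits = [cosine(z_2d[i], z_3d[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true partner.
        total += log_denominator - logits[i]
    return total / n


# Toy batch of three molecules with hand-made 3-dimensional embeddings.
z_2d = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned = info_nce(z_2d, z_2d)                 # each 2D vector matches its 3D partner
shifted = info_nce(z_2d, z_2d[1:] + z_2d[:1])  # partners deliberately mismatched
print(aligned < shifted)  # prints True
```

Minimizing this loss rewards a GNN whose output identifies the matching 3D summary vector, which is one way representations computed from 2D graphs alone could come to carry latent 3D information. After pre-training, only the 2D encoder would be needed for fine-tuning.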
Related papers
- Geometry-aware Line Graph Transformer Pre-training for Molecular Property Prediction [4.598522704308923]
Geometry-aware line graph transformer (Galformer) pre-training is a novel self-supervised learning framework.
Galformer consistently outperforms all baselines on both classification and regression tasks.
arXiv Detail & Related papers (2023-09-01T14:20:48Z)
- Automated 3D Pre-Training for Molecular Property Prediction [54.15788181794094]
We propose a novel 3D pre-training framework (dubbed 3D PGT).
It pre-trains a model on 3D molecular graphs, and then fine-tunes it on molecular graphs without 3D structures.
Extensive experiments on 2D molecular graphs are conducted to demonstrate the accuracy, efficiency and generalization ability of the proposed 3D PGT.
arXiv Detail & Related papers (2023-06-13T14:43:13Z)
- 3D Molecular Geometry Analysis with 2D Graphs [79.47097907673877]
Ground-state 3D geometries of molecules are essential for many molecular analysis tasks.
Modern quantum mechanical methods can compute accurate 3D geometries but are computationally prohibitive.
We propose a novel deep learning framework to predict 3D geometries from molecular graphs.
arXiv Detail & Related papers (2023-05-01T19:00:46Z)
- Geometry-Complete Diffusion for 3D Molecule Generation and Optimization [3.8366697175402225]
We introduce the Geometry-Complete Diffusion Model (GCDM) for 3D molecule generation.
GCDM outperforms existing 3D molecular diffusion models by significant margins across conditional and unconditional settings.
We also show that GCDM's geometric features can be repurposed to consistently optimize the geometry and chemical composition of existing 3D molecules.
arXiv Detail & Related papers (2023-02-08T20:01:51Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Graph Neural Networks for Molecules [9.04563945965023]
This review introduces GNNs and their various applications for small organic molecules.
GNNs rely on message-passing operations, a generic yet powerful framework, to update node features iteratively.
arXiv Detail & Related papers (2022-09-12T20:10:07Z)
- Molecule3D: A Benchmark for Predicting 3D Geometries from Molecular Graphs [79.06686274377009]
We develop a benchmark, known as Molecule3D, that includes a dataset with precise ground-state geometries of approximately 4 million molecules.
We implement two baseline methods that either predict the pairwise distance between atoms or atom coordinates in 3D space.
Our method can achieve comparable prediction accuracy but with much smaller computational costs.
arXiv Detail & Related papers (2021-09-30T22:09:28Z)
- GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles [60.12186997181117]
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.
Existing generative models have several drawbacks including lack of modeling important molecular geometry elements.
We propose GeoMol, an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate 3D conformers.
arXiv Detail & Related papers (2021-06-08T14:17:59Z)
- Molecular machine learning with conformer ensembles [0.0]
We introduce multiple deep learning models that expand upon key architectures such as ChemProp and Schnet.
We then benchmark the performance trade-offs of these models on 2D, 3D and 4D representations in the prediction of drug activity.
The new architectures perform significantly better than 2D models, but their performance is often just as strong with a single conformer as with many.
arXiv Detail & Related papers (2020-12-15T17:44:48Z)
- ATOM3D: Tasks On Molecules in Three Dimensions [91.72138447636769]
Deep neural networks have recently gained significant attention.
In this work we present ATOM3D, a collection of both novel and existing datasets spanning several key classes of biomolecules.
We develop three-dimensional molecular learning networks for each of these tasks, finding that they consistently improve performance.
arXiv Detail & Related papers (2020-12-07T20:18:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.