Molecule3D: A Benchmark for Predicting 3D Geometries from Molecular
Graphs
- URL: http://arxiv.org/abs/2110.01717v1
- Date: Thu, 30 Sep 2021 22:09:28 GMT
- Title: Molecule3D: A Benchmark for Predicting 3D Geometries from Molecular
Graphs
- Authors: Zhao Xu, Youzhi Luo, Xuan Zhang, Xinyi Xu, Yaochen Xie, Meng Liu,
Kaleb Dickerson, Cheng Deng, Maho Nakata, Shuiwang Ji
- Abstract summary: We develop a benchmark, known as Molecule3D, that includes a dataset with precise ground-state geometries of approximately 4 million molecules.
We implement two baseline methods that either predict the pairwise distance between atoms or atom coordinates in 3D space.
Compared with generating 3D geometries with RDKit, these baselines achieve comparable prediction accuracy at much smaller computational cost.
- Score: 79.06686274377009
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph neural networks are emerging as promising methods for modeling
molecular graphs, in which nodes and edges correspond to atoms and chemical
bonds, respectively. Recent studies show that when 3D molecular geometries,
such as bond lengths and angles, are available, molecular property prediction
tasks can be made more accurate. However, computing 3D molecular geometries
requires quantum calculations that are computationally prohibitive. For
example, accurate calculation of 3D geometries of a small molecule requires
hours of computing time using density functional theory (DFT). Here, we propose
to predict the ground-state 3D geometries from molecular graphs using machine
learning methods. To make this feasible, we develop a benchmark, known as
Molecule3D, that includes a dataset with precise ground-state geometries of
approximately 4 million molecules derived from DFT. We also provide a set of
software tools for data processing, splitting, training, and evaluation.
Specifically, we propose to assess the error and validity of predicted
geometries using four metrics. We implement two baseline methods that either
predict the pairwise distance between atoms or atom coordinates in 3D space.
Experimental results show that, compared with generating 3D geometries with
RDKit, our method can achieve comparable prediction accuracy but with much
smaller computational costs. Our Molecule3D is available as a module of the
MoleculeX software library (https://github.com/divelab/MoleculeX).
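The abstract mentions four metrics for error and validity but does not name them here, so the following is only a minimal sketch of one plausible error measure: the mean absolute error (MAE) and root-mean-square error (RMSE) between predicted and reference pairwise atom distances. The function names and the toy coordinates are hypothetical and are not taken from the MoleculeX library.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return Euclidean distances for all unordered atom pairs (i < j)."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_errors(pred_coords, true_coords):
    """Return (MAE, RMSE) between predicted and reference pairwise distances.

    Assumption: both geometries list the same atoms in the same order.
    """
    if len(pred_coords) != len(true_coords):
        raise ValueError("geometries must have the same number of atoms")
    pred = pairwise_distances(pred_coords)
    true = pairwise_distances(true_coords)
    if not pred:
        raise ValueError("need at least two atoms")
    diffs = [p - t for p, t in zip(pred, true)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


# Toy three-atom geometry (coordinates in angstroms, made up for illustration).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
# A hypothetical "prediction" that stretches one bond by 0.1 angstrom.
predicted = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.0, 0.0)]

mae, rmse = distance_errors(predicted, reference)
print(f"MAE = {mae:.4f}, RMSE = {rmse:.4f}")
```

Comparing distances rather than raw coordinates means the error does not change when a predicted geometry is rigidly translated or rotated, which is one reason a distance-based baseline and a coordinate-based baseline can be scored with the same measure.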
Related papers
- Geometry Informed Tokenization of Molecules for Language Model Generation [85.80491667588923]
We consider molecule generation in 3D space using language models (LMs).
Although tokenization of molecular graphs exists, that for 3D geometries is largely unexplored.
We propose the Geo2Seq, which converts molecular geometries into $SE(3)$-invariant 1D discrete sequences.
arXiv Detail & Related papers (2024-08-19T16:09:59Z)
- Automated 3D Pre-Training for Molecular Property Prediction [54.15788181794094]
We propose a novel 3D pre-training framework (dubbed 3D PGT).
It pre-trains a model on 3D molecular graphs, and then fine-tunes it on molecular graphs without 3D structures.
Extensive experiments on 2D molecular graphs are conducted to demonstrate the accuracy, efficiency and generalization ability of the proposed 3D PGT.
arXiv Detail & Related papers (2023-06-13T14:43:13Z)
- 3D Molecular Geometry Analysis with 2D Graphs [79.47097907673877]
Ground-state 3D geometries of molecules are essential for many molecular analysis tasks.
Modern quantum mechanical methods can compute accurate 3D geometries but are computationally prohibitive.
We propose a novel deep learning framework to predict 3D geometries from molecular graphs.
arXiv Detail & Related papers (2023-05-01T19:00:46Z)
- Geometry-Complete Diffusion for 3D Molecule Generation and Optimization [3.8366697175402225]
We introduce the Geometry-Complete Diffusion Model (GCDM) for 3D molecule generation.
GCDM outperforms existing 3D molecular diffusion models by significant margins across conditional and unconditional settings.
We also show that GCDM's geometric features can be repurposed to consistently optimize the geometry and chemical composition of existing 3D molecules.
arXiv Detail & Related papers (2023-02-08T20:01:51Z)
- Unified 2D and 3D Pre-Training of Molecular Representations [237.36667670201473]
We propose a new representation learning method based on a unified 2D and 3D pre-training.
Atom coordinates and interatomic distances are encoded and then fused with atomic representations through graph neural networks.
Our method achieves state-of-the-art results on 10 tasks, and the average improvement on 2D-only tasks is 8.3%.
arXiv Detail & Related papers (2022-07-14T11:36:56Z)
- 3D Infomax improves GNNs for Molecular Property Prediction [1.9703625025720701]
We propose pre-training a model to reason about the geometry of molecules given only their 2D molecular graphs.
We show that 3D pre-training provides significant improvements for a wide range of properties.
arXiv Detail & Related papers (2021-10-08T13:30:49Z)
- GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles [60.12186997181117]
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.
Existing generative models have several drawbacks including lack of modeling important molecular geometry elements.
We propose GeoMol, an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate 3D conformers.
arXiv Detail & Related papers (2021-06-08T14:17:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.