Augmenting Molecular Graphs with Geometries via Machine Learning Interatomic Potentials
- URL: http://arxiv.org/abs/2507.00407v1
- Date: Tue, 01 Jul 2025 03:49:11 GMT
- Title: Augmenting Molecular Graphs with Geometries via Machine Learning Interatomic Potentials
- Authors: Cong Fu, Yuchao Lin, Zachary Krueger, Haiyang Yu, Maho Nakata, Jianwen Xie, Emine Kucukbenli, Xiaofeng Qian, Shuiwang Ji
- Abstract summary: Accurate molecular property predictions require 3D geometries, which are typically obtained using expensive methods such as density functional theory (DFT). Here, we attempt to obtain molecular geometries by relying solely on machine learning interatomic potential (MLIP) models. MLIP foundation models are trained with supervised learning to predict energy and forces given 3D molecular structures.
- Score: 63.945006006152035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate molecular property predictions require 3D geometries, which are typically obtained using expensive methods such as density functional theory (DFT). Here, we attempt to obtain molecular geometries by relying solely on machine learning interatomic potential (MLIP) models. To this end, we first curate a large-scale molecular relaxation dataset comprising 3.5 million molecules and 300 million snapshots. Then MLIP foundation models are trained with supervised learning to predict energy and forces given 3D molecular structures. Once trained, we show that the foundation models can be used in different ways to obtain geometries either explicitly or implicitly. First, they can be used to obtain low-energy 3D geometries via geometry optimization, providing relaxed 3D geometries for downstream molecular property predictions. To mitigate potential biases and enhance downstream predictions, we introduce geometry fine-tuning based on the relaxed 3D geometries. Second, the foundation models can be directly fine-tuned for property prediction when ground truth 3D geometries are available. Our results demonstrate that MLIP foundation models trained on relaxation data can provide valuable molecular geometries that benefit property predictions.
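The geometry optimization step mentioned in the abstract can be illustrated with a minimal sketch: a model returns energy and forces for a set of atomic coordinates, and the coordinates are moved along the forces until the forces are close to zero. Everything below is an assumption made for illustration. The toy harmonic pair potential stands in for a trained MLIP, plain steepest descent stands in for whichever optimizer the authors use, and the names `energy_and_forces` and `relax` are hypothetical, not taken from the paper.

```python
import math


def energy_and_forces(coords, r0=1.0, k=1.0):
    """Toy stand-in for an MLIP (assumption): harmonic springs of rest
    length r0 between every pair of atoms. Returns (energy, forces)."""
    n = len(coords)
    energy = 0.0
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [coords[i][a] - coords[j][a] for a in range(3)]
            r = math.sqrt(sum(x * x for x in d))
            energy += 0.5 * k * (r - r0) ** 2
            # Force on atom i is the negative energy gradient:
            # F_i = -k * (r - r0) * d / r, and F_j = -F_i.
            scale = -k * (r - r0) / r
            for a in range(3):
                forces[i][a] += scale * d[a]
                forces[j][a] -= scale * d[a]
    return energy, forces


def relax(coords, step=0.1, fmax=1e-4, max_steps=10000):
    """Steepest-descent relaxation: move atoms along the forces until the
    largest per-atom force norm drops below fmax."""
    coords = [list(c) for c in coords]
    energy = float("inf")
    for _ in range(max_steps):
        energy, forces = energy_and_forces(coords)
        largest = max(math.sqrt(sum(f * f for f in fv)) for fv in forces)
        if largest < fmax:
            break
        for c, fv in zip(coords, forces):
            for a in range(3):
                c[a] += step * fv[a]
    return coords, energy


if __name__ == "__main__":
    # A distorted three-atom starting geometry (hypothetical values).
    start = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.3, 0.2)]
    relaxed, final_energy = relax(start)
    print("final energy:", final_energy)
    for i in range(3):
        for j in range(i + 1, 3):
            print(i, j, math.dist(relaxed[i], relaxed[j]))
```

In practice the energy and force call would be replaced by a trained model, and a quasi-Newton optimizer would usually converge in far fewer steps than steepest descent; the loop structure and the force-based stopping criterion stay the same.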
Related papers
- A Hitchhiker's Guide to Geometric GNNs for 3D Atomic Systems [87.30652640973317]
Recent advances in computational modelling of atomic systems represent them as geometric graphs with atoms embedded as nodes in 3D Euclidean space.
Geometric Graph Neural Networks have emerged as the preferred machine learning architecture powering applications ranging from protein structure prediction to molecular simulations and material generation.
This paper provides a comprehensive and self-contained overview of the field of Geometric GNNs for 3D atomic systems.
arXiv Detail & Related papers (2023-12-12T18:44:19Z)
- Automated 3D Pre-Training for Molecular Property Prediction [54.15788181794094]
We propose a novel 3D pre-training framework (dubbed 3D PGT).
It pre-trains a model on 3D molecular graphs, and then fine-tunes it on molecular graphs without 3D structures.
Extensive experiments on 2D molecular graphs are conducted to demonstrate the accuracy, efficiency and generalization ability of the proposed 3D PGT.
arXiv Detail & Related papers (2023-06-13T14:43:13Z)
- 3D Molecular Geometry Analysis with 2D Graphs [79.47097907673877]
Ground-state 3D geometries of molecules are essential for many molecular analysis tasks.
Modern quantum mechanical methods can compute accurate 3D geometries but are computationally prohibitive.
We propose a novel deep learning framework to predict 3D geometries from molecular graphs.
arXiv Detail & Related papers (2023-05-01T19:00:46Z)
- Geometry-Complete Diffusion for 3D Molecule Generation and Optimization [3.8366697175402225]
We introduce the Geometry-Complete Diffusion Model (GCDM) for 3D molecule generation.
GCDM outperforms existing 3D molecular diffusion models by significant margins across conditional and unconditional settings.
We also show that GCDM's geometric features can be repurposed to consistently optimize the geometry and chemical composition of existing 3D molecules.
arXiv Detail & Related papers (2023-02-08T20:01:51Z)
- Non-equilibrium molecular geometries in graph neural networks [2.6040244706888998]
Graph neural networks have become a powerful framework for learning complex structure-property relationships.
Recently proposed methods have demonstrated that using 3D geometry information of the molecule along with the bonding structure can lead to more accurate prediction on a wide range of properties.
arXiv Detail & Related papers (2022-03-07T20:20:52Z)
- Molecule3D: A Benchmark for Predicting 3D Geometries from Molecular Graphs [79.06686274377009]
We develop a benchmark, known as Molecule3D, that includes a dataset with precise ground-state geometries of approximately 4 million molecules.
We implement two baseline methods that either predict the pairwise distance between atoms or atom coordinates in 3D space.
Our method can achieve comparable prediction accuracy but with much smaller computational costs.
arXiv Detail & Related papers (2021-09-30T22:09:28Z)
- GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles [60.12186997181117]
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.
Existing generative models have several drawbacks, including a failure to model important elements of molecular geometry.
We propose GeoMol, an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate 3D conformers.
arXiv Detail & Related papers (2021-06-08T14:17:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.