Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks
- URL: http://arxiv.org/abs/2402.01975v3
- Date: Mon, 19 Aug 2024 21:42:53 GMT
- Title: Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks
- Authors: Duy M. H. Nguyen, Nina Lukashina, Tai Nguyen, An T. Le, TrungTin Nguyen, Nhat Ho, Jan Peters, Daniel Sonntag, Viktor Zaverkin, Mathias Niepert
- Abstract summary: A molecule's 2D representation consists of its atoms, their attributes, and the molecule's covalent bonds.
A 3D representation of a molecule is called a conformer and consists of its atom types and Cartesian coordinates.
- Score: 43.80038907470173
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A molecule's 2D representation consists of its atoms, their attributes, and the molecule's covalent bonds. A 3D (geometric) representation of a molecule is called a conformer and consists of its atom types and Cartesian coordinates. Every conformer has a potential energy, and the lower this energy, the more likely the conformer is to occur in nature. Most existing machine learning methods for molecular property prediction consider either 2D molecular graphs or 3D conformer structure representations in isolation. Inspired by recent work on using ensembles of conformers in conjunction with 2D graph representations, we propose $\mathrm{E}(3)$-invariant molecular conformer aggregation networks. The method integrates a molecule's 2D representation with those of several of its conformers. Contrary to prior work, we propose a novel 2D-3D aggregation mechanism based on a differentiable solver for the Fused Gromov-Wasserstein Barycenter problem and the use of an efficient conformer generation method based on distance geometry. We show that the proposed aggregation mechanism is $\mathrm{E}(3)$-invariant and propose an efficient GPU implementation. Moreover, we demonstrate that the aggregation mechanism helps to significantly outperform state-of-the-art molecular property prediction methods on established datasets.
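The E(3) invariance mentioned in the abstract can be illustrated with a small sketch. This is not the paper's Fused Gromov-Wasserstein barycenter solver. It only checks the underlying property that the pairwise distance matrix of a conformer is unchanged by rotations, reflections, and translations. The coordinates and the helper names `pairwise_distances` and `transform` are made up for illustration.

```python
import math

def pairwise_distances(coords):
    """Return the matrix of Euclidean distances between all atom pairs."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def transform(coords, angle, shift, reflect=False):
    """Apply an E(3) map: optionally mirror x, rotate about the z-axis, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y, z in coords:
        if reflect:
            x = -x  # reflection through the yz-plane
        xr, yr = c * x - s * y, s * x + c * y  # rotation about z
        out.append((xr + shift[0], yr + shift[1], z + shift[2]))
    return out

# Hypothetical 4-atom conformer; the coordinates are illustrative only.
conformer = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (1.6, 1.2, 0.3), (-0.5, 0.9, -0.7)]
moved = transform(conformer, angle=0.8, shift=(2.0, -1.0, 0.5), reflect=True)

d_before = pairwise_distances(conformer)
d_after = pairwise_distances(moved)

# Largest entry-wise change in the distance matrix; zero up to rounding error.
max_diff = max(
    abs(p - q) for row_p, row_q in zip(d_before, d_after) for p, q in zip(row_p, row_q)
)
invariant = max_diff < 1e-9
print("distance matrix invariant under the E(3) map:", invariant)
```

Any aggregation that depends on a conformer only through such distance matrices inherits this invariance, which is one plausible reading of why a Gromov-Wasserstein-style objective suits the task.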
Related papers
- Geometry Informed Tokenization of Molecules for Language Model Generation [85.80491667588923]
We consider molecule generation in 3D space using language models (LMs).
Although tokenization of molecular graphs exists, that for 3D geometries is largely unexplored.
We propose Geo2Seq, which converts molecular geometries into $SE(3)$-invariant 1D discrete sequences.
arXiv Detail & Related papers (2024-08-19T16:09:59Z)
- Pre-training of Molecular GNNs via Conditional Boltzmann Generator [0.0]
We propose a pre-training method for molecular GNNs using an existing dataset of molecular conformations.
We show that our model has a better prediction performance for molecular properties than existing pre-training methods.
arXiv Detail & Related papers (2023-12-20T15:30:15Z)
- A Group Symmetric Stochastic Differential Equation Model for Molecule Multi-modal Pretraining [40.02834704975147]
Molecule pretraining has quickly become the go-to schema to boost the performance of AI-based drug discovery.
Here, we propose MoleculeSDE to generate the 3D reflection from 2D topologies, and vice versa, directly in the input space.
By comparing with 17 pretraining baselines, we empirically verify that MoleculeSDE can learn an expressive representation with state-of-the-art performance on 26 out of 32 downstream tasks.
arXiv Detail & Related papers (2023-05-28T15:56:02Z)
- MUDiff: Unified Diffusion for Complete Molecule Generation [104.7021929437504]
We present a new model for generating a comprehensive representation of molecules, including atom features, 2D discrete molecule structures, and 3D continuous molecule coordinates.
We propose a novel graph transformer architecture to denoise the diffusion process.
Our model is a promising approach for designing stable and diverse molecules and can be applied to a wide range of tasks in molecular modeling.
arXiv Detail & Related papers (2023-04-28T04:25:57Z)
- DiffBP: Generative Diffusion of 3D Molecules for Target Protein Binding [51.970607704953096]
Previous works usually generate atoms in an auto-regressive way, where element types and 3D coordinates of atoms are generated one by one.
In real-world molecular systems, the interactions among atoms in an entire molecule are global, leading to an energy function that is pair-coupled among atoms.
In this work, a generative diffusion model for molecular 3D structures based on target proteins is established, at a full-atom level in a non-autoregressive way.
arXiv Detail & Related papers (2022-11-21T07:02:15Z)
- MolNet: A Chemically Intuitive Graph Neural Network for Prediction of Molecular Properties [1.231476564107544]
Graph neural networks (GNNs) have been a powerful deep-learning tool in the chemistry domain.
The MolNet model is chemically intuitive, accommodating the 3D non-bond information in a molecule.
MolNet gives state-of-the-art performance on the classification task of the BACE dataset and the regression task of the ESOL dataset.
arXiv Detail & Related papers (2022-02-01T20:47:28Z)
- Molecule3D: A Benchmark for Predicting 3D Geometries from Molecular Graphs [79.06686274377009]
We develop a benchmark, known as Molecule3D, that includes a dataset with precise ground-state geometries of approximately 4 million molecules.
We implement two baseline methods that either predict the pairwise distance between atoms or atom coordinates in 3D space.
Our method can achieve comparable prediction accuracy but with much smaller computational costs.
arXiv Detail & Related papers (2021-09-30T22:09:28Z)
- GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles [60.12186997181117]
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.
Existing generative models have several drawbacks including lack of modeling important molecular geometry elements.
We propose GeoMol, an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate 3D conformers.
arXiv Detail & Related papers (2021-06-08T14:17:59Z)
- Learning a Continuous Representation of 3D Molecular Structures with Deep Generative Models [0.0]
Generative models are an entirely different approach that learns to represent and optimize molecules in a continuous latent space.
We describe deep generative models of three-dimensional molecular structures using atomic density grids.
We are also able to sample diverse sets of molecules based on a given input compound to increase the probability of creating valid, drug-like molecules.
arXiv Detail & Related papers (2020-10-17T01:15:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.