Molecular Geometry Pretraining with SE(3)-Invariant Denoising Distance
Matching
- URL: http://arxiv.org/abs/2206.13602v1
- Date: Mon, 27 Jun 2022 19:30:53 GMT
- Title: Molecular Geometry Pretraining with SE(3)-Invariant Denoising Distance
Matching
- Authors: Shengchao Liu, Hongyu Guo, Jian Tang
- Abstract summary: The power of pretraining on 3D geometric structures has been less explored.
We propose a 3D coordinate denoising pretraining framework to model such an energy landscape.
Our experiments confirm the effectiveness and robustness of our proposed method.
- Score: 36.92992265307818
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pretraining molecular representations is critical in a variety of
applications in drug and material discovery due to the limited number of
labeled molecules, yet most existing work focuses on pretraining on 2D
molecular graphs. The power of pretraining on 3D geometric structures, however,
has been less explored, owing to the difficulty of finding a sufficient proxy
task to empower the pretraining to effectively extract essential features from
the geometric structures. Motivated by the dynamic nature of 3D molecules,
where the continuous motion of a molecule in the 3D Euclidean space forms a
smooth potential energy surface, we propose a 3D coordinate denoising
pretraining framework to model such an energy landscape. Leveraging an
SE(3)-invariant score matching method, we propose SE(3)-DDM where the
coordinate denoising proxy task is effectively boiled down to the denoising of
the pairwise atomic distances in a molecule. Our comprehensive experiments
confirm the effectiveness and robustness of our proposed method.
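
The abstract's central observation, that coordinate denoising can be reduced to denoising pairwise atomic distances, rests on distances being unchanged by rotations and translations. The following is a minimal NumPy sketch of that property, not the SE(3)-DDM implementation: the 5-atom random conformer, the noise scale `sigma`, the helper names, and the `distance_target` quantity are all assumptions made for illustration.

```python
import numpy as np

def pairwise_distances(coords):
    """Return the (n, n) matrix of Euclidean distances between atoms."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def random_rigid_motion(rng):
    """Sample a proper rotation (det = +1) and a translation vector."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # normalize column signs of the QR factor
    if np.linalg.det(q) < 0:         # flip one axis to avoid a reflection
        q[:, 0] = -q[:, 0]
    return q, rng.normal(size=3)

rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 3))     # hypothetical 5-atom conformer

# Apply a random SE(3) transformation to every atom.
rotation, shift = random_rigid_motion(rng)
moved = coords @ rotation.T + shift

# Pairwise distances are identical before and after the rigid motion.
d_clean = pairwise_distances(coords)
d_moved = pairwise_distances(moved)
invariant = np.allclose(d_clean, d_moved)

# Perturb coordinates with Gaussian noise (sigma is an assumed value) and
# express the perturbation in distance space. A denoising model could
# regress a quantity like this; the actual SE(3)-DDM objective may differ.
sigma = 0.1
noisy = coords + sigma * rng.normal(size=coords.shape)
d_noisy = pairwise_distances(noisy)
distance_target = d_clean - d_noisy

print("distances invariant under SE(3):", invariant)
```

Because the target lives in distance space, it is the same for every rotated or translated copy of the noisy conformer, which is the kind of invariance the abstract refers to.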
Related papers
- 3D Denoisers are Good 2D Teachers: Molecular Pretraining via Denoising
and Cross-Modal Distillation [65.35632020653291]
We propose D&D, a self-supervised molecular representation learning framework that pretrains a 2D graph encoder by distilling representations from a 3D denoiser.
We show that D&D can infer 3D information based on the 2D graph and achieves superior performance and label-efficiency against other baselines.
arXiv Detail & Related papers (2023-09-08T01:36:58Z)
- Automated 3D Pre-Training for Molecular Property Prediction [54.15788181794094]
We propose a novel 3D pre-training framework (dubbed 3D PGT)
It pre-trains a model on 3D molecular graphs, and then fine-tunes it on molecular graphs without 3D structures.
Extensive experiments on 2D molecular graphs are conducted to demonstrate the accuracy, efficiency and generalization ability of the proposed 3D PGT.
arXiv Detail & Related papers (2023-06-13T14:43:13Z)
- A Group Symmetric Stochastic Differential Equation Model for Molecule
Multi-modal Pretraining [36.48602272037559]
Molecule pretraining has quickly become the go-to schema to boost the performance of AI-based drug discovery.
Here, we propose MoleculeSDE to generate the 3D reflection from 2D topologies, and vice versa, directly in the input space.
By comparing with 17 pretraining baselines, we empirically verify that MoleculeSDE can learn an expressive representation with state-of-the-art performance on 26 out of 32 downstream tasks.
arXiv Detail & Related papers (2023-05-28T15:56:02Z)
- 3D Equivariant Diffusion for Target-Aware Molecule Generation and
Affinity Prediction [9.67574543046801]
The inclusion of 3D structures during targeted drug design shows superior performance to other target-free models.
We develop a 3D equivariant diffusion model to solve the above challenges.
Our model could generate molecules with more realistic 3D structures and better affinities towards the protein targets, and improve binding affinity ranking and prediction without retraining.
arXiv Detail & Related papers (2023-03-06T23:01:43Z)
- Geometry-Complete Diffusion for 3D Molecule Generation and Optimization [3.8366697175402225]
We introduce the Geometry-Complete Diffusion Model (GCDM) for 3D molecule generation.
GCDM outperforms existing 3D molecular diffusion models by significant margins across conditional and unconditional settings.
We also show that GCDM's geometric features can be repurposed to consistently optimize the geometry and chemical composition of existing 3D molecules.
arXiv Detail & Related papers (2023-02-08T20:01:51Z)
- 3D Equivariant Molecular Graph Pretraining [42.957880677779556]
We tackle 3D molecular pretraining in a complete and novel sense.
We first propose to adopt an equivariant energy-based model as the backbone for pretraining, which enjoys the merit of fulfilling the symmetry of 3D space.
We evaluate our model pretrained from a large-scale 3D dataset GEOM-QM9 on two challenging 3D benchmarks: MD17 and QM9.
arXiv Detail & Related papers (2022-07-18T16:26:24Z)
- Pre-training via Denoising for Molecular Property Prediction [53.409242538744444]
We describe a pre-training technique that utilizes large datasets of 3D molecular structures at equilibrium.
Inspired by recent advances in noise regularization, our pre-training objective is based on denoising.
arXiv Detail & Related papers (2022-05-31T22:28:34Z)
- GeoMol: Torsional Geometric Generation of Molecular 3D Conformer
Ensembles [60.12186997181117]
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.
Existing generative models have several drawbacks including lack of modeling important molecular geometry elements.
We propose GeoMol, an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate 3D conformers.
arXiv Detail & Related papers (2021-06-08T14:17:59Z)
- An End-to-End Framework for Molecular Conformation Generation via
Bilevel Programming [71.82571553927619]
We propose an end-to-end solution for molecular conformation prediction called ConfVAE.
Specifically, the molecular graph is first encoded in a latent space, and then the 3D structures are generated by solving a principled bilevel optimization program.
arXiv Detail & Related papers (2021-05-15T15:22:29Z)
- Investigating 3D Atomic Environments for Enhanced QSAR [0.0]
Predicting bioactivity and physical properties of molecules is a longstanding challenge in drug design.
Most approaches use molecular descriptors based on a 2D representation of molecules as a graph of atoms and bonds, abstracting away the molecular shape.
We describe a novel alignment-free 3D QSAR method using Smooth Overlap of Atomic Positions (SOAP), a well-established formalism developed for interpolating potential energy surfaces.
arXiv Detail & Related papers (2020-10-24T10:04:48Z)