Molecule Joint Auto-Encoding: Trajectory Pretraining with 2D and 3D
Diffusion
- URL: http://arxiv.org/abs/2312.03475v1
- Date: Wed, 6 Dec 2023 12:58:37 GMT
- Title: Molecule Joint Auto-Encoding: Trajectory Pretraining with 2D and 3D
Diffusion
- Authors: Weitao Du, Jiujiu Chen, Xuecang Zhang, Zhiming Ma, Shengchao Liu
- Abstract summary: We propose a pretraining method for molecule joint auto-encoding (MoleculeJAE).
MoleculeJAE can learn both the 2D bond (topology) and 3D conformation (geometry) information.
Empirically, MoleculeJAE proves its effectiveness by reaching state-of-the-art performance on 15 out of 20 tasks.
- Score: 19.151643496588022
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, artificial intelligence for drug discovery has raised increasing
interest in both machine learning and chemistry domains. The fundamental
building block for drug discovery is molecule geometry and thus, the molecule's
geometrical representation is the main bottleneck to better utilize machine
learning techniques for drug discovery. In this work, we propose a pretraining
method for molecule joint auto-encoding (MoleculeJAE). MoleculeJAE can learn
both the 2D bond (topology) and 3D conformation (geometry) information, and a
diffusion process model is applied to mimic the augmented trajectories of such
two modalities, based on which, MoleculeJAE will learn the inherent chemical
structure in a self-supervised manner. Thus, the pretrained geometrical
representation in MoleculeJAE is expected to benefit downstream
geometry-related tasks. Empirically, MoleculeJAE proves its effectiveness by
reaching state-of-the-art performance on 15 out of 20 tasks by comparing it
with 12 competitive baselines.
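The abstract describes a diffusion process that produces augmented trajectories over both the 2D topology and the 3D conformation. The sketch below illustrates one way such a joint forward (noising) trajectory could look. It is not the authors' implementation: the function name `joint_forward_trajectory`, the variance-preserving update, the constant `beta`, the continuous relaxation of the bond matrix, and the toy three-atom molecule are all assumptions made for illustration.

```python
import math
import random


def joint_forward_trajectory(coords, adjacency, num_steps=5, beta=0.1, seed=0):
    """Return [(coords_t, adjacency_t)] for t = 0..num_steps.

    Assumed update (not taken from the paper):
        x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * eps,  eps ~ N(0, 1)
    applied to the 3D coordinates and to a continuous bond matrix.
    """
    rng = random.Random(seed)
    scale, sigma = math.sqrt(1.0 - beta), math.sqrt(beta)
    n = len(coords)
    # Step 0 is a copy of the clean molecule.
    traj = [([row[:] for row in coords], [row[:] for row in adjacency])]
    for _ in range(num_steps):
        prev_x, prev_a = traj[-1]
        # Noise every 3D coordinate independently.
        x = [[scale * v + sigma * rng.gauss(0.0, 1.0) for v in row] for row in prev_x]
        # Noise the upper triangle of the bond matrix and mirror it,
        # so the matrix stays symmetric with a zero diagonal.
        a = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                a[i][j] = a[j][i] = scale * prev_a[i][j] + sigma * rng.gauss(0.0, 1.0)
        traj.append((x, a))
    return traj


# Toy three-atom chain with made-up coordinates and bonds 0-1 and 1-2.
coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
adjacency = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
trajectory = joint_forward_trajectory(coords, adjacency)
```

In a pretraining setup of this kind, a model would typically be trained to recover the clean pair, or the added noise, from a noised step of the trajectory. That training objective is outside the scope of this sketch.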
Related papers
- Medication Recommendation via Dual Molecular Modalities and Multi-Step Enhancement [6.927266015351967]
Existing works based on molecular knowledge neglect the 3D geometric structure of molecules and fail to learn the high-dimensional information of medications.
We propose a bimodal molecular recommendation framework named BiMoRec, which introduces 3D molecular structures to obtain atomic 3D coordinates and edge indices.
arXiv Detail & Related papers (2024-05-30T07:13:08Z)
- UniIF: Unified Molecule Inverse Folding [67.60267592514381]
We propose a unified model UniIF for inverse folding of all molecules.
Our proposed method surpasses state-of-the-art methods on all tasks.
arXiv Detail & Related papers (2024-05-29T10:26:16Z)
- DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design [62.68420322996345]
Existing structure-based drug design methods treat all ligand atoms equally.
We propose a new diffusion model, DecompDiff, with decomposed priors over arms and scaffold.
Our approach achieves state-of-the-art performance in generating high-affinity molecules.
arXiv Detail & Related papers (2024-02-26T05:21:21Z)
- Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks [44.934084652800976]
We introduce the first MoleculAR Conformer Ensemble Learning benchmark to thoroughly evaluate the potential of learning on conformer ensembles.
Our findings reveal that direct learning from a conformer space can improve performance on a variety of tasks and models.
arXiv Detail & Related papers (2023-09-29T20:06:46Z)
- A Group Symmetric Stochastic Differential Equation Model for Molecule
Multi-modal Pretraining [36.48602272037559]
Molecule pretraining has quickly become the go-to schema to boost the performance of AI-based drug discovery.
Here, we propose MoleculeSDE to generate the 3D reflection from 2D topologies, and vice versa, directly in the input space.
By comparing with 17 pretraining baselines, we empirically verify that MoleculeSDE can learn an expressive representation with state-of-the-art performance on 26 out of 32 downstream tasks.
arXiv Detail & Related papers (2023-05-28T15:56:02Z)
- MUDiff: Unified Diffusion for Complete Molecule Generation [104.7021929437504]
We present a new model for generating a comprehensive representation of molecules, including atom features, 2D discrete molecule structures, and 3D continuous molecule coordinates.
We propose a novel graph transformer architecture to denoise the diffusion process.
Our model is a promising approach for designing stable and diverse molecules and can be applied to a wide range of tasks in molecular modeling.
arXiv Detail & Related papers (2023-04-28T04:25:57Z)
- An Equivariant Generative Framework for Molecular Graph-Structure
Co-Design [54.92529253182004]
We present MolCode, a machine learning-based generative framework for Molecular graph-structure Co-design.
In MolCode, 3D geometric information empowers the molecular 2D graph generation, which in turn helps guide the prediction of molecular 3D structure.
Our investigation reveals that the 2D topology and 3D geometry contain intrinsically complementary information in molecule design.
arXiv Detail & Related papers (2023-04-12T13:34:22Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular
Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Improving Molecular Pretraining with Complementary Featurizations [20.86159731100242]
Molecular pretraining is a paradigm to solve a variety of tasks in computational chemistry and drug discovery.
We show that different featurization techniques convey chemical information differently.
We propose a simple and effective MOlecular pretraining framework with COmplementary featurizations (MOCO).
arXiv Detail & Related papers (2022-09-29T21:11:09Z)
- Scalable Fragment-Based 3D Molecular Design with Reinforcement Learning [68.8204255655161]
We introduce a novel framework for scalable 3D design that uses a hierarchical agent to build molecules.
In a variety of experiments, we show that our agent, guided only by energy considerations, can efficiently learn to produce molecules with over 100 atoms.
arXiv Detail & Related papers (2022-02-01T18:54:24Z)