Reflection-Equivariant Diffusion for 3D Structure Determination from
Isotopologue Rotational Spectra in Natural Abundance
- URL: http://arxiv.org/abs/2310.11609v2
- Date: Sun, 19 Nov 2023 22:53:59 GMT
- Title: Reflection-Equivariant Diffusion for 3D Structure Determination from
Isotopologue Rotational Spectra in Natural Abundance
- Authors: Austin Cheng, Alston Lo, Santiago Miret, Brooks Pate, Alán Aspuru-Guzik
- Abstract summary: We develop KREED, a generative diffusion model that infers a molecule's complete 3D structure from its molecular formula, moments of inertia, and unsigned substitution coordinates of heavy atoms.
KREED's top-1 predictions identify the correct 3D structure with >98% accuracy on the QM9 and GEOM datasets.
On a test set of experimentally measured substitution coordinates gathered from the literature, KREED predicts the correct all-atom 3D structure in 25 of 33 cases.
- Score: 5.585345112578967
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structure determination is necessary to identify unknown organic molecules,
such as those in natural products, forensic samples, the interstellar medium,
and laboratory syntheses. Rotational spectroscopy enables structure
determination by providing accurate 3D information about small organic
molecules via their moments of inertia. Using these moments, Kraitchman
analysis determines isotopic substitution coordinates, which are the unsigned
$|x|,|y|,|z|$ coordinates of all atoms with natural isotopic abundance,
including carbon, nitrogen, and oxygen. While unsigned substitution coordinates
can verify guesses of structures, the missing $+/-$ signs make it challenging
to determine the actual structure from the substitution coordinates alone. To
tackle this inverse problem, we develop KREED (Kraitchman
REflection-Equivariant Diffusion), a generative diffusion model that infers a
molecule's complete 3D structure from its molecular formula, moments of
inertia, and unsigned substitution coordinates of heavy atoms. KREED's top-1
predictions identify the correct 3D structure with >98% accuracy on the QM9 and
GEOM datasets when provided with substitution coordinates of all heavy atoms
with natural isotopic abundance. When substitution coordinates are restricted
to only a subset of carbons, accuracy is retained at 91% on QM9 and 32% on
GEOM. On a test set of experimentally measured substitution coordinates
gathered from the literature, KREED predicts the correct all-atom 3D structure
in 25 of 33 cases, demonstrating experimental applicability for context-free 3D
structure determination with rotational spectroscopy.
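The sign ambiguity described in the abstract can be made concrete with a toy sketch. This is not KREED and not Kraitchman analysis itself; it only illustrates, under simplifying assumptions, why unsigned coordinates leave many candidate structures open. The function names (`diagonal_moments`, `signed_candidates`) and the two-atom fragment with its masses and coordinates are hypothetical, chosen purely for illustration. The sketch enumerates every sign assignment for a set of unsigned $|x|,|y|,|z|$ coordinates and checks that the diagonal inertia-tensor entries, which depend only on squared coordinates, cannot tell the candidates apart.

```python
from itertools import product


def diagonal_moments(masses, coords):
    """Return (Ixx, Iyy, Izz) for point masses at the given (x, y, z) positions."""
    ixx = sum(m * (y * y + z * z) for m, (x, y, z) in zip(masses, coords))
    iyy = sum(m * (x * x + z * z) for m, (x, y, z) in zip(masses, coords))
    izz = sum(m * (x * x + y * y) for m, (x, y, z) in zip(masses, coords))
    return (ixx, iyy, izz)


def signed_candidates(unsigned_coords):
    """Yield every signed placement consistent with the unsigned coordinates."""
    per_atom = []
    for x, y, z in unsigned_coords:
        # A set drops duplicates that arise when a coordinate is exactly zero.
        options = {(sx * x, sy * y, sz * z)
                   for sx, sy, sz in product((1, -1), repeat=3)}
        per_atom.append(sorted(options))
    yield from product(*per_atom)


# Hypothetical two-atom fragment (assumed values, not from the paper):
# masses in amu, unsigned coordinates in angstrom.
masses = [12.0, 15.995]
unsigned = [(0.7, 0.3, 0.1), (1.1, 0.4, 0.2)]

candidates = list(signed_candidates(unsigned))
moments = {diagonal_moments(masses, c) for c in candidates}

print(len(candidates))  # 8 sign choices per atom, 2 atoms
print(len(moments))     # number of distinct (Ixx, Iyy, Izz) triples
```

The number of candidates grows as up to 8 per heavy atom, and many of them are mirror images of one another under reflection of a coordinate axis. A brute-force search like this therefore becomes impractical quickly and also says nothing about atoms without substitution coordinates, such as hydrogens.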
Related papers
- Enhancing Retrosynthesis with Conformer: A Template-Free Method [2.990854929039588]
Retrosynthesis plays a crucial role in the fields of organic synthesis and drug development.
We introduce a novel transformer-based, template-free approach that incorporates 3D conformer data and spatial information.
Our approach includes an Atom-align Fusion module that integrates 3D positional data at the input stage.
arXiv Detail & Related papers (2025-01-21T18:54:16Z)
- Stiefel Flow Matching for Moment-Constrained Structure Elucidation [6.111688279277978]
We consider the task of predicting a molecule's all-atom 3D structure given only its molecular formula and moments of inertia.
Existing generative models can conditionally sample 3D structures with approximately correct moments.
We propose Stiefel Flow Matching as a generative model for elucidating 3D structure under exact moment constraints.
arXiv Detail & Related papers (2024-12-17T05:07:10Z)
- Conditional Synthesis of 3D Molecules with Time Correction Sampler [58.0834973489875]
Time-Aware Conditional Synthesis (TACS) is a novel approach to conditional generation on diffusion models.
It integrates adaptively controlled plug-and-play "online" guidance into a diffusion model, driving samples toward the desired properties.
arXiv Detail & Related papers (2024-11-01T12:59:25Z)
- Automated 3D Pre-Training for Molecular Property Prediction [54.15788181794094]
We propose a novel 3D pre-training framework (dubbed 3D PGT).
It pre-trains a model on 3D molecular graphs, and then fine-tunes it on molecular graphs without 3D structures.
Extensive experiments on 2D molecular graphs are conducted to demonstrate the accuracy, efficiency and generalization ability of the proposed 3D PGT.
arXiv Detail & Related papers (2023-06-13T14:43:13Z)
- Functional-Group-Based Diffusion for Pocket-Specific Molecule Generation and Elaboration [63.23362798102195]
We propose D3FG, a functional-group-based diffusion model for pocket-specific molecule generation and elaboration.
D3FG decomposes molecules into two categories of components: functional groups defined as rigid bodies and linkers as mass points.
In the experiments, our method can generate molecules with more realistic 3D structures, competitive affinities toward the protein targets, and better drug properties.
arXiv Detail & Related papers (2023-05-30T06:41:20Z)
- MUDiff: Unified Diffusion for Complete Molecule Generation [104.7021929437504]
We present a new model for generating a comprehensive representation of molecules, including atom features, 2D discrete molecule structures, and 3D continuous molecule coordinates.
We propose a novel graph transformer architecture to denoise the diffusion process.
Our model is a promising approach for designing stable and diverse molecules and can be applied to a wide range of tasks in molecular modeling.
arXiv Detail & Related papers (2023-04-28T04:25:57Z)
- State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z)
- Heterogeneous reconstruction of deformable atomic models in Cryo-EM [30.864688165021054]
We describe a heterogeneous reconstruction method based on an atomistic representation whose deformation is reduced to a handful of collective motions.
We show for each distribution that our approach is able to recapitulate the intermediate atomic models with atomic-level accuracy.
arXiv Detail & Related papers (2022-09-29T22:35:35Z)
- Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations [2.17167311150369]
We design an SE(3)-invariant model that processes torsion angles of a 3D molecular conformer.
We test our model on four benchmarks: contrastive learning to distinguish conformers of different stereoisomers in a learned latent space, classification of chiral centers as R/S, prediction of how enantiomers rotate circularly polarized light, and ranking enantiomers by their docking scores in an enantiosensitive protein pocket.
arXiv Detail & Related papers (2021-10-08T21:25:47Z)
- GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles [60.12186997181117]
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.
Existing generative models have several drawbacks, including a failure to model important elements of molecular geometry.
We propose GeoMol, an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate 3D conformers.
arXiv Detail & Related papers (2021-06-08T14:17:59Z)
- Capturing 3D atomic defects and phonon localization at the 2D heterostructure interface [3.4654210770666376]
We determine the 3D local atomic positions at the interface of a MoS2-WSe2 heterojunction with picometer precision.
We observe point defects, bond distortion, atomic-scale ripples and measure the full 3D strain tensor at the heterointerface.
arXiv Detail & Related papers (2021-04-18T23:42:24Z)