Carbohydrate NMR chemical shift predictions using E(3) equivariant graph
neural networks
- URL: http://arxiv.org/abs/2311.12657v1
- Date: Tue, 21 Nov 2023 15:01:14 GMT
- Title: Carbohydrate NMR chemical shift predictions using E(3) equivariant graph
neural networks
- Authors: Maria Bånkestad, Keven M. Dorst, Göran Widmalm, Jerk Rönnols
- Abstract summary: This work introduces a novel approach that leverages E(3) equivariant graph neural networks to predict carbohydrate NMR spectra.
Notably, our model achieves a substantial reduction in mean absolute error, up to threefold, compared to traditional models.
The implications are far-reaching and go beyond an advanced understanding of carbohydrate structures and spectral interpretation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Carbohydrates, vital components of biological systems, are well-known for
their structural diversity. Nuclear Magnetic Resonance (NMR) spectroscopy plays
a crucial role in understanding their intricate molecular arrangements and is
essential in assessing and verifying the molecular structure of organic
molecules. An important part of this process is to predict the NMR chemical
shift from the molecular structure. This work introduces a novel approach that
leverages E(3) equivariant graph neural networks to predict carbohydrate NMR
spectra. Notably, our model achieves a substantial reduction in mean absolute
error, up to threefold, compared to traditional models that rely solely on
two-dimensional molecular structure. Even with limited data, the model excels,
highlighting its robustness and generalization capabilities. The implications
are far-reaching and go beyond an advanced understanding of carbohydrate
structures and spectral interpretation. For example, it could accelerate
research in pharmaceutical applications, biochemistry, and structural biology,
offering a faster and more reliable analysis of molecular structures.
Furthermore, our approach is a key step towards a new data-driven era in
spectroscopy, potentially influencing spectroscopic techniques beyond NMR.
Related papers
- How well can off-the-shelf LLMs elucidate molecular structures from mass spectra using chain-of-thought reasoning? [51.286853421822705]
Large language models (LLMs) have shown promise for reasoning-intensive scientific tasks, but their capability for chemical interpretation is still unclear. We introduce a Chain-of-Thought (CoT) prompting framework and benchmark that evaluate how LLMs reason about mass spectral data to predict molecular structures. Our evaluation across metrics of SMILES validity, formula consistency, and structural similarity reveals that while LLMs can produce syntactically valid and partially plausible structures, they fail to achieve chemical accuracy or link reasoning to correct molecular predictions.
arXiv Detail & Related papers (2026-01-09T20:08:42Z) - NMIRacle: Multi-modal Generative Molecular Elucidation from IR and NMR Spectra [13.594833907772783]
We introduce NMIRacle, a two-stage generative framework that builds upon recent paradigms in AI-driven spectroscopy with minimal assumptions. In the first stage, NMIRacle learns to reconstruct molecular structures from count-aware fragment encodings. In the second stage, a spectral encoder maps input spectroscopic measurements into a latent embedding. This formulation bridges fragment-level chemical modeling with spectral evidence, yielding accurate molecular predictions.
arXiv Detail & Related papers (2025-12-17T10:29:39Z) - Atomic Diffusion Models for Small Molecule Structure Elucidation from NMR Spectra [5.818797900550866]
ChefNMR (CHemical Elucidation From NMR) is an end-to-end framework that directly predicts an unknown molecule's structure. To model the complex chemical groups found in natural products, we generated a dataset of simulated 1D NMR spectra for over 111,000 natural products. ChefNMR predicts the structures of challenging natural product compounds with an unsurpassed accuracy of over 65%.
arXiv Detail & Related papers (2025-12-02T18:59:13Z) - Mamba-driven multi-perspective structural understanding for molecular ground-state conformation prediction [69.32436472760712]
We propose an approach of Mamba-driven multi-perspective structural understanding (MPSU-Mamba) to localize molecular ground-state conformation. For complex and diverse molecules, three different kinds of dedicated scanning strategies are explored to construct a comprehensive perception of corresponding molecular structures. Experimental results on QM9 and Molecule3D datasets indicate that MPSU-Mamba significantly outperforms existing methods.
arXiv Detail & Related papers (2025-11-10T11:18:32Z) - NMR-Solver: Automated Structure Elucidation via Large-Scale Spectral Matching and Physics-Guided Fragment Optimization [24.714189961887215]
Nuclear Magnetic Resonance (NMR) spectroscopy is one of the most powerful and widely used tools for molecular structure elucidation in organic chemistry. Here, we present NMR-Solver, a practical and interpretable framework for the automated determination of small organic molecule structures from $^1$H and $^{13}$C NMR spectra.
arXiv Detail & Related papers (2025-08-30T23:59:12Z) - DiffSpectra: Molecular Structure Elucidation from Spectra using Diffusion Models [66.41802970528133]
Molecular structure elucidation from spectra is a foundational problem in chemistry. Traditional methods rely heavily on expert interpretation and lack scalability. We present DiffSpectra, a generative framework that directly infers both 2D and 3D molecular structures from multi-modal spectral data.
arXiv Detail & Related papers (2025-07-09T13:57:20Z) - MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra [48.52871465095181]
We propose to utilize the energy spectra to enhance the pre-training of 3D molecular representations (MolSpectra).
Specifically, we propose SpecFormer, a multi-spectrum encoder for encoding molecular spectra via masked patch reconstruction.
By further aligning outputs from the 3D encoder and spectrum encoder using a contrastive objective, we enhance the 3D encoder's understanding of molecules.
arXiv Detail & Related papers (2025-02-22T16:34:32Z) - Knowledge-aware contrastive heterogeneous molecular graph learning [77.94721384862699]
We propose a paradigm shift by encoding molecular graphs into Heterogeneous Molecular Graph Learning (KCHML).
KCHML conceptualizes molecules through three distinct graph views (molecular, elemental, and pharmacological), enhanced by heterogeneous molecular graphs and a dual message-passing mechanism.
This design offers a comprehensive representation for property prediction, as well as for downstream tasks such as drug-drug interaction (DDI) prediction.
arXiv Detail & Related papers (2025-02-17T11:53:58Z) - FARM: Functional Group-Aware Representations for Small Molecules [55.281754551202326]
We introduce Functional Group-Aware Representations for Small Molecules (FARM).
FARM is a foundation model designed to bridge the gap between SMILES, natural language, and molecular graphs.
We rigorously evaluate FARM on the MoleculeNet dataset, where it achieves state-of-the-art performance on 10 out of 12 tasks.
arXiv Detail & Related papers (2024-10-02T23:04:58Z) - Accurate and efficient structure elucidation from routine one-dimensional NMR spectra using multitask machine learning [1.2754578699685275]
We introduce a machine learning framework that predicts the molecular structure of an unknown compound based on its 1D 1H and/or 13C NMR spectra.
Integrating this capability with a convolutional neural network (CNN), we build an end-to-end model for predicting structure from spectra that is fast and accurate.
arXiv Detail & Related papers (2024-08-15T17:37:36Z) - Unraveling Molecular Structure: A Multimodal Spectroscopic Dataset for Chemistry [0.1747623282473278]
This dataset comprises simulated $^1$H-NMR, $^{13}$C-NMR, HSQC-NMR, Infrared, and Mass spectra for 790k molecules extracted from chemical reactions in patent data.
We provide benchmarks for evaluating single-modality tasks such as structure elucidation, predicting the spectra for a target molecule, and functional group predictions.
arXiv Detail & Related papers (2024-07-04T12:52:48Z) - SE3Set: Harnessing equivariant hypergraph neural networks for molecular representation learning [27.713870291922333]
We develop an SE(3) equivariant hypergraph neural network architecture tailored for advanced molecular representation learning.
SE3Set has shown performance on par with state-of-the-art (SOTA) models for small molecule datasets.
It excels on the MD22 dataset, achieving a notable improvement of approximately 20% in accuracy across all molecules.
arXiv Detail & Related papers (2024-05-26T10:43:16Z) - AI-enabled prediction of NMR spectroscopy: Deducing 2-D NMR of carbohydrate [7.470166291890153]
AI-driven NMR prediction, powered by advanced machine learning and predictive algorithms, has fundamentally reshaped the interpretation of NMR spectra.
Our methodology is versatile, catering to both monosaccharide-derived small molecules, oligosaccharides and large polysaccharides.
Given the complex nature involved in the generation of 2D NMRs, our objective is to fully leverage the potential of AI to enhance the precision, efficiency, and comprehensibility of NMR spectral analysis.
arXiv Detail & Related papers (2024-03-17T21:52:51Z) - Infrared Spectra Prediction for Diazo Groups Utilizing a Machine Learning Approach with Structural Attention Mechanism [0.0]
Infrared (IR) spectroscopy is a pivotal technique in chemical research for elucidating molecular structures and dynamics through vibrational and rotational transitions.
Here, we present a machine learning approach employing a Structural Attention Mechanism tailored to enhance the prediction and interpretation of infrared spectra, particularly for diazo compounds.
arXiv Detail & Related papers (2024-02-05T15:44:43Z) - Towards Predicting Equilibrium Distributions for Molecular Systems with Deep Learning [60.02391969049972]
We introduce a novel deep learning framework, called Distributional Graphormer (DiG), in an attempt to predict the equilibrium distribution of molecular systems.
DiG employs deep neural networks to transform a simple distribution towards the equilibrium distribution, conditioned on a descriptor of a molecular system.
arXiv Detail & Related papers (2023-06-08T17:12:08Z) - Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules.
By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures.
When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, registering an average ROC-AUC uplift of 13.8% for classification tasks and an average RMSE/MAE enhancement of 35.1% for regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z) - Atomic and Subgraph-aware Bilateral Aggregation for Molecular Representation Learning [57.670845619155195]
We introduce a new model for molecular representation learning called the Atomic and Subgraph-aware Bilateral Aggregation (ASBA).
ASBA addresses the limitations of previous atom-wise and subgraph-wise models by incorporating both types of information.
Our method offers a more comprehensive way to learn representations for molecular property prediction and has broad potential in drug and material discovery applications.
arXiv Detail & Related papers (2023-05-22T00:56:00Z) - Graph-based Molecular Representation Learning [59.06193431883431]
Molecular representation learning (MRL) is a key step to build the connection between machine learning and chemical science.
Recently, MRL has achieved considerable progress, especially in methods based on deep molecular graph learning.
arXiv Detail & Related papers (2022-07-08T17:43:20Z) - Multi-View Graph Neural Networks for Molecular Property Prediction [67.54644592806876]
We present Multi-View Graph Neural Network (MV-GNN), a multi-view message passing architecture.
In MV-GNN, we introduce a shared self-attentive readout component and disagreement loss to stabilize the training process.
We further boost the expressive power of MV-GNN by proposing a cross-dependent message passing scheme.
arXiv Detail & Related papers (2020-05-17T04:46:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.