Ensemble Spectral Prediction (ESP) Model for Metabolite Annotation
- URL: http://arxiv.org/abs/2203.13783v1
- Date: Fri, 25 Mar 2022 17:05:41 GMT
- Title: Ensemble Spectral Prediction (ESP) Model for Metabolite Annotation
- Authors: Xinmeng Li, Hao Zhu, Li-ping Liu, Soha Hassoun
- Abstract summary: A key challenge in metabolomics is annotating measured spectra from a biological sample with chemical identities.
We propose a novel machine learning model, Ensemble Spectral Prediction (ESP), for metabolite annotation.
- Score: 10.640447979978436
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A key challenge in metabolomics is annotating measured spectra from a
biological sample with chemical identities. Currently, only a small fraction of
measurements can be assigned identities. Two complementary computational
approaches have emerged to address the annotation problem: mapping candidate
molecules to spectra, and mapping query spectra to molecular candidates. In
essence, the candidate molecule with the spectrum that best explains the query
spectrum is recommended as the target molecule. Despite candidate ranking being
fundamental in both approaches, no prior work has used rank learning tasks in
determining the target molecule. We propose a novel machine learning model,
Ensemble Spectral Prediction (ESP), for metabolite annotation. ESP takes
advantage of prior neural network-based annotation models that utilize
multilayer perceptron (MLP) networks and Graph Neural Networks (GNNs). Based on
the ranking results of the MLP and GNN-based models, ESP learns a weighting for
the outputs of MLP and GNN spectral predictors to generate a spectral
prediction for a query molecule. Importantly, training data is stratified by
molecular formula to provide candidate sets during model training. Further,
baseline MLP and GNN models are enhanced by considering peak dependencies
through a multi-head attention mechanism and multi-tasking on spectral topic
distributions. ESP improves average rank by 41% and 30% over the MLP and GNN
baselines, respectively, demonstrating a remarkable performance gain over
state-of-the-art neural network approaches. We show that annotation
performance, for ESP and other models, is a strong function of the number of
molecules in the candidate set and their similarity to the target molecule.
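The ensemble-and-rank idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the fixed scalar weight `w`, the cosine similarity score, the function names, and the toy spectra are all assumptions made for illustration, whereas ESP learns its weighting from the ranking results of the MLP and GNN models.

```python
import math

def cosine(a, b):
    """Cosine similarity between two binned spectra of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def ensemble_spectrum(mlp_pred, gnn_pred, w):
    """Weighted combination of two predicted spectra.

    w is the weight on the MLP prediction (assumed fixed here;
    ESP learns its weighting instead).
    """
    return [w * m + (1.0 - w) * g for m, g in zip(mlp_pred, gnn_pred)]

def rank_of_target(query, candidates, target, w):
    """Return the 1-based rank of the target among the candidates.

    candidates maps a molecule name to (mlp_pred, gnn_pred).
    Candidates are ordered by similarity of their ensemble
    spectrum to the query spectrum, best first.
    """
    scores = {
        name: cosine(query, ensemble_spectrum(mlp, gnn, w))
        for name, (mlp, gnn) in candidates.items()
    }
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(target) + 1

# Toy data (hypothetical): one query spectrum and two candidates
# that would share a molecular formula, with "A" as the true target.
query = [0.0, 1.0, 0.5, 0.0]
candidates = {
    "A": ([0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]),
    "B": ([0.0, 1.0, 0.2, 0.0], [1.0, 0.0, 0.0, 0.0]),
}

rank_mlp_only = rank_of_target(query, candidates, "A", w=1.0)
rank_ensemble = rank_of_target(query, candidates, "A", w=0.5)
print(rank_mlp_only, rank_ensemble)
```

In this toy case the MLP prediction alone places the target second, while an equal weighting of the two predictors places it first. Averaging such ranks over many queries gives the average-rank metric the abstract reports.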
Related papers
- JESTR: Joint Embedding Space Technique for Ranking Candidate Molecules for the Annotation of Untargeted Metabolomics Data [8.964879518873591]
We introduce a novel paradigm (JESTR) for annotation.
Unlike prior approaches that explicitly construct molecular fingerprints or spectra, JESTR embeds their representations in a joint space.
We evaluate JESTR against mol-to-spec and spec-to-FP annotation tools on three datasets.
arXiv Detail & Related papers (2024-11-18T03:03:57Z)
- Pre-trained Molecular Language Models with Random Functional Group Masking [54.900360309677794]
We propose a SMILES-based Molecular Language Model, which randomly masks SMILES subsequences corresponding to specific molecular atoms.
This technique aims to compel the model to better infer molecular structures and properties, thus enhancing its predictive capabilities.
arXiv Detail & Related papers (2024-11-03T01:56:15Z)
- Mass Spectra Prediction with Structural Motif-based Graph Neural Networks [21.71309513265843]
MoMS-Net is a system that predicts mass spectra using information derived from structural motifs and Graph Neural Networks (GNNs).
We have tested our model across diverse mass spectra and have observed its superiority over other existing models.
arXiv Detail & Related papers (2023-06-28T10:33:57Z)
- Towards Predicting Equilibrium Distributions for Molecular Systems with Deep Learning [60.02391969049972]
We introduce a novel deep learning framework, called Distributional Graphormer (DiG), in an attempt to predict the equilibrium distribution of molecular systems.
DiG employs deep neural networks to transform a simple distribution towards the equilibrium distribution, conditioned on a descriptor of a molecular system.
arXiv Detail & Related papers (2023-06-08T17:12:08Z)
- Prefix-Tree Decoding for Predicting Mass Spectra from Molecules [12.868704267691125]
We use a new intermediate strategy for predicting mass spectra from molecules by treating mass spectra as sets of molecular formulae, which are themselves multisets of atoms.
We show promising empirical results on mass spectra prediction tasks.
arXiv Detail & Related papers (2023-03-11T17:44:28Z)
- Specformer: Spectral Graph Neural Networks Meet Transformers [51.644312964537356]
Spectral graph neural networks (GNNs) learn graph representations via spectral-domain graph convolutions.
We introduce Specformer, which effectively encodes the set of all eigenvalues and performs self-attention in the spectral domain.
By stacking multiple Specformer layers, one can build a powerful spectral GNN.
arXiv Detail & Related papers (2023-03-02T07:36:23Z)
- MolCPT: Molecule Continuous Prompt Tuning to Generalize Molecular Representation Learning [77.31492888819935]
We propose a novel paradigm of "pre-train, prompt, fine-tune" for molecular representation learning, named molecule continuous prompt tuning (MolCPT).
MolCPT defines a motif prompting function that uses the pre-trained model to project the standalone input into an expressive prompt.
Experiments on several benchmark datasets show that MolCPT efficiently generalizes pre-trained GNNs for molecular property prediction.
arXiv Detail & Related papers (2022-12-20T19:32:30Z)
- Graph neural networks for the prediction of molecular structure-property relationships [59.11160990637615]
Graph neural networks (GNNs) are a novel machine learning method that works directly on the molecular graph.
GNNs allow properties to be learned in an end-to-end fashion, thereby avoiding the need for informative descriptors.
We describe the fundamentals of GNNs and demonstrate the application of GNNs via two examples for molecular property prediction.
arXiv Detail & Related papers (2022-07-25T11:30:44Z)
- Unsupervised Spectral Unmixing For Telluric Correction Using A Neural Network Autoencoder [58.720142291102135]
We present a neural network autoencoder approach for extracting a telluric transmission spectrum from a large set of high-precision observed solar spectra from the HARPS-N radial velocity spectrograph.
arXiv Detail & Related papers (2021-11-17T12:54:48Z)
- MassFormer: Tandem Mass Spectrum Prediction for Small Molecules using Graph Transformers [3.2951121243459522]
Tandem mass spectra capture fragmentation patterns that provide key structural information about a molecule.
For over seventy years, spectrum prediction has remained a key challenge in the field.
We propose a new model, MassFormer, for accurately predicting tandem mass spectra.
arXiv Detail & Related papers (2021-11-08T20:55:15Z)
- Using Graph Neural Networks for Mass Spectrometry Prediction [11.797657070243716]
We explore using graph neural networks (GNNs) to predict measured spectra.
The input to our model is a molecular graph.
We compare our results to NEIMS, a neural network model that utilizes molecular fingerprints as inputs.
arXiv Detail & Related papers (2020-10-09T16:06:57Z)