Predicting pathways for old and new metabolites through clustering
- URL: http://arxiv.org/abs/2211.15720v1
- Date: Mon, 28 Nov 2022 19:07:02 GMT
- Title: Predicting pathways for old and new metabolites through clustering
- Authors: Thiru Siddharth, Nathan Lewis
- Abstract summary: We present an approach to identify pathways based on metabolite structure.
After applying clustering algorithms to both groups of features, we found the clusters accurately linked 92% of known metabolites to their respective pathways.
- Score: 0.06091702876917279
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The diverse metabolic pathways are fundamental to all living organisms, as
they harvest energy, synthesize biomass components, produce molecules to
interact with the microenvironment, and neutralize toxins. While discovery of
new metabolites and pathways continues, the prediction of pathways for new
metabolites can be challenging. It can take vast amounts of time to elucidate
pathways for new metabolites; thus, according to HMDB only 60% of metabolites
get assigned to pathways. Here, we present an approach to identify pathways
based on metabolite structure. We extracted 201 features from SMILES
annotations, and identified new metabolites from PubMed abstracts and HMDB.
After applying clustering algorithms to both groups of features, we quantified
correlations between metabolites, and found the clusters accurately linked 92%
of known metabolites to their respective pathways. Thus, this approach could be
valuable for predicting metabolic pathways for new metabolites.
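The workflow described in the abstract (structure-derived features, clustering, pathway assignment from cluster membership) can be sketched roughly as follows. This is a toy illustration only, not the authors' method: the five token counts stand in for the paper's 201 features, the single-linkage threshold clustering is a substitute for whichever clustering algorithms the authors used, and the names `smiles_features`, `cluster`, `predict_pathway`, the threshold value, and the pathway labels are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical toy feature set: counts of a few SMILES characters.
# The paper extracts 201 features; these five are only a stand-in.
TOKENS = ["C", "N", "O", "=", "("]


def smiles_features(smiles):
    """Return a crude count vector for one SMILES string.

    Upper-casing merges aromatic atoms (c, n, o) with aliphatic ones and
    miscounts two-letter elements such as Cl; acceptable for a sketch only.
    """
    counts = Counter(smiles.upper())
    return [counts[t] for t in TOKENS]


def cluster(vectors, threshold):
    """Single-linkage clustering via union-find.

    Two points end up in the same cluster if they are connected by a chain
    of points whose pairwise Euclidean distance is at most `threshold`.
    Returns one cluster label per input vector.
    """
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if math.dist(vectors[i], vectors[j]) <= threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(vectors))]


def predict_pathway(name, metabolites, known_pathways, threshold=2.5):
    """Assign `name` the most common known pathway within its cluster."""
    names = list(metabolites)
    labels = cluster([smiles_features(metabolites[n]) for n in names], threshold)
    target = labels[names.index(name)]
    votes = Counter(
        known_pathways[n]
        for n, label in zip(names, labels)
        if label == target and n in known_pathways
    )
    return votes.most_common(1)[0][0] if votes else None


# Illustrative data: simplified SMILES without stereochemistry.
METABOLITES = {
    "glucose": "OCC(O)C(O)C(O)C(O)C=O",
    "fructose": "OCC(=O)C(O)C(O)C(O)CO",
    "alanine": "CC(N)C(=O)O",
    "glycine": "NCC(=O)O",
    "serine": "NC(CO)C(=O)O",
    "valine": "CC(C)C(N)C(=O)O",  # treated as the "new" metabolite
}

# Illustrative pathway labels for the "known" metabolites.
KNOWN_PATHWAYS = {
    "glucose": "carbohydrate metabolism",
    "fructose": "carbohydrate metabolism",
    "alanine": "amino acid metabolism",
    "glycine": "amino acid metabolism",
    "serine": "amino acid metabolism",
}

if __name__ == "__main__":
    print(predict_pathway("valine", METABOLITES, KNOWN_PATHWAYS))
```

In this sketch the unlabeled metabolite inherits the majority pathway of the labeled metabolites that share its cluster; a realistic version would need a proper cheminformatics featurizer and a validated clustering choice.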
Related papers
- Gene-Metabolite Association Prediction with Interactive Knowledge Transfer Enhanced Graph for Metabolite Production [49.814615043389864]
We propose a new task, Gene-Metabolite Association Prediction based on metabolic graphs.
We present the first benchmark containing 2474 metabolites and 1947 genes of two commonly used microorganisms.
Our proposed methodology outperforms baselines by up to 12.3% across various link prediction frameworks.
arXiv Detail & Related papers (2024-10-24T06:54:27Z)
- Incorporating Metabolic Information into LLMs for Anomaly Detection in Clinical Time-Series [0.4779196219827506]
We introduce the Metabolism Pathway-driven Prompting (MPP) method, which integrates the information about metabolic pathways to better capture the structural and temporal changes in biological samples.
We applied our method for doping detection in sports, focusing on steroid metabolism, and evaluated using real-world data from athletes.
arXiv Detail & Related papers (2024-10-02T14:05:21Z)
- A generalizable framework for unlocking missing reactions in genome-scale metabolic networks using deep learning [3.765163284974983]
CLOSEgaps is a deep learning tool that maps metabolic networks as hypergraphs and learns their hyper-topology features to identify missing reactions and gaps.
Results demonstrate that CLOSEgaps accurately fills over 96% of artificially introduced gaps across various GEMs.
arXiv Detail & Related papers (2024-09-20T06:47:44Z)
- Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey [75.47055414002571]
The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology.
We provide an analysis of recent advancements achieved through cross modeling of biomolecules and natural language.
arXiv Detail & Related papers (2024-03-03T14:59:47Z)
- Improving Biomedical Entity Linking with Retrieval-enhanced Learning [53.24726622142558]
$k$NN-BioEL provides a BioEL model with the ability to reference similar instances from the entire training corpus as clues for prediction.
We show that $k$NN-BioEL outperforms state-of-the-art baselines on several datasets.
arXiv Detail & Related papers (2023-12-15T14:04:23Z)
- Transition Path Sampling with Boltzmann Generator-based MCMC Moves [49.69940954060636]
Current approaches to sample transition paths use Markov chain Monte Carlo and rely on time-intensive molecular dynamics simulations to find new paths.
Our approach operates in the latent space of a normalizing flow that maps from the molecule's Boltzmann distribution to a Gaussian, where we propose new paths without requiring molecular simulations.
arXiv Detail & Related papers (2023-12-08T20:05:33Z)
- Multi-View Variational Autoencoder for Missing Value Imputation in Untargeted Metabolomics [17.563099908890013]
We propose a novel method that leverages the information from WGS data and reference metabolites to impute unknown metabolites.
By learning the latent representations of both omics data, our method can effectively impute missing metabolomics values.
arXiv Detail & Related papers (2023-10-12T02:34:56Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
- CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models [74.58583689523999]
We propose an end-to-end framework, named CogMol, for designing new drug-like small molecules targeting novel viral proteins.
CogMol combines adaptive pre-training of a molecular SMILES Variational Autoencoder (VAE) and an efficient multi-attribute controlled sampling scheme.
CogMol handles multi-constraint design of synthesizable, low-toxic, drug-like molecules with high target specificity and selectivity.
arXiv Detail & Related papers (2020-04-02T18:17:20Z)
- A machine learning approach to investigate regulatory control circuits in bacterial metabolic pathways [0.0]
In this work, a machine learning approach for identifying the multi-omics metabolic regulatory control circuits inside the pathways is described.
Analysis of these circuits identifies the bacterial metabolic pathways that are more tightly regulated than others in terms of their multi-omics.
arXiv Detail & Related papers (2020-01-13T11:04:26Z)