Related papers: JESTR: Joint Embedding Space Technique for Ranking Candidate Molecules for the Annotation of Untargeted Metabolomics Data

Aligned Manifold Property and Topology Point Clouds for Learning Molecular Properties [55.2480439325792]
This work introduces AMPTCR, a molecular surface representation that combines local quantum-derived scalar fields and custom topological descriptors within an aligned point cloud format. For molecular weight, results confirm that AMPTCR encodes physically meaningful data, with a validation R² of 0.87. In the bacterial inhibition task, AMPTCR enables both classification and direct regression of E. coli inhibition values.
arXiv Detail & Related papers (2025-07-22T04:35:50Z)
GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition [60.76623665324548]
GTR-Mol-VLM is a novel framework featuring two key innovations. It emulates human reasoning by incrementally parsing molecular graphs through sequential atom-bond predictions. MolRec-Bench is the first benchmark designed for a fine-grained evaluation of graph-parsing accuracy in OCSR.
arXiv Detail & Related papers (2025-06-09T08:47:10Z)
Knowledge-aware contrastive heterogeneous molecular graph learning [77.94721384862699]
We propose a paradigm shift by encoding molecular graphs into Heterogeneous Molecular Graph Learning (KCHML). KCHML conceptualizes molecules through three distinct graph views (molecular, elemental, and pharmacological), enhanced by heterogeneous molecular graphs and a dual message-passing mechanism. This design offers a comprehensive representation for property prediction, as well as for downstream tasks such as drug-drug interaction (DDI) prediction.
arXiv Detail & Related papers (2025-02-17T11:53:58Z)
DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra [60.39311767532607]
DiffMS is a formula-restricted encoder-decoder generative network. We develop a robust decoder that bridges latent embeddings and molecular structures. Experiments show DiffMS outperforms existing models on de novo molecule generation.
arXiv Detail & Related papers (2025-02-13T18:29:48Z)
Graph-based Molecular In-context Learning Grounded on Morgan Fingerprints [28.262593876388397]
In-context learning (ICL) conditions large language models (LLMs) for molecular tasks, such as property prediction and molecule captioning, by embedding carefully selected demonstration examples into the input prompt. However, current prompt retrieval methods for molecular tasks have relied on molecule feature similarity, such as Morgan fingerprints, which do not adequately capture the global molecular and atom-binding relationships. We propose a self-supervised learning technique, GAMIC, which aligns global molecular structures, represented by graph neural networks (GNNs), with textual captions (descriptions) while leveraging local feature similarity through Morgan fingerprints.
arXiv Detail & Related papers (2025-02-08T02:46:33Z)
SpecTUS: Spectral Translator for Unknown Structures annotation from EI-MS spectra [0.0]
We propose SpecTUS: Spectral Translator for Unknown Structures, a deep neural model that addresses the task of structural annotation of small molecules. Our model analyzes the spectra in a de novo manner, translating them directly into a 2D structural representation. In a rigorous evaluation of our model on the novel structure annotation task across different libraries, we outperformed standard database search techniques by a wide margin.
arXiv Detail & Related papers (2025-02-07T17:36:13Z)
Beyond Message Passing: Neural Graph Pattern Machine [50.78679002846741]
We introduce the Neural Graph Pattern Machine (GPM), a novel framework that bypasses message passing by learning directly from graph substructures. GPM efficiently extracts, encodes, and prioritizes task-relevant graph patterns, offering greater expressivity and improved ability to capture long-range dependencies.
arXiv Detail & Related papers (2025-01-30T20:37:47Z)
MADGEN: Mass-Spec attends to De Novo Molecular generation [16.89017809745962]
We propose a scaffold-based method for de novo molecular structure generation guided by mass spectrometry data. MADGEN operates in two stages: scaffold retrieval and spectra-conditioned molecular generation. We evaluate MADGEN on three datasets (NIST23, CANOPUS, and MassSpecGym).
arXiv Detail & Related papers (2025-01-03T18:54:26Z)
FARM: Functional Group-Aware Representations for Small Molecules [55.281754551202326]
We introduce Functional Group-Aware Representations for Small Molecules (FARM). FARM is a foundation model designed to bridge the gap between SMILES, natural language, and molecular graphs. We rigorously evaluate FARM on the MoleculeNet dataset, where it achieves state-of-the-art performance on 10 out of 12 tasks.
arXiv Detail & Related papers (2024-10-02T23:04:58Z)
Graph Sampling-based Meta-Learning for Molecular Property Prediction [25.193408898790796]
We propose a Graph Sampling-based Meta-learning (GS-Meta) framework for few-shot molecular property prediction. We show GS-Meta consistently outperforms state-of-the-art methods by 5.71%-6.93% in ROC-AUC.
arXiv Detail & Related papers (2023-06-29T08:34:01Z)
Mass Spectra Prediction with Structural Motif-based Graph Neural Networks [21.71309513265843]
MoMS-Net is a system that predicts mass spectra using information derived from structural motifs and Graph Neural Networks (GNNs). We have tested our model across diverse mass spectra and have observed its superiority over other existing models.
arXiv Detail & Related papers (2023-06-28T10:33:57Z)
Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules. By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures. When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, registering an average ROC-AUC uplift of 13.8% for classification tasks and an average RMSE/MAE enhancement of 35.1% for regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z)
Atomic and Subgraph-aware Bilateral Aggregation for Molecular Representation Learning [57.670845619155195]
We introduce a new model for molecular representation learning called the Atomic and Subgraph-aware Bilateral Aggregation (ASBA). ASBA addresses the limitations of previous atom-wise and subgraph-wise models by incorporating both types of information. Our method offers a more comprehensive way to learn representations for molecular property prediction and has broad potential in drug and material discovery applications.
arXiv Detail & Related papers (2023-05-22T00:56:00Z)
t-SMILES: A Scalable Fragment-based Molecular Representation Framework for De Novo Molecule Generation [9.116670221263753]
This study introduces a flexible, fragment-based, multiscale molecular representation framework called t-SMILES. It describes molecules using SMILES-type strings obtained by performing a breadth-first search on a full binary tree formed from a fragmented molecular graph. It significantly outperforms classical SMILES, DeepSMILES, SELFIES and baseline models in goal-directed tasks.
arXiv Detail & Related papers (2023-01-04T21:41:01Z)
Graph neural networks for the prediction of molecular structure-property relationships [59.11160990637615]
Graph neural networks (GNNs) are a novel machine learning method that works directly on the molecular graph. GNNs allow properties to be learned in an end-to-end fashion, thereby avoiding the need for informative descriptors. We describe the fundamentals of GNNs and demonstrate the application of GNNs via two examples for molecular property prediction.
arXiv Detail & Related papers (2022-07-25T11:30:44Z)
Ensemble Spectral Prediction (ESP) Model for Metabolite Annotation [10.640447979978436]
A key challenge in metabolomics is annotating measured spectra from a biological sample with chemical identities. We propose a novel machine learning model, Ensemble Spectral Prediction (ESP), for metabolite annotation.
arXiv Detail & Related papers (2022-03-25T17:05:41Z)
Unsupervised Machine Learning for Exploratory Data Analysis of Exoplanet Transmission Spectra [68.8204255655161]
We focus on unsupervised techniques for analyzing spectral data from transiting exoplanets. We show that there is a high degree of correlation in the spectral data, which calls for appropriate low-dimensional representations. We uncover interesting structures in the principal component basis, namely, well-defined branches corresponding to different chemical regimes.
arXiv Detail & Related papers (2022-01-07T22:26:33Z)
MassFormer: Tandem Mass Spectrum Prediction for Small Molecules using Graph Transformers [3.2951121243459522]
Tandem mass spectra capture fragmentation patterns that provide key structural information about a molecule. For over seventy years, spectrum prediction has remained a key challenge in the field. We propose a new model, MassFormer, for accurately predicting tandem mass spectra.
arXiv Detail & Related papers (2021-11-08T20:55:15Z)
Structure-Enhanced Meta-Learning For Few-Shot Graph Classification [53.54066611743269]
This work explores the potential of metric-based meta-learning for solving few-shot graph classification. An implementation upon GIN, named SMFGIN, is tested on two datasets, Chembl and TRIANGLES.
arXiv Detail & Related papers (2021-03-05T09:03:03Z)
Distance-aware Molecule Graph Attention Network for Drug-Target Binding Affinity Prediction [54.93890176891602]
We propose a diStance-aware Molecule graph Attention Network (S-MAN) tailored to drug-target binding affinity prediction. As a dedicated solution, we first propose a position encoding mechanism to integrate the topological structure and spatial position information into the constructed pocket-ligand graph. We also propose a novel edge-node hierarchical attentive aggregation structure which has edge-level aggregation and node-level aggregation.
arXiv Detail & Related papers (2020-12-17T17:44:01Z)
A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques. We found the RNA-seq features to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.