How well can off-the-shelf LLMs elucidate molecular structures from mass spectra using chain-of-thought reasoning?
- URL: http://arxiv.org/abs/2601.06289v1
- Date: Fri, 09 Jan 2026 20:08:42 GMT
- Title: How well can off-the-shelf LLMs elucidate molecular structures from mass spectra using chain-of-thought reasoning?
- Authors: Yufeng Wang, Lu Wei, Lin Liu, Hao Xu, Haibin Ling
- Abstract summary: Large language models (LLMs) have shown promise for reasoning-intensive scientific tasks, but their capability for chemical interpretation is still unclear. We introduce a Chain-of-Thought (CoT) prompting framework and benchmark that evaluate how LLMs reason about mass spectral data to predict molecular structures. Our evaluation across metrics of SMILES validity, formula consistency, and structural similarity reveals that while LLMs can produce syntactically valid and partially plausible structures, they fail to achieve chemical accuracy or link reasoning to correct molecular predictions.
- Score: 51.286853421822705
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Mass spectrometry (MS) is a powerful analytical technique for identifying small molecules, yet determining complete molecular structures directly from tandem mass spectra (MS/MS) remains a long-standing challenge due to complex fragmentation patterns and the vast diversity of chemical space. Recent progress in large language models (LLMs) has shown promise for reasoning-intensive scientific tasks, but their capability for chemical interpretation is still unclear. In this work, we introduce a Chain-of-Thought (CoT) prompting framework and benchmark that evaluate how LLMs reason about mass spectral data to predict molecular structures. We formalize expert chemists' reasoning steps, such as double bond equivalent (DBE) analysis, neutral loss identification, and fragment assembly, into structured prompts and assess multiple state-of-the-art LLMs (Claude-3.5-Sonnet, GPT-4o-mini, and Llama-3 series) in a zero-shot setting using the MassSpecGym dataset. Our evaluation across metrics of SMILES validity, formula consistency, and structural similarity reveals that while LLMs can produce syntactically valid and partially plausible structures, they fail to achieve chemical accuracy or link reasoning to correct molecular predictions. These findings highlight both the interpretive potential and the current limitations of LLM-based reasoning for molecular elucidation, providing a foundation for future work that combines domain knowledge and reinforcement learning to achieve chemically grounded AI reasoning.
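Two of the reasoning steps the abstract names, DBE analysis and neutral loss identification, can be sketched in plain Python. This is a minimal illustration of the underlying chemistry, not the paper's prompting framework; the mass tolerance, the loss table, and the example m/z values are assumptions chosen for illustration.

```python
import re

def dbe(formula: str) -> float:
    """Double bond equivalents (rings plus pi bonds) from a molecular
    formula containing C, H, N, O, and halogens: DBE = C + 1 + N/2 - (H + X)/2."""
    counts = {}
    for elem, num in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[elem] = counts.get(elem, 0) + (int(num) if num else 1)
    c = counts.get("C", 0)
    n = counts.get("N", 0)
    # Halogens are monovalent and count like hydrogen.
    h = sum(counts.get(x, 0) for x in ("H", "F", "Cl", "Br", "I"))
    return c + 1 + n / 2 - h / 2

# A few common neutral losses (monoisotopic masses, Da); a real table
# would be much larger.
COMMON_LOSSES = {"H2O": 18.0106, "NH3": 17.0265, "CO": 27.9949, "CO2": 43.9898}

def assign_losses(precursor_mz: float, fragment_mzs: list, tol: float = 0.01):
    """Annotate each precursor-minus-fragment mass difference with a
    known neutral loss when one matches within the tolerance."""
    annotated = []
    for frag in fragment_mzs:
        delta = precursor_mz - frag
        name = next(
            (k for k, m in COMMON_LOSSES.items() if abs(delta - m) < tol), None
        )
        annotated.append((frag, round(delta, 4), name))
    return annotated

print(dbe("C6H6"))          # benzene: 4 (one ring + three double bonds)
print(dbe("C8H10N4O2"))     # caffeine: 6
print(assign_losses(181.0501, [163.0395]))  # delta 18.0106 -> water loss
```

In the paper's setting these computations are carried out by the LLM inside its chain of thought rather than by code, which is precisely what makes the reported gap between plausible reasoning and correct structures notable.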
Related papers
- Knowledge-Augmented Long-CoT Generation for Complex Biomolecular Reasoning [51.673503054645415]
Biomolecular mechanisms require multi-step reasoning across molecular interactions, signaling cascades, and metabolic pathways. Existing approaches often exacerbate these issues: reasoning steps may deviate from biological facts or fail to capture long mechanistic dependencies. We propose a Knowledge-Augmented Long-CoT Reasoning framework that integrates LLMs with knowledge graph-based multi-hop reasoning chains.
arXiv Detail & Related papers (2025-11-11T09:26:32Z) - Mamba-driven multi-perspective structural understanding for molecular ground-state conformation prediction [69.32436472760712]
We propose an approach of Mamba-driven multi-perspective structural understanding (MPSU-Mamba) to localize molecular ground-state conformation. For complex and diverse molecules, three different kinds of dedicated scanning strategies are explored to construct a comprehensive perception of corresponding molecular structures. Experimental results on QM9 and Molecule3D datasets indicate that MPSU-Mamba significantly outperforms existing methods.
arXiv Detail & Related papers (2025-11-10T11:18:32Z) - MolSpectLLM: A Molecular Foundation Model Bridging Spectroscopy, Molecule Elucidation, and 3D Structure Generation [24.14024904556376]
MolSpectLLM is a molecular foundation model pretrained on Qwen2.5-7B that unifies experimental spectroscopy with molecular 3D structure. MolSpectLLM achieves state-of-the-art performance on spectrum-related tasks, with an average accuracy of 0.53 across NMR, IR, and MS benchmarks.
arXiv Detail & Related papers (2025-09-26T04:37:44Z) - $\text{M}^{2}$LLM: Multi-view Molecular Representation Learning with Large Language Models [59.125833618091846]
We propose a multi-view framework that integrates three perspectives: the molecular structure view, the molecular task view, and the molecular rules view. Experiments demonstrate that $\text{M}^{2}$LLM achieves state-of-the-art performance on multiple benchmarks across classification and regression tasks.
arXiv Detail & Related papers (2025-08-12T05:46:47Z) - MolReasoner: Toward Effective and Interpretable Reasoning for Molecular LLMs [30.030008221150407]
MolReasoner is a two-stage framework designed to transition Large Language Models from memorization towards chemical reasoning. First, we propose Mol-SFT, which initializes the model's reasoning abilities via synthetic Chain-of-Thought (CoT) samples generated by GPT-4o and verified for chemical accuracy. Subsequently, Mol-RL applies reinforcement learning with specialized reward functions designed explicitly to align chemical structures with linguistic descriptions.
arXiv Detail & Related papers (2025-08-04T05:10:11Z) - Boosting LLM's Molecular Structure Elucidation with Knowledge Enhanced Tree Search Reasoning [35.02138874108029]
Large language models (LLMs) have shown remarkable proficiency in analyzing and reasoning through complex tasks. We introduce a Knowledge-enhanced reasoning framework for Molecular Structure Elucidation (K-MSE), leveraging Monte Carlo Tree Search for test-time scaling as a plugin. Experimental results show that our approach significantly boosts performance, particularly gaining more than 20% improvement on both GPT-4o-mini and GPT-4o.
arXiv Detail & Related papers (2025-06-29T02:00:38Z) - Improving Chemical Understanding of LLMs via SMILES Parsing [18.532188836688928]
CLEANMOL is a novel framework that formulates SMILES parsing into a suite of clean and deterministic tasks. We construct a molecular pretraining dataset with adaptive difficulty scoring and pre-train open-source LLMs on these tasks. Our results show that CLEANMOL not only enhances structural comprehension but also achieves the best or competes with the baseline on the Mol-Instructions benchmark.
arXiv Detail & Related papers (2025-05-22T07:54:39Z) - DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra [60.39311767532607]
We present DiffMS, a formula-restricted encoder-decoder generative network that achieves state-of-the-art performance on this task. To develop a robust decoder that bridges latent embeddings and molecular structures, we pretrain the diffusion decoder with fingerprint-structure pairs. Experiments on established benchmarks show that DiffMS outperforms existing models on de novo molecule generation.
arXiv Detail & Related papers (2025-02-13T18:29:48Z) - Structural Reasoning Improves Molecular Understanding of LLM [18.532188836688928]
We show that large language models (LLMs) still struggle to reason using molecular structural information. We propose an approach that sketches molecular structures for reasoning. We present two frameworks for scenarios where the target molecule is known or unknown.
arXiv Detail & Related papers (2024-10-08T01:49:48Z) - Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective [53.300288393173204]
Large Language Models (LLMs) have shown remarkable performance in various cross-modal tasks.
In this work, we propose an In-context Few-Shot Molecule Learning paradigm for molecule-caption translation.
We evaluate the effectiveness of MolReGPT on molecule-caption translation, including molecule understanding and text-based molecule generation.
arXiv Detail & Related papers (2023-06-11T08:16:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.