ChemATP: A Training-Free Chemical Reasoning Framework for Large Language Models
- URL: http://arxiv.org/abs/2512.19240v1
- Date: Mon, 22 Dec 2025 10:21:40 GMT
- Title: ChemATP: A Training-Free Chemical Reasoning Framework for Large Language Models
- Authors: Mingxu Zhang, Dazhong Shen, Qi Zhang, Ying Sun,
- Abstract summary: ChemATP is a framework that decouples chemical knowledge from the reasoning engine.<n>ChemATP significantly outperforms training-free baselines and rivals state-of-the-art training-based models.
- Score: 16.47599278238931
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) exhibit strong general reasoning but struggle in molecular science due to the lack of explicit chemical priors in standard string representations. Current solutions face a fundamental dilemma. Training-based methods inject priors into parameters, but this static coupling hinders rapid knowledge updates and often compromises the model's general reasoning capabilities. Conversely, existing training-free methods avoid these issues but rely on surface-level prompting, failing to provide the fine-grained atom-level priors essential for precise chemical reasoning. To address this issue, we introduce ChemATP, a framework that decouples chemical knowledge from the reasoning engine. By constructing the first atom-level textual knowledge base, ChemATP enables frozen LLMs to explicitly retrieve and reason over this information dynamically. This architecture ensures interpretability and adaptability while preserving the LLM's intrinsic general intelligence. Experiments show that ChemATP significantly outperforms training-free baselines and rivals state-of-the-art training-based models, demonstrating that explicit prior injection is a competitive alternative to implicit parameter updates.
Related papers
- LatentChem: From Textual CoT to Latent Thinking in Chemical Reasoning [107.60117957760794]
We introduce LatentChem, a latent reasoning interface that decouples chemical derivation from textual generation.<n>We show that LatentChem achieves a 59.88% non-tie win rate over strong CoT-based baselines on ChemCoTBench.<n>Our results provide empirical evidence that chemical reasoning is more naturally and effectively realized as continuous latent dynamics.
arXiv Detail & Related papers (2026-02-06T01:28:27Z) - Agentic reinforcement learning empowers next-generation chemical language models for molecular design and synthesis [51.83339196548892]
ChemCraft is a novel framework that decouples chemical reasoning from knowledge storage.<n>ChemCraft achieves superior performance with minimal inference costs.<n>This work establishes a cost-effective and privacy-preserving paradigm for AI-aided chemistry.
arXiv Detail & Related papers (2026-01-25T04:23:34Z) - Lost in Tokenization: Context as the Key to Unlocking Biomolecular Understanding in Scientific LLMs [78.18336140706471]
Sci-LLMs have emerged as a promising frontier for accelerating biological discovery.<n>Current strategies limit Sci-LLMs' reasoning capacity when processing raw biomolecular sequences.<n>We show that a more effective strategy is to provide Sci-LLMs with high-level structured context.
arXiv Detail & Related papers (2025-10-27T09:03:21Z) - Atom-anchored LLMs speak Chemistry: A Retrosynthesis Demonstration [2.9496795797433073]
We introduce a framework for molecular reasoning using general-purpose Large Language Models.<n>Our method anchors chain-of-thought reasoning to the molecular structure by using unique atomic identifiers.<n>Our work also provides a method to generate theoretically grounded synthetic datasets.
arXiv Detail & Related papers (2025-10-18T17:27:44Z) - MolReasoner: Toward Effective and Interpretable Reasoning for Molecular LLMs [30.030008221150407]
MolReasoner is a two-stage framework designed to transition Large Language Models from memorization towards chemical reasoning.<n>First, we propose Mol-SFT, which initializes the model's reasoning abilities via synthetic Chain-of-Thought(CoT) samples generated by GPT-4o and verified for chemical accuracy.<n>Subsequently, Mol-RL applies reinforcement learning with specialized reward functions designed explicitly to align chemical structures with linguistic descriptions.
arXiv Detail & Related papers (2025-08-04T05:10:11Z) - ChemDFM-R: An Chemical Reasoner LLM Enhanced with Atomized Chemical Knowledge [14.6026550444088]
This work focuses on the specific field of chemistry and develop a Chemical Reasoner LLM, ChemDFM-R.<n>We first construct a comprehensive dataset of atomized knowledge points to enhance the model's understanding of the fundamental principles and logical structure of chemistry.<n> Experiments on diverse chemical benchmarks demonstrate that ChemDFM-R achieves cutting-edge performance while providing interpretable, rationale-driven outputs.
arXiv Detail & Related papers (2025-07-29T16:40:49Z) - ChemActor: Enhancing Automated Extraction of Chemical Synthesis Actions with LLM-Generated Data [53.78763789036172]
We present ChemActor, a fully fine-tuned large language model (LLM) as a chemical executor to convert between unstructured experimental procedures and structured action sequences.<n>This framework integrates a data selection module that selects data based on distribution divergence, with a general-purpose LLM, to generate machine-executable actions from a single molecule input.<n>Experiments on reaction-to-description (R2D) and description-to-action (D2A) tasks demonstrate that ChemActor achieves state-of-the-art performance, outperforming the baseline model by 10%.
arXiv Detail & Related papers (2025-06-30T05:11:19Z) - Training a Scientific Reasoning Model for Chemistry [3.52064464182155]
We demonstrate that reasoning models can be post-trained for chemistry without additional domain pretraining.<n>We report ether0, a 24B parameter LLM that can reason in natural language and respond with chemical structures.
arXiv Detail & Related papers (2025-06-04T17:57:18Z) - Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations [43.623140005091535]
We introduce ChemCoTBench, a reasoning framework that bridges molecular structure understanding with arithmetic-inspired operations.<n>ChemCoTBench formalizes chemical problem-solving into transparent, step-by-step reasoning.<n>We evaluate models on two high-impact tasks: Molecular Property Optimization and Chemical Reaction Prediction.
arXiv Detail & Related papers (2025-05-27T15:15:44Z) - ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning [64.2106664137118]
ChemAgent is a novel framework designed to improve the performance of large language models (LLMs)<n>It is developed by decomposing chemical tasks into sub-tasks and compiling these sub-tasks into a structured collection that can be referenced for future queries.<n>When presented with a new problem, ChemAgent retrieves and refines pertinent information from the library, which we call memory.
arXiv Detail & Related papers (2025-01-11T17:10:30Z) - ChemLLM: A Chemical Large Language Model [49.308528569982805]
Large language models (LLMs) have made impressive progress in chemistry applications.
However, the community lacks an LLM specifically designed for chemistry.
Here, we introduce ChemLLM, a comprehensive framework that features the first LLM dedicated to chemistry.
arXiv Detail & Related papers (2024-02-10T01:11:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.