Related papers: Towards Large-scale Chemical Reaction Image Parsing via a Multimodal Large Language Model

URL: http://arxiv.org/abs/2503.08156v1
Date: Tue, 11 Mar 2025 08:11:23 GMT
Title: Towards Large-scale Chemical Reaction Image Parsing via a Multimodal Large Language Model
Authors: Yufan Chen, Ching Ting Leung, Jianwei Sun, Yong Huang, Linyan Li, Hao Chen, Hanyu Gao
Abstract summary: We introduce the Reaction Image Multimodal large language model (RxnIM) to parse chemical reaction images into machine-readable data. RxnIM extracts key chemical components from reaction images and interprets the textual content that describes reaction conditions. Our approach achieves excellent performance, with an average F1 score of 88% on various benchmarks, surpassing literature methods by 5%.
Score: 4.860497022313892
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Artificial intelligence (AI) has demonstrated significant promise in advancing organic chemistry research; however, its effectiveness depends on the availability of high-quality chemical reaction data. Currently, most published chemical reactions are not available in machine-readable form, limiting the broader application of AI in this field. The extraction of published chemical reactions into structured databases still relies heavily on manual curation, and robust automatic parsing of chemical reaction images into machine-readable data remains a significant challenge. To address this, we introduce the Reaction Image Multimodal large language model (RxnIM), the first multimodal large language model specifically designed to parse chemical reaction images into machine-readable reaction data. RxnIM not only extracts key chemical components from reaction images but also interprets the textual content that describes reaction conditions. Together with a specially designed large-scale dataset generation method to support model training, our approach achieves excellent performance, with an average F1 score of 88% on various benchmarks, surpassing literature methods by 5%. This represents a crucial step toward the automatic construction of large databases of machine-readable reaction data parsed from images in the chemistry literature, providing essential data resources for AI research in chemistry. The source code, model checkpoints, and datasets developed in this work are released under permissive licenses. An instance of the RxnIM web application can be accessed at https://huggingface.co/spaces/CYF200127/RxnIM.
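The F1 score quoted in the abstract is the harmonic mean of precision and recall. As a minimal sketch of how such a score could be computed for extracted reaction components, the Python below compares a predicted set against a gold set. The matching criterion used here (exact match on hypothetical (role, SMILES) pairs) and all names in the example are assumptions for illustration; this listing does not state the paper's actual evaluation protocol.

```python
def f1_score(predicted: set, gold: set) -> float:
    """Return the F1 score between a predicted set and a gold set."""
    if not predicted or not gold:
        return 0.0
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: each component is a (role, SMILES) pair.
gold = {("reactant", "CCO"), ("reactant", "CC(=O)O"), ("product", "CCOC(C)=O")}
predicted = {("reactant", "CCO"), ("product", "CCOC(C)=O"), ("product", "O")}

print(round(f1_score(predicted, gold), 3))  # prints 0.667
```

In this toy case two of the three predictions are correct and two of the three gold components are recovered, so precision, recall, and F1 all equal 2/3.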

Related papers

A Multi-Agent System Enables Versatile Information Extraction from the Chemical Literature [8.306442315850878]
We develop a multimodal large language model (MLLM)-based multi-agent system for robust and automated chemical information extraction. Our system achieved an F1 score of 80.8% on a benchmark dataset of sophisticated multimodal chemical reaction graphics from the literature.
arXiv Detail & Related papers (2025-07-27T11:16:57Z)
ChemActor: Enhancing Automated Extraction of Chemical Synthesis Actions with LLM-Generated Data [53.78763789036172]
We present ChemActor, a fully fine-tuned large language model (LLM) as a chemical executor to convert between unstructured experimental procedures and structured action sequences. This framework integrates a data selection module that selects data based on distribution divergence, with a general-purpose LLM, to generate machine-executable actions from a single molecule input. Experiments on reaction-to-description (R2D) and description-to-action (D2A) tasks demonstrate that ChemActor achieves state-of-the-art performance, outperforming the baseline model by 10%.
arXiv Detail & Related papers (2025-06-30T05:11:19Z)
Interpretable Deep Learning for Polar Mechanistic Reaction Prediction [43.95903801494905]
We introduce PMechRP (Polar Mechanistic Reaction Predictor), a system that trains machine learning models on the PMechDB dataset. We train and compare a range of machine learning models, including transformer-based, graph-based and two-step Siamese architectures. Our best-performing approach was a hybrid model, which combines a 5-ensemble of Chemformer models with a two-step Siamese framework.
arXiv Detail & Related papers (2025-04-22T02:31:23Z)
Learning Chemical Reaction Representation with Reactant-Product Alignment [50.28123475356234]
RAlign is a novel chemical reaction representation learning model for various organic reaction-related tasks. By integrating atomic correspondence between reactants and products, our model discerns the molecular transformations that occur during the reaction. We introduce a reaction-center-aware attention mechanism that enables the model to concentrate on key functional groups.
arXiv Detail & Related papers (2024-11-26T17:41:44Z)
Text-Augmented Multimodal LLMs for Chemical Reaction Condition Recommendation [50.639325453203504]
MM-RCR is a text-augmented multimodal LLM that learns a unified reaction representation from SMILES, reaction graphs, and textual corpus for chemical reaction recommendation (RCR). Our results demonstrate that MM-RCR achieves state-of-the-art performance on two open benchmark datasets.
arXiv Detail & Related papers (2024-07-21T12:27:26Z)
OpenChemIE: An Information Extraction Toolkit For Chemistry Literature [37.23189665773341]
OpenChemIE is a tool for extracting reaction data from chemistry literature. We employ specialized neural models that address a specific task for chemistry information extraction. We meticulously annotate a challenging dataset of reaction schemes with R-groups to evaluate our pipeline as a whole.
arXiv Detail & Related papers (2024-04-01T20:16:21Z)
Contextual Molecule Representation Learning from Chemical Reaction Knowledge [24.501564702095937]
We introduce REMO, a self-supervised learning framework that takes advantage of well-defined atom-combination rules in common chemistry. REMO pre-trains graph/Transformer encoders on 1.7 million known chemical reactions in the literature.
arXiv Detail & Related papers (2024-02-21T12:58:40Z)
An Autonomous Large Language Model Agent for Chemical Literature Data Mining [60.85177362167166]
We introduce an end-to-end AI agent framework capable of high-fidelity extraction from extensive chemical literature. Our framework's efficacy is evaluated using accuracy, recall, and F1 score of reaction condition data.
arXiv Detail & Related papers (2024-02-20T13:21:46Z)
Retrosynthesis prediction enhanced by in-silico reaction data augmentation [66.5643280109899]
We present RetroWISE, a framework that employs a base model inferred from real paired data to perform in-silico reaction generation and augmentation. On three benchmark datasets, RetroWISE achieves the best overall performance against state-of-the-art models.
arXiv Detail & Related papers (2024-01-31T07:40:37Z)
Predictive Chemistry Augmented with Text Retrieval [37.59545092901872]
We introduce TextReact, a novel method that directly augments predictive chemistry with texts retrieved from the literature. TextReact retrieves text descriptions relevant for a given chemical reaction, and then aligns them with the molecular representation of the reaction. We empirically validate the framework on two chemistry tasks: reaction condition recommendation and one-step retrosynthesis.
arXiv Detail & Related papers (2023-12-08T07:40:59Z)
Chemist-X: Large Language Model-empowered Agent for Reaction Condition Recommendation in Chemical Synthesis [57.70772230913099]
Chemist-X automates the reaction condition recommendation (RCR) task in chemical synthesis with retrieval-augmented generation (RAG) technology. Chemist-X interrogates online molecular databases and distills critical data from the latest literature database. Chemist-X considerably reduces chemists' workload and allows them to focus on more fundamental and creative problems.
arXiv Detail & Related papers (2023-11-16T01:21:33Z)
ReactIE: Enhancing Chemical Reaction Extraction with Weak Supervision [27.850325653751078]
Structured chemical reaction information plays a vital role for chemists engaged in laboratory work and advanced endeavors such as computer-aided drug design. Despite the importance of extracting structured reactions from scientific literature, data annotation for this purpose is cost-prohibitive due to the significant labor required from domain experts. We propose ReactIE, which combines two weakly supervised approaches for pre-training. Our method utilizes frequent patterns within the text as linguistic cues to identify specific characteristics of chemical reactions.
arXiv Detail & Related papers (2023-07-04T02:52:30Z)
Rxn Hypergraph: a Hypergraph Attention Model for Chemical Reaction Representation [70.97737157902947]
There is currently no universal and widely adopted method for robustly representing chemical reactions. Here we exploit graph-based representations of molecular structures to develop and test a hypergraph attention neural network approach. We evaluate this hypergraph representation in three experiments using three independent data sets of chemical reactions.
arXiv Detail & Related papers (2022-01-02T12:33:10Z)
