MolNexTR: A Generalized Deep Learning Model for Molecular Image Recognition
- URL: http://arxiv.org/abs/2403.03691v3
- Date: Wed, 28 Aug 2024 03:57:26 GMT
- Title: MolNexTR: A Generalized Deep Learning Model for Molecular Image Recognition
- Authors: Yufan Chen, Ching Ting Leung, Yong Huang, Jianwei Sun, Hao Chen, Hanyu Gao
- Abstract summary: MolNexTR is a novel image-to-graph deep learning model that collaborates to fuse the strengths of ConvNext and Vision-TRansformer.
It can predict atoms and bonds simultaneously and understand their layout rules.
In our test sets, MolNexTR has demonstrated superior performance, achieving an accuracy rate of 81-97%.
- Score: 4.510482519069965
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the field of chemical structure recognition, the task of converting molecular images into machine-readable data formats such as SMILES strings stands as a significant challenge, primarily due to the varied drawing styles and conventions prevalent in chemical literature. To bridge this gap, we propose MolNexTR, a novel image-to-graph deep learning model that collaborates to fuse the strengths of ConvNext, a powerful Convolutional Neural Network variant, and Vision-TRansformer. This integration facilitates a more detailed extraction of both local and global features from molecular images. MolNexTR can predict atoms and bonds simultaneously and understand their layout rules. It also excels at flexibly integrating symbolic chemistry principles to discern chirality and decipher abbreviated structures. We further incorporate a series of advanced algorithms, including an improved data augmentation module, an image contamination module, and a post-processing module for getting the final SMILES output. These modules cooperate to enhance the model's robustness to diverse styles of molecular images found in real literature. In our test sets, MolNexTR has demonstrated superior performance, achieving an accuracy rate of 81-97%, marking a significant advancement in the domain of molecular structure recognition.
Related papers
- MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild [23.78185449646608]
We present MolParser, a novel end-to-end optical chemical structure recognition method.
We use a SMILES encoding rule to annotate MolParser-7M, the largest annotated molecular image dataset.
We trained an end-to-end molecular image captioning model, MolParser, using a curriculum learning approach.
arXiv Detail & Related papers (2024-11-17T15:00:09Z) - GraphXForm: Graph transformer for computer-aided molecular design with application to extraction [73.1842164721868]
We present GraphXForm, a decoder-only graph transformer architecture, which is pretrained on existing compounds and then fine-tuned.
We evaluate it on two solvent design tasks for liquid-liquid extraction, showing that it outperforms four state-of-the-art molecular design techniques.
arXiv Detail & Related papers (2024-11-03T19:45:15Z) - Pre-trained Molecular Language Models with Random Functional Group Masking [54.900360309677794]
We propose a SMILES-based Molecular Language Model, which randomly masks SMILES subsequences corresponding to specific molecular atoms.
This technique aims to compel the model to better infer molecular structures and properties, thus enhancing its predictive capabilities.
arXiv Detail & Related papers (2024-11-03T01:56:15Z) - FARM: Functional Group-Aware Representations for Small Molecules [55.281754551202326]
We introduce Functional Group-Aware Representations for Small Molecules (FARM).
FARM is a foundation model designed to bridge the gap between SMILES, natural language, and molecular graphs.
We rigorously evaluate FARM on the MoleculeNet dataset, where it achieves state-of-the-art performance on 10 out of 12 tasks.
arXiv Detail & Related papers (2024-10-02T23:04:58Z) - Data-Efficient Molecular Generation with Hierarchical Textual Inversion [48.816943690420224]
We introduce Hierarchical textual Inversion for Molecular generation (HI-Mol), a novel data-efficient molecular generation method.
HI-Mol is inspired by the importance of hierarchical information, e.g., both coarse- and fine-grained features, in understanding the molecule distribution.
Compared to the conventional textual inversion method in the image domain using a single-level token embedding, our multi-level token embeddings allow the model to effectively learn the underlying low-shot molecule distribution.
arXiv Detail & Related papers (2024-05-05T08:35:23Z) - MultiModal-Learning for Predicting Molecular Properties: A Framework Based on Image and Graph Structures [2.5563339057415218]
MolIG is a novel MultiModaL molecular pre-training framework for predicting molecular properties based on Image and Graph structures.
It amalgamates the strengths of both molecular representation forms.
It exhibits enhanced performance in downstream tasks pertaining to molecular property prediction within benchmark groups.
arXiv Detail & Related papers (2023-11-28T10:28:35Z) - A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language [63.60376252491507]
We propose a molecular multimodal foundation model which is pretrained from molecular graphs and their semantically related textual data.
We believe that our model would have a broad impact on AI-empowered fields across disciplines such as biology, chemistry, materials, environment, and medicine.
arXiv Detail & Related papers (2022-09-12T00:56:57Z) - MolScribe: Robust Molecular Structure Recognition with Image-To-Graph Generation [28.93523736883784]
MolScribe is an image-to-graph model that explicitly predicts atoms and bonds, along with their geometric layouts, to construct the molecular structure.
MolScribe significantly outperforms previous models, achieving 76-93% accuracy on public benchmarks.
arXiv Detail & Related papers (2022-05-28T03:03:45Z) - Image-to-Graph Transformers for Chemical Structure Recognition [4.180435324231826]
We present a deep learning model to extract molecular structures from images.
The proposed model is designed to transform the molecular image directly into the corresponding graph.
Through an end-to-end learning approach, it can fully utilize the many open image-molecule pair datasets available from various sources.
arXiv Detail & Related papers (2022-02-19T11:33:54Z) - Improved Conditional Flow Models for Molecule to Image Synthesis [37.886816307827196]
Mol2Image is a flow-based generative model for molecule to cell image synthesis.
To generate cell features at different resolutions and scale to high-resolution images, we develop a novel multi-scale flow architecture.
To maximize the mutual information between the generated images and the molecular interventions, we devise a training strategy based on contrastive learning.
arXiv Detail & Related papers (2020-06-15T16:39:50Z) - Multi-View Graph Neural Networks for Molecular Property Prediction [67.54644592806876]
We present Multi-View Graph Neural Network (MV-GNN), a multi-view message passing architecture.
In MV-GNN, we introduce a shared self-attentive readout component and disagreement loss to stabilize the training process.
We further boost the expressive power of MV-GNN by proposing a cross-dependent message passing scheme.
arXiv Detail & Related papers (2020-05-17T04:46:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.