Related papers: Image-to-Graph Transformers for Chemical Structure Recognition

URL: http://arxiv.org/abs/2202.09580v1
Date: Sat, 19 Feb 2022 11:33:54 GMT
Title: Image-to-Graph Transformers for Chemical Structure Recognition
Authors: Sanghyun Yoo, Ohyun Kwon, Hoshik Lee
Abstract summary: We present a deep learning model to extract molecular structures from images. The proposed model is designed to transform the molecular image directly into the corresponding graph. With an end-to-end learning approach, it can fully utilize the many open image-molecule pair datasets available from various sources.
Score: 4.180435324231826
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: For several decades, chemical knowledge has been published in written text, and there have been many attempts to make it accessible, for example, by transforming such natural language text into a structured format. Although the discovered chemical itself, commonly represented as an image, is the most important part, correctly recognizing molecular structures from images in the literature remains a hard problem, because the structures are often abbreviated to reduce complexity and are drawn in many different styles. In this paper, we present a deep learning model to extract molecular structures from images. The proposed model is designed to transform the molecular image directly into the corresponding graph, which makes it capable of handling non-atomic symbols for abbreviations. Also, with an end-to-end learning approach, it can fully utilize the many open image-molecule pair datasets available from various sources, and hence it is more robust to image style variation than other tools. The experimental results show that the proposed model outperforms existing models, with relative improvements of 17.1% and 12.8% on well-known benchmark datasets and on large molecular images that we collected from the literature, respectively.
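
To make the graph output described in the abstract concrete, here is a minimal sketch of a molecular graph whose nodes may be either atoms or non-atomic abbreviation symbols. It is an illustration only, not the authors' model or data format: the `MolGraph` class, the `ELEMENTS` subset, and the `example_graph` helper are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Small illustrative subset of element symbols (an assumption, not from the paper).
ELEMENTS = {"H", "B", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"}


@dataclass
class MolGraph:
    """Hypothetical container for a molecular graph read from an image."""
    nodes: List[str] = field(default_factory=list)  # atom symbols or abbreviations
    edges: List[Tuple[int, int, int]] = field(default_factory=list)  # (i, j, bond order)

    def add_node(self, label: str) -> int:
        """Append a node and return its index."""
        self.nodes.append(label)
        return len(self.nodes) - 1

    def add_bond(self, i: int, j: int, order: int = 1) -> None:
        """Connect two distinct existing nodes with a bond of the given order."""
        n = len(self.nodes)
        if i == j or not (0 <= i < n and 0 <= j < n):
            raise ValueError("bond endpoints must be two distinct existing nodes")
        self.edges.append((min(i, j), max(i, j), order))

    def abbreviations(self) -> List[str]:
        """Return node labels that are not plain element symbols."""
        return [label for label in self.nodes if label not in ELEMENTS]


def example_graph() -> MolGraph:
    """Build Ph-C(=O)-OMe, keeping "Ph" and "OMe" as abbreviation nodes."""
    g = MolGraph()
    ph = g.add_node("Ph")
    c = g.add_node("C")
    o = g.add_node("O")
    ome = g.add_node("OMe")
    g.add_bond(ph, c)
    g.add_bond(c, o, order=2)
    g.add_bond(c, ome)
    return g
```

Because node labels are free-form strings, an abbreviation such as "Ph" can sit in the graph next to ordinary atoms. A purely atom-level output format would first have to expand it into its constituent atoms.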

Related papers

SubGrapher: Visual Fingerprinting of Chemical Structures [46.677062201188015]
SubGrapher is a method for the visual fingerprinting of chemical structure images. Unlike conventional Optical Chemical Structure Recognition (OCSR) models that attempt to reconstruct full molecular graphs, SubGrapher focuses on extracting molecular fingerprints directly from chemical structure images. Our approach is evaluated against state-of-the-art OCSR and fingerprinting methods, demonstrating superior retrieval performance and robustness across diverse molecular depictions.
arXiv Detail & Related papers (2025-04-28T11:45:46Z)
MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild [23.78185449646608]
We present MolParser, a novel end-to-end optical chemical structure recognition method. We use a SMILES encoding rule to annotate MolParser-7M, the largest annotated molecular image dataset. We trained an end-to-end molecular image captioning model, MolParser, using a curriculum learning approach.
arXiv Detail & Related papers (2024-11-17T15:00:09Z)
GraphXForm: Graph transformer for computer-aided molecular design with application to extraction [73.1842164721868]
We present GraphXForm, a decoder-only graph transformer architecture, which is pretrained on existing compounds and then fine-tuned. We evaluate it on two solvent design tasks for liquid-liquid extraction, showing that it outperforms four state-of-the-art molecular design techniques.
arXiv Detail & Related papers (2024-11-03T19:45:15Z)
Data-Efficient Molecular Generation with Hierarchical Textual Inversion [48.816943690420224]
We introduce Hierarchical textual Inversion for Molecular generation (HI-Mol), a novel data-efficient molecular generation method. HI-Mol is inspired by the importance of hierarchical information, e.g., both coarse- and fine-grained features, in understanding the molecule distribution. Compared to the conventional textual inversion method in the image domain using a single-level token embedding, our multi-level token embeddings allow the model to effectively learn the underlying low-shot molecule distribution.
arXiv Detail & Related papers (2024-05-05T08:35:23Z)
MolNexTR: A Generalized Deep Learning Model for Molecular Image Recognition [4.510482519069965]
MolNexTR is a novel image-to-graph deep learning model that fuses the strengths of ConvNext and Vision-TRansformer. It can predict atoms and bonds simultaneously and understand their layout rules. On our test sets, MolNexTR demonstrated superior performance, achieving an accuracy of 81-97%.
arXiv Detail & Related papers (2024-03-06T13:17:41Z)
MolGrapher: Graph-based Visual Recognition of Chemical Structures [50.13749978547401]
We introduce MolGrapher to recognize chemical structures visually. We treat all candidate atoms and bonds as nodes and put them in a graph. We classify atom and bond nodes in the graph with a Graph Neural Network.
arXiv Detail & Related papers (2023-08-23T16:16:11Z)
GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text [25.979382232281786]
We introduce GIT-Mol, a multi-modal large language model that integrates graph, image, and text information. We achieve a 5%-10% accuracy increase in property prediction and a 20.2% boost in molecule generation validity.
arXiv Detail & Related papers (2023-08-14T03:12:29Z)
A Molecular Multimodal Foundation Model Associating Molecule Graphs with Natural Language [63.60376252491507]
We propose a molecular multimodal foundation model which is pretrained from molecular graphs and their semantically related textual data. We believe that our model would have a broad impact on AI-empowered fields across disciplines such as biology, chemistry, materials, environment, and medicine.
arXiv Detail & Related papers (2022-09-12T00:56:57Z)
MolScribe: Robust Molecular Structure Recognition with Image-To-Graph Generation [28.93523736883784]
MolScribe is an image-to-graph model that explicitly predicts atoms and bonds, along with their geometric layouts, to construct the molecular structure. MolScribe significantly outperforms previous models, achieving 76-93% accuracy on public benchmarks.
arXiv Detail & Related papers (2022-05-28T03:03:45Z)
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding [53.170767750244366]
Imagen is a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding. To assess text-to-image models in greater depth, we introduce DrawBench, a comprehensive and challenging benchmark for text-to-image models.
arXiv Detail & Related papers (2022-05-23T17:42:53Z)
Conditional Constrained Graph Variational Autoencoders for Molecule Design [70.59828655929194]
We present the Conditional Constrained Graph Variational Autoencoder (CCGVAE), a model that implements this key idea in a state-of-the-art model. We show improved results on several evaluation metrics on two commonly adopted datasets for molecule generation.
arXiv Detail & Related papers (2020-09-01T21:58:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.