ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep
Learning
- URL: http://arxiv.org/abs/2002.09914v1
- Date: Sun, 23 Feb 2020 14:30:55 GMT
- Title: ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep
Learning
- Authors: Martijn Oldenhof, Adam Arany, Yves Moreau and Jaak Simm
- Abstract summary: In drug discovery, knowledge of the graph structure of chemical compounds is essential.
A tool to analyze images automatically and convert them into a chemical graph structure would be useful for many applications.
We develop a deep neural network model for optical compound recognition.
- Score: 6.88204255655161
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In drug discovery, knowledge of the graph structure of chemical compounds is
essential. Many thousands of scientific articles in chemistry and
pharmaceutical sciences have investigated chemical compounds, but in some cases
the details of the structure of these chemical compounds are published only as
images. A tool to analyze these images automatically and convert them into a
chemical graph structure would be useful for many applications, such as drug
discovery. A few such tools are available and they are mostly derived from
optical character recognition. However, our evaluation of the performance of
those tools reveals that they often make mistakes in detecting the correct bond
multiplicity and stereochemical information. In addition, errors sometimes even
lead to missing atoms in the resulting graph. In our work, we address these
issues by developing a compound recognition method based on machine learning.
More specifically, we develop a deep neural network model for optical compound
recognition. The deep learning solution presented here consists of a
segmentation model, followed by three classification models that predict atom
locations, bonds and charges. Furthermore, this model not only predicts the
graph structure of the molecule but also produces all information necessary to
relate each component of the resulting graph to the source image. This solution
is scalable and could rapidly process thousands of images. Finally, we
empirically compare the proposed method to a well-established tool and observe
significant error reductions.
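The abstract describes a segmentation step followed by three classifiers (atom locations, bonds, charges) whose outputs form a graph that stays linked to the source image. The following is a minimal Python sketch of how such outputs might be combined into one graph. It is not the authors' implementation: the names `Atom`, `Bond`, `MoleculeGraph`, and `assemble_graph`, as well as the shapes of the prediction inputs, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


# Hypothetical container types; none of these names come from the paper.
@dataclass
class Atom:
    element: str            # e.g. "C", "N", "O"
    charge: int             # formal charge assigned to this atom
    pixel: Tuple[int, int]  # (row, col) position in the source image


@dataclass
class Bond:
    begin: int  # index of the first atom in MoleculeGraph.atoms
    end: int    # index of the second atom in MoleculeGraph.atoms
    order: int  # 1 = single, 2 = double, 3 = triple


@dataclass
class MoleculeGraph:
    atoms: List[Atom] = field(default_factory=list)
    bonds: List[Bond] = field(default_factory=list)


def assemble_graph(
    atom_preds: List[Tuple[str, Tuple[int, int]]],
    bond_preds: List[Tuple[int, int, int]],
    charge_preds: Dict[int, int],
) -> MoleculeGraph:
    """Combine per-stage predictions into one graph.

    atom_preds:   (element, (row, col)) for each detected atom (assumed format)
    bond_preds:   (atom index, atom index, bond order) triples (assumed format)
    charge_preds: atom index -> formal charge; missing atoms default to 0
    """
    graph = MoleculeGraph()
    for index, (element, pixel) in enumerate(atom_preds):
        graph.atoms.append(Atom(element, charge_preds.get(index, 0), pixel))
    for begin, end, order in bond_preds:
        # Reject bonds that point at atoms the atom stage never detected.
        if not (0 <= begin < len(graph.atoms) and 0 <= end < len(graph.atoms)):
            raise ValueError(f"bond ({begin}, {end}) references a missing atom")
        graph.bonds.append(Bond(begin, end, order))
    return graph


# Toy input: a carbon double-bonded to one oxygen and single-bonded to a
# negatively charged oxygen, with made-up pixel coordinates.
example = assemble_graph(
    atom_preds=[("C", (10, 10)), ("O", (10, 30)), ("O", (30, 10))],
    bond_preds=[(0, 1, 2), (0, 2, 1)],
    charge_preds={2: -1},
)
```

Because every atom keeps its pixel coordinates, each node of the assembled graph can be traced back to a location in the input image, which is the property the abstract highlights.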
Related papers
- MolGrapher: Graph-based Visual Recognition of Chemical Structures [50.13749978547401]
We introduce MolGrapher to recognize chemical structures visually.
We treat all candidate atoms and bonds as nodes and put them in a graph.
We classify atom and bond nodes in the graph with a Graph Neural Network.
arXiv Detail & Related papers (2023-08-23T16:16:11Z)
- Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules.
By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures.
When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, registering an average ROC-AUC uplift of 13.8% for classification tasks and an average RMSE/MAE enhancement of 35.1% for regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z)
- Atomic and Subgraph-aware Bilateral Aggregation for Molecular Representation Learning [57.670845619155195]
We introduce a new model for molecular representation learning called the Atomic and Subgraph-aware Bilateral Aggregation (ASBA).
ASBA addresses the limitations of previous atom-wise and subgraph-wise models by incorporating both types of information.
Our method offers a more comprehensive way to learn representations for molecular property prediction and has broad potential in drug and material discovery applications.
arXiv Detail & Related papers (2023-05-22T00:56:00Z)
- Conditional Graph Information Bottleneck for Molecular Relational Learning [9.56625683182106]
We propose a novel relational learning framework, CGIB, that predicts the interaction behavior between a pair of graphs by detecting core subgraphs therein.
Our proposed method mimics the nature of chemical reactions, i.e., the core substructure of a molecule varies depending on which other molecule it interacts with.
arXiv Detail & Related papers (2023-04-29T01:17:43Z)
- Graph-based Molecular Representation Learning [59.06193431883431]
Molecular representation learning (MRL) is a key step to build the connection between machine learning and chemical science.
Recently, MRL has achieved considerable progress, especially in methods based on deep molecular graph learning.
arXiv Detail & Related papers (2022-07-08T17:43:20Z)
- Rxn Hypergraph: a Hypergraph Attention Model for Chemical Reaction Representation [70.97737157902947]
There is currently no universal and widely adopted method for robustly representing chemical reactions.
Here we exploit graph-based representations of molecular structures to develop and test a hypergraph attention neural network approach.
We evaluate this hypergraph representation in three experiments using three independent data sets of chemical reactions.
arXiv Detail & Related papers (2022-01-02T12:33:10Z)
- Molecular Graph Generation via Geometric Scattering [7.796917261490019]
Graph neural networks (GNNs) have been used extensively for addressing problems in drug design and discovery.
We propose a representation-first approach to molecular graph generation.
We show that our architecture learns meaningful representations of drug datasets and provides a platform for goal-directed drug synthesis.
arXiv Detail & Related papers (2021-10-12T18:00:23Z)
- Advanced Graph and Sequence Neural Networks for Molecular Property Prediction and Drug Discovery [53.00288162642151]
We develop MoleculeKit, a suite of comprehensive machine learning tools spanning different computational models and molecular representations.
Built on these representations, MoleculeKit includes both deep learning and traditional machine learning methods for graph and sequence data.
Results on both online and offline antibiotics discovery and molecular property prediction tasks show that MoleculeKit achieves consistent improvements over prior methods.
arXiv Detail & Related papers (2020-12-02T02:09:31Z)
- BERT Learns (and Teaches) Chemistry [5.653789128055942]
We propose the use of attention to study functional groups and other property-impacting molecular substructures from a data-driven perspective.
We then apply the representations of functional groups and atoms learned by the model to tackle problems of toxicity, solubility, drug-likeness, and accessibility.
arXiv Detail & Related papers (2020-07-11T00:23:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.