Graph Data Modeling: Molecules, Proteins, & Chemical Processes
- URL: http://arxiv.org/abs/2508.19356v3
- Date: Tue, 23 Sep 2025 16:53:18 GMT
- Title: Graph Data Modeling: Molecules, Proteins, & Chemical Processes
- Authors: José Manuel Barraza-Chavez, Rana A. Barghout, Ricardo Almada-Monter, Adrian Jinich, Radhakrishnan Mahadevan, Benjamin Sanchez-Lengeling,
- Abstract summary: Graphs are central to the chemical sciences, providing a natural language to describe molecules, proteins, reactions, and industrial processes.<n>This primer introduces graphs as mathematical objects in chemistry and shows how learning algorithms can operate on them.<n>We outline the foundations of graph design, key prediction tasks, representative examples across chemical sciences, and the role of machine learning in graph-based modeling.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Graphs are central to the chemical sciences, providing a natural language to describe molecules, proteins, reactions, and industrial processes. They capture interactions and structures that underpin materials, biology, and medicine. This primer, Graph Data Modeling: Molecules, Proteins, & Chemical Processes, introduces graphs as mathematical objects in chemistry and shows how learning algorithms (particularly graph neural networks) can operate on them. We outline the foundations of graph design, key prediction tasks, representative examples across chemical sciences, and the role of machine learning in graph-based modeling. Together, these concepts prepare readers to apply graph methods to the next generation of chemical discovery.
Related papers
- MolGrapher: Graph-based Visual Recognition of Chemical Structures [50.13749978547401]
We introduce MolGrapher to recognize chemical structures visually.
We treat all candidate atoms and bonds as nodes and put them in a graph.
We classify atom and bond nodes in the graph with a Graph Neural Network.
arXiv Detail & Related papers (2023-08-23T16:16:11Z) - Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [68.32093648671496]
We introduce GODE, which accounts for the dual-level structure inherent in molecules.<n> Molecules possess an intrinsic graph structure and simultaneously function as nodes within a broader molecular knowledge graph.<n>By pre-training two GNNs on different graph structures, GODE effectively fuses molecular structures with their corresponding knowledge graph substructures.
arXiv Detail & Related papers (2023-06-02T15:49:45Z) - Conditional Graph Information Bottleneck for Molecular Relational
Learning [9.56625683182106]
We propose a novel relational learning framework, CGIB, that predicts the interaction behavior between a pair of graphs by detecting core subgraphs therein.
Our proposed method mimics the nature of chemical reactions, i.e., the core substructure of a molecule varies depending on which other molecule it interacts with.
arXiv Detail & Related papers (2023-04-29T01:17:43Z) - Graph neural networks for the prediction of molecular structure-property
relationships [59.11160990637615]
Graph neural networks (GNNs) are a novel machine learning method that directly work on the molecular graph.
GNNs allow to learn properties in an end-to-end fashion, thereby avoiding the need for informative descriptors.
We describe the fundamentals of GNNs and demonstrate the application of GNNs via two examples for molecular property prediction.
arXiv Detail & Related papers (2022-07-25T11:30:44Z) - Graph-based Molecular Representation Learning [59.06193431883431]
Molecular representation learning (MRL) is a key step to build the connection between machine learning and chemical science.
Recently, MRL has achieved considerable progress, especially in methods based on deep molecular graph learning.
arXiv Detail & Related papers (2022-07-08T17:43:20Z) - MolScribe: Robust Molecular Structure Recognition with Image-To-Graph
Generation [28.93523736883784]
MolScribe is an image-to-graph model that explicitly predicts atoms and bonds, along with their geometric layouts, to construct the molecular structure.
MolScribe significantly outperforms previous models, achieving 76-93% accuracy on public benchmarks.
arXiv Detail & Related papers (2022-05-28T03:03:45Z) - Molecular Contrastive Learning with Chemical Element Knowledge Graph [16.136921143416927]
Molecular representation learning contributes to multiple downstream tasks such as molecular property prediction and drug design.
We construct a Chemical Element Knowledge Graph (KG) to summarize microscopic associations between elements.
The first module, knowledge-guided graph augmentation, augments the original molecular graph based on the Chemical Element KG.
The second module, knowledge-aware graph representation, extracts molecular representations with a common graph encoder for the original molecular graph and a Knowledge-aware Message Passing Neural Network (KMPNN) to encode complex information in the augmented molecular graph.
arXiv Detail & Related papers (2021-12-01T15:04:39Z) - Advanced Graph and Sequence Neural Networks for Molecular Property
Prediction and Drug Discovery [53.00288162642151]
We develop MoleculeKit, a suite of comprehensive machine learning tools spanning different computational models and molecular representations.
Built on these representations, MoleculeKit includes both deep learning and traditional machine learning methods for graph and sequence data.
Results on both online and offline antibiotics discovery and molecular property prediction tasks show that MoleculeKit achieves consistent improvements over prior methods.
arXiv Detail & Related papers (2020-12-02T02:09:31Z) - GraphOpt: Learning Optimization Models of Graph Formation [72.75384705298303]
We propose an end-to-end framework that learns an implicit model of graph structure formation and discovers an underlying optimization mechanism.
The learned objective can serve as an explanation for the observed graph properties, thereby lending itself to transfer across different graphs within a domain.
GraphOpt poses link formation in graphs as a sequential decision-making process and solves it using maximum entropy inverse reinforcement learning algorithm.
arXiv Detail & Related papers (2020-07-07T16:51:39Z) - MoFlow: An Invertible Flow Model for Generating Molecular Graphs [19.829612234339578]
MoFlow is a flow-based graph generative model to learn invertible mappings between molecular graphs and latent representations.
Our model has merits including exact and tractable likelihood training, efficient one-pass embedding and generation, chemical validity guarantees, 100% reconstruction of training data, and good generalization ability.
arXiv Detail & Related papers (2020-06-17T20:14:19Z) - Learning Graph Models for Retrosynthesis Prediction [90.15523831087269]
Retrosynthesis prediction is a fundamental problem in organic synthesis.
This paper introduces a graph-based approach that capitalizes on the idea that the graph topology of precursor molecules is largely unaltered during a chemical reaction.
Our model achieves a top-1 accuracy of $53.7%$, outperforming previous template-free and semi-template-based methods.
arXiv Detail & Related papers (2020-06-12T09:40:42Z) - ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep
Learning [6.88204255655161]
In drug discovery, knowledge of the graph structure of chemical compounds is essential.
A tool to analyze images automatically and convert them into a chemical graph structure would be useful for many applications.
We develop a deep neural network model for optical compound recognition.
arXiv Detail & Related papers (2020-02-23T14:30:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.