Modeling All-Atom Glycan Structures via Hierarchical Message Passing and Multi-Scale Pre-training
- URL: http://arxiv.org/abs/2506.01376v1
- Date: Mon, 02 Jun 2025 07:08:39 GMT
- Title: Modeling All-Atom Glycan Structures via Hierarchical Message Passing and Multi-Scale Pre-training
- Authors: Minghao Xu, Jiaze Song, Keming Wu, Xiangxin Zhou, Bin Cui, Wentao Zhang,
- Abstract summary: We introduce the GlycanAA model for All-Atom-wise glycan modeling. GlycanAA performs hierarchical message passing to capture interactions ranging from the local atomic level to the global monosaccharide level. We design a multi-scale mask prediction algorithm to endow the model with knowledge of the different levels of dependencies in a glycan.
- Score: 37.76325239977169
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the various properties of glycans with machine learning has shown preliminary promise. However, previous methods mainly focused on modeling the backbone structure of glycans as graphs of monosaccharides (i.e., sugar units), while neglecting the atomic structures underlying each monosaccharide, which are in fact important indicators of glycan properties. We fill this gap by introducing the GlycanAA model for All-Atom-wise Glycan modeling. GlycanAA models a glycan as a heterogeneous graph, with monosaccharide nodes representing its global backbone structure and atom nodes representing its local atomic-level structures. Based on such a graph, GlycanAA performs hierarchical message passing to capture interactions ranging from the local atomic level to the global monosaccharide level. To further enhance model capability, we pre-train GlycanAA on a high-quality unlabeled glycan dataset, deriving the PreGlycanAA model. We design a multi-scale mask prediction algorithm to endow the model with knowledge of the different levels of dependencies in a glycan. Extensive benchmark results show the superiority of GlycanAA over existing glycan encoders and verify the further improvements achieved by PreGlycanAA. We maintain all resources at https://github.com/kasawa1234/GlycanAA
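The abstract's hierarchical scheme (local atom-level message passing, pooling atoms into their parent monosaccharide, then global monosaccharide-level message passing) can be sketched as follows. This is a minimal illustration only: the node names, mean-aggregation update, and single-round structure are assumptions for the sketch, not the paper's actual formulation.

```python
# Minimal sketch of hierarchical message passing on a heterogeneous glycan
# graph: atoms exchange messages locally, atom features are pooled into
# their parent monosaccharide node, and monosaccharides then exchange
# messages globally. Mean aggregation is an illustrative simplification.
from collections import defaultdict


def mean(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def hierarchical_step(atom_feats, mono_feats, atom_edges, atom2mono, mono_edges):
    """One round of hierarchical message passing over the heterogeneous graph."""
    # 1) Local atomic-level message passing along covalent bonds.
    neigh = defaultdict(list)
    for u, v in atom_edges:
        neigh[u].append(atom_feats[v])
        neigh[v].append(atom_feats[u])
    new_atoms = {a: mean([f] + neigh[a]) for a, f in atom_feats.items()}

    # 2) Pool updated atom features into their parent monosaccharide node.
    pooled = defaultdict(list)
    for a, m in atom2mono.items():
        pooled[m].append(new_atoms[a])
    lifted = {m: mean([mono_feats[m]] + pooled[m]) for m in mono_feats}

    # 3) Global monosaccharide-level message passing along glycosidic bonds.
    mneigh = defaultdict(list)
    for u, v in mono_edges:
        mneigh[u].append(lifted[v])
        mneigh[v].append(lifted[u])
    new_monos = {m: mean([f] + mneigh[m]) for m, f in lifted.items()}
    return new_atoms, new_monos


# Toy glycan: two monosaccharides (M0, M1) joined by one glycosidic bond,
# each carrying two atoms with 1-dimensional features.
atoms = {"C1": [1.0], "O1": [0.0], "C2": [1.0], "O2": [0.0]}
monos = {"M0": [0.0], "M1": [2.0]}
a, m = hierarchical_step(
    atoms, monos,
    atom_edges=[("C1", "O1"), ("C2", "O2")],
    atom2mono={"C1": "M0", "O1": "M0", "C2": "M1", "O2": "M1"},
    mono_edges=[("M0", "M1")],
)
```

After one round, information has flowed from individual atoms up to the backbone: both monosaccharide features end up at 2/3, reflecting both their own features and the pooled atomic context.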
Related papers
- Type 1 Diabetes Management using GLIMMER: Glucose Level Indicator Model with Modified Error Rate [6.102406188211489]
We introduce GLIMMER, a machine learning-based model for predicting blood glucose levels. GLIMMER classifies glucose values into normal and abnormal ranges and employs a novel custom loss function. These results represent a 23% improvement in RMSE and a 31% improvement in MAE compared to the best previously reported models.
arXiv Detail & Related papers (2025-02-20T01:26:00Z) - Higher-Order Message Passing for Glycan Representation Learning [0.0]
Graph Neural Networks (GNNs) are deep learning models designed to process and analyze graph-structured data. This work presents a new model architecture based on complexes and higher-order message passing to extract features from glycan structures into a latent space representation. We envision that these improvements will spur further advances in computational glycosciences and reveal the roles of glycans in biology.
arXiv Detail & Related papers (2024-09-20T12:55:43Z) - GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning [35.818061926699336]
GlycanML benchmark consists of diverse types of tasks including glycan taxonomy prediction, glycan immunogenicity prediction, glycosylation type prediction, and protein-glycan interaction prediction.
By concurrently performing eight glycan taxonomy prediction tasks, we introduce the GlycanML-MTL testbed for multi-task learning (MTL) algorithms.
Experimental results show the superiority of modeling glycans with multi-relational GNNs, and suitable MTL methods can further boost model performance.
arXiv Detail & Related papers (2024-05-25T12:35:31Z) - Hi-GMAE: Hierarchical Graph Masked Autoencoders [90.30572554544385]
Hierarchical Graph Masked AutoEncoders (Hi-GMAE)
Hi-GMAE is a novel multi-scale GMAE framework designed to handle the hierarchical structures within graphs.
Our experiments on 15 graph datasets consistently demonstrate that Hi-GMAE outperforms 17 state-of-the-art self-supervised competitors.
arXiv Detail & Related papers (2024-05-17T09:08:37Z) - Blood Glucose Level Prediction: A Graph-based Explainable Method with Federated Learning [1.6317061277457001]
In the UK, approximately 400,000 people with type 1 diabetes rely on insulin delivery due to insufficient pancreatic insulin production.
CGM, tracking BG every 5 minutes, enables effective blood glucose level prediction (BGLP) by considering factors like carbohydrate intake and insulin delivery.
arXiv Detail & Related papers (2023-12-19T19:19:35Z) - Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations [68.32093648671496]
We introduce GODE, which accounts for the dual-level structure inherent in molecules. Molecules possess an intrinsic graph structure and simultaneously function as nodes within a broader molecular knowledge graph. By pre-training two GNNs on different graph structures, GODE effectively fuses molecular structures with their corresponding knowledge graph substructures.
arXiv Detail & Related papers (2023-06-02T15:49:45Z) - Atomic and Subgraph-aware Bilateral Aggregation for Molecular Representation Learning [57.670845619155195]
We introduce a new model for molecular representation learning called the Atomic and Subgraph-aware Bilateral Aggregation (ASBA).
ASBA addresses the limitations of previous atom-wise and subgraph-wise models by incorporating both types of information.
Our method offers a more comprehensive way to learn representations for molecular property prediction and has broad potential in drug and material discovery applications.
arXiv Detail & Related papers (2023-05-22T00:56:00Z) - Learning Graph Models for Retrosynthesis Prediction [90.15523831087269]
Retrosynthesis prediction is a fundamental problem in organic synthesis.
This paper introduces a graph-based approach that capitalizes on the idea that the graph topology of precursor molecules is largely unaltered during a chemical reaction.
Our model achieves a top-1 accuracy of 53.7%, outperforming previous template-free and semi-template-based methods.
arXiv Detail & Related papers (2020-06-12T09:40:42Z) - Global Attention based Graph Convolutional Neural Networks for Improved Materials Property Prediction [8.371766047183739]
We develop a novel model, GATGNN, for predicting inorganic material properties based on graph neural networks.
We show that our method is able to both outperform the previous models' predictions and provide insight into the crystallization of the material.
arXiv Detail & Related papers (2020-03-11T07:43:14Z) - Heterogeneous Graph Transformer [49.675064816860505]
We present the Heterogeneous Graph Transformer (HGT) architecture for modeling Web-scale heterogeneous graphs.
To handle dynamic heterogeneous graphs, we introduce the relative temporal encoding technique into HGT.
To handle Web-scale graph data, we design a heterogeneous mini-batch graph sampling algorithm, HGSampling, for efficient and scalable training.
arXiv Detail & Related papers (2020-03-03T04:49:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.