GastroDL-Fusion: A Dual-Modal Deep Learning Framework Integrating Protein-Ligand Complexes and Gene Sequences for Gastrointestinal Disease Drug Discovery
- URL: http://arxiv.org/abs/2511.05726v1
- Date: Fri, 07 Nov 2025 21:32:58 GMT
- Title: GastroDL-Fusion: A Dual-Modal Deep Learning Framework Integrating Protein-Ligand Complexes and Gene Sequences for Gastrointestinal Disease Drug Discovery
- Authors: Ziyang Gao, Annie Cheung, Yihao Ou
- Abstract summary: GastroDL-Fusion is a dual-modal deep learning framework that integrates protein-ligand complex data with disease-associated gene sequence information. We evaluate the model on benchmark datasets of GI disease-related targets. Results confirm that incorporating both structural and genetic features yields more accurate predictions of binding affinities.
- Score: 2.1880525779004563
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate prediction of protein-ligand binding affinity plays a pivotal role in accelerating the discovery of novel drugs and vaccines, particularly for gastrointestinal (GI) diseases such as gastric ulcers, Crohn's disease, and ulcerative colitis. Traditional computational models often rely on structural information alone and thus fail to capture the genetic determinants that influence disease mechanisms and therapeutic responses. To address this gap, we propose GastroDL-Fusion, a dual-modal deep learning framework that integrates protein-ligand complex data with disease-associated gene sequence information for drug and vaccine development. In our approach, protein-ligand complexes are represented as molecular graphs and modeled using a Graph Isomorphism Network (GIN), while gene sequences are encoded into biologically meaningful embeddings via a pre-trained Transformer (ProtBERT/ESM). These complementary modalities are fused through a multi-layer perceptron to enable robust cross-modal interaction learning. We evaluate the model on benchmark datasets of GI disease-related targets, demonstrating that GastroDL-Fusion significantly improves predictive performance over conventional methods. Specifically, the model achieves a mean absolute error (MAE) of 1.12 and a root mean square error (RMSE) of 1.75, outperforming CNN, BiLSTM, GIN, and Transformer-only baselines. These results confirm that incorporating both structural and genetic features yields more accurate predictions of binding affinities, providing a reliable computational tool for accelerating the design of targeted therapies and vaccines in the context of gastrointestinal diseases.
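The pipeline described in the abstract (graph encoding of the complex, a sequence embedding, fusion, then MAE/RMSE evaluation) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: every function name, the toy graph, the weights, and the sequence vector are hypothetical. The learned MLP inside each GIN layer is replaced by a plain ReLU, a single linear layer stands in for the fusion MLP, and `seq_vec` is a placeholder for an embedding that would come from a pre-trained Transformer such as ProtBERT or ESM.

```python
import math

def gin_layer(features, adjacency, eps=0.0):
    """One GIN-style aggregation: (1 + eps) * h_v plus the sum of neighbour features.

    Assumption: a ReLU stands in for the learned MLP that GIN applies
    after aggregation, to keep the sketch free of dependencies.
    """
    updated = []
    for v, h_v in enumerate(features):
        agg = [(1.0 + eps) * x for x in h_v]
        for u in adjacency[v]:
            agg = [a + b for a, b in zip(agg, features[u])]
        updated.append([max(0.0, a) for a in agg])
    return updated

def sum_readout(features):
    """Sum-pool node features into a single graph-level vector."""
    return [sum(column) for column in zip(*features)]

def fuse_and_predict(graph_vec, seq_vec, weights, bias):
    """Concatenate both modality vectors and apply one linear layer.

    Assumption: the paper fuses with a multi-layer perceptron; a single
    linear layer is used here only for illustration.
    """
    fused = graph_vec + seq_vec  # list concatenation
    return sum(w * x for w, x in zip(weights, fused)) + bias

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical toy complex: three atoms in a chain, two features per atom.
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = {0: [1], 1: [0, 2], 2: [1]}

graph_vec = sum_readout(gin_layer(node_features, adjacency))
seq_vec = [0.5, -0.5]  # placeholder for a pre-trained sequence embedding
prediction = fuse_and_predict(
    graph_vec, seq_vec, weights=[0.1, 0.2, 0.3, 0.4], bias=1.0
)
```

Because RMSE squares each residual before averaging, it penalizes large errors more heavily than MAE, which is why the two metrics are usually reported together.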
Related papers
- Graph Attention Based Prioritization of Disease Responsible Genes from Multimodal Alzheimer's Network [20.37811669228711]
Prioritizing disease-associated genes is central to understanding complex disorders such as Alzheimer's disease.
We propose NETRA, a multimodal graph transformer framework that replaces centrality metrics with attention-driven relevance scoring.
A graph transformer assigns NETRA scores that quantify gene relevance in a disease-specific and context-aware manner.
arXiv Detail & Related papers (2026-03-01T06:46:18Z)
- MethConvTransformer: A Deep Learning Framework for Cross-Tissue Alzheimer's Disease Detection [4.931890971425293]
Alzheimer's disease (AD) is a multifactorial neurodegenerative disorder characterized by progressive cognitive decline and widespread dysregulation in the brain.
MethConvTransformer is a transformer-based deep learning framework that integrates DNA methylation profiles from both brain and peripheral tissues.
arXiv Detail & Related papers (2026-01-01T00:18:33Z)
- Interpretable Perturbation Modeling Through Biomedical Knowledge Graphs [2.9275990558029075]
Multimodal embeddings are integrated into biomedical knowledge graphs.
We train a graph attention network to learn the delta expression profile of landmark genes for a given drug-cell pair.
Our framework provides a path toward mechanistic drug modeling.
arXiv Detail & Related papers (2025-12-24T04:42:25Z)
- R-GenIMA: Integrating Neuroimaging and Genetics with Interpretable Multimodal AI for Alzheimer's Disease Progression [63.97617759805451]
Early detection of Alzheimer's disease requires models capable of integrating macro-scale neuroanatomical alterations with micro-scale genetic susceptibility.
We introduce R-GenIMA, an interpretable multimodal large language model that couples a novel ROI-wise vision transformer with genetic prompting.
R-GenIMA achieves state-of-the-art performance in four-way classification across normal cognition, subjective memory concerns, mild cognitive impairment, and AD.
arXiv Detail & Related papers (2025-12-22T02:54:10Z)
- Modeling Dabrafenib Response Using Multi-Omics Modality Fusion and Protein Network Embeddings Based on Graph Convolutional Networks [0.0]
Cancer cell response to targeted therapy arises from complex molecular interactions, making single-omics data insufficient for accurate prediction.
This study develops a model to predict Dabrafenib sensitivity by integrating multiple omics layers (genomics, transcriptomics, epigenomics, metabolomics) with protein network embeddings generated using Graph Convolutional Networks (GCN).
Results show that attention-guided multi-omics fusion combined with GCN improves drug response prediction and reveals complementary molecular determinants of Dabrafenib sensitivity.
arXiv Detail & Related papers (2025-12-13T02:00:56Z)
- A Machine Learning Framework for Pathway-Driven Therapeutic Target Discovery in Metabolic Disorders [1.41678086736482]
This study introduces a novel machine learning (ML) framework that integrates predictive modeling with gene-agnostic pathway mapping to identify high-risk individuals.
Using the Pima Indian dataset, logistic regression and t-tests were applied to identify key predictors of T2DM, yielding an overall model accuracy of 78.43%.
arXiv Detail & Related papers (2025-09-14T19:29:52Z)
- Unlasting: Unpaired Single-Cell Multi-Perturbation Estimation by Dual Conditional Diffusion Implicit Bridges [68.98973318553983]
We propose a framework based on Dual Diffusion Implicit Bridges (DDIB) to learn the mapping between different data distributions.
We integrate gene regulatory network (GRN) information to propagate perturbation signals in a biologically meaningful way.
We also incorporate a masking mechanism to predict silent genes, improving the quality of generated profiles.
arXiv Detail & Related papers (2025-06-26T09:05:38Z)
- KEPLA: A Knowledge-Enhanced Deep Learning Framework for Accurate Protein-Ligand Binding Affinity Prediction [60.23701115249195]
KEPLA is a novel deep learning framework that integrates prior knowledge from Gene Ontology and ligand properties to enhance prediction performance.
Experiments on two benchmark datasets demonstrate that KEPLA consistently outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2025-06-16T08:02:42Z)
- Interpretable Graph Kolmogorov-Arnold Networks for Multi-Cancer Classification and Biomarker Identification using Multi-Omics Data [36.92842246372894]
Multi-Omics Graph Kolmogorov-Arnold Network (MOGKAN) is a deep learning framework that utilizes messenger-RNA, micro-RNA sequences, and DNA methylation samples.
By integrating multi-omics data with graph-based deep learning, our proposed approach demonstrates robust predictive performance and interpretability.
arXiv Detail & Related papers (2025-03-29T02:14:05Z)
- scGSDR: Harnessing Gene Semantics for Single-Cell Pharmacological Profiling [5.831554646284266]
scGSDR is a model that integrates two computational pipelines grounded in the knowledge of cellular states and gene signaling pathways.
scGSDR enhances predictive performance by incorporating gene semantics and employs an interpretability module.
The model's application has extended from single-drug predictions to scenarios involving drug combinations.
arXiv Detail & Related papers (2025-02-02T15:43:20Z)
- Comprehensive Metapath-based Heterogeneous Graph Transformer for Gene-Disease Association Prediction [19.803593399456823]
We propose the COmprehensive MEtapath-based heterogeneous graph Transformer (COMET) for predicting gene-disease associations.
Our method demonstrates superior robustness compared to state-of-the-art approaches.
arXiv Detail & Related papers (2025-01-14T09:41:18Z)
- Explainable AI model reveals disease-related mechanisms in single-cell RNA-seq data [2.975735171548829]
Neurodegenerative diseases (NDDs) are complex and lack effective treatment due to their poorly understood mechanisms.
In this work, we implement a method for identifying disease-related genes and the mechanistic explanation of disease progression based on an NN model combined with SHAP.
Our results show that DGE and SHAP approaches offer both common and differential sets of altered genes and pathways, reinforcing the usefulness of XAI methods for a broader perspective of disease.
arXiv Detail & Related papers (2025-01-07T16:35:29Z)
- Gene-Metabolite Association Prediction with Interactive Knowledge Transfer Enhanced Graph for Metabolite Production [49.814615043389864]
We propose a new task, Gene-Metabolite Association Prediction based on metabolic graphs.
We present the first benchmark containing 2474 metabolites and 1947 genes of two commonly used microorganisms.
Our proposed methodology outperforms baselines by up to 12.3% across various link prediction frameworks.
arXiv Detail & Related papers (2024-10-24T06:54:27Z)
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but under-represent individuals of non-European descent.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- Efficiently Predicting Protein Stability Changes Upon Single-point Mutation with Large Language Models [51.57843608615827]
The ability to precisely predict protein thermostability is pivotal for various subfields and applications in biochemistry.
We introduce an ESM-assisted efficient approach that integrates protein sequence and structural features to predict thermostability changes in proteins upon single-point mutations.
arXiv Detail & Related papers (2023-12-07T03:25:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.