Interpretable Perturbation Modeling Through Biomedical Knowledge Graphs
- URL: http://arxiv.org/abs/2512.22251v2
- Date: Wed, 31 Dec 2025 17:30:56 GMT
- Title: Interpretable Perturbation Modeling Through Biomedical Knowledge Graphs
- Authors: Pascal Passigan, Kevin Zhu, Angelina Ning,
- Abstract summary: Multimodal embeddings are integrated into biomedical knowledge graphs. We train a graph attention network to learn the delta expression profile of landmark genes for a given drug-cell pair. Our framework provides a path toward mechanistic drug modeling.
- Score: 2.9275990558029075
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding how small molecules perturb gene expression is essential for uncovering drug mechanisms, predicting off-target effects, and identifying repurposing opportunities. While prior deep learning frameworks have integrated multimodal embeddings into biomedical knowledge graphs (BKGs) and further improved these representations through graph neural network message-passing paradigms, these models have been applied to tasks such as link prediction and binary drug-disease association, rather than the task of gene perturbation, which may unveil more about mechanistic transcriptomic effects. To address this gap, we construct a merged biomedical graph that integrates (i) PrimeKG++, an augmentation of PrimeKG containing semantically rich embeddings for nodes, with (ii) LINCS L1000 drug and cell line nodes, initialized with multimodal embeddings from foundation models such as MolFormerXL and BioBERT. Using this heterogeneous graph, we train a graph attention network (GAT) with a downstream prediction head that learns the delta expression profile across the 978 landmark genes for a given drug-cell pair. Our results show that, under the scaffold and random splits, our framework outperforms MLP baselines on differentially expressed genes (DEGs); these baselines predict the delta expression from a concatenated embedding of drug features, target features, and baseline cell expression. Ablation experiments with edge shuffling and node feature randomization further demonstrate that the edges provided by biomedical KGs enhance perturbation-level prediction. More broadly, our framework provides a path toward mechanistic drug modeling: moving beyond binary drug-disease association tasks to granular transcriptional effects of therapeutic intervention.
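The edge-shuffling ablation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the edges are shuffled, so the version below assumes one common choice: the destination endpoints are permuted across edges, which keeps the edge count and every node's degree while breaking the original source-target pairing. The function name `shuffle_edges` and the toy node names are hypothetical and are not taken from the paper.

```python
import random


def shuffle_edges(edges, seed=0):
    """Return a copy of `edges` with destination nodes permuted across edges.

    Assumption: this is only one plausible form of an edge-shuffling
    ablation; the paper's exact procedure is not given in the abstract.
    """
    rng = random.Random(seed)
    sources = [src for src, _ in edges]
    targets = [dst for _, dst in edges]
    rng.shuffle(targets)  # break the original source-target pairing
    return list(zip(sources, targets))


# Toy heterogeneous edge list (hypothetical node names).
edges = [
    ("drug:A", "gene:TP53"),
    ("drug:A", "gene:EGFR"),
    ("drug:B", "gene:MYC"),
    ("cell:MCF7", "gene:ESR1"),
]
shuffled = shuffle_edges(edges, seed=42)
```

A model trained on `shuffled` sees the same nodes and the same number of edges but no meaningful connectivity, so a drop in accuracy relative to the original graph would suggest that the real edges carry predictive signal.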
Related papers
- MetagenBERT: a Transformer-based Architecture using Foundational genomic Large Language Models for novel Metagenome Representation [4.470992949474734]
We present MetagenBERT, a framework that produces end-to-end metagenome embeddings directly from raw DNA sequences without taxonomic or functional annotations. We evaluate this approach on five benchmark gut microbiome datasets (Cirrhosis, T2D, Obesity, IBD, CRC). We additionally introduce MetagenBERT Glob Mcardis, a cross-cohort variant trained on the large, phenotypically diverse MetaCardis cohort and transferred to other datasets, retaining predictive signal including for unseen phenotypes.
arXiv Detail & Related papers (2026-01-05T19:36:36Z)
- MOTGNN: Interpretable Graph Neural Networks for Multi-Omics Disease Classification [8.939868953031976]
We propose Multi-Omics integration with Tree-generated Graph Neural Network (MOTGNN), a novel and interpretable framework for binary disease classification. MOTGNN employs eXtreme Gradient Boosting (XGBoost) to perform omics-specific supervised graph construction, followed by modality-specific Graph Neural Networks (GNNs) for hierarchical representation learning, and a deep feedforward network for cross-omics integration.
arXiv Detail & Related papers (2025-08-10T19:35:53Z)
- Towards Interpretable Drug-Drug Interaction Prediction: A Graph-Based Approach with Molecular and Network-Level Explanations [3.6099926707292793]
Drug-drug interactions (DDIs) represent a critical challenge in pharmacology, often leading to adverse drug reactions with significant implications for patient safety and healthcare outcomes. We propose MolecBioNet, a novel graph-based framework that integrates molecular and biomedical knowledge for robust and interpretable DDI prediction.
arXiv Detail & Related papers (2025-07-12T07:43:19Z)
- Unlasting: Unpaired Single-Cell Multi-Perturbation Estimation by Dual Conditional Diffusion Implicit Bridges [68.98973318553983]
We propose a framework based on Dual Diffusion Implicit Bridges (DDIB) to learn the mapping between different data distributions. We integrate gene regulatory network (GRN) information to propagate perturbation signals in a biologically meaningful way. We also incorporate a masking mechanism to predict silent genes, improving the quality of generated profiles.
arXiv Detail & Related papers (2025-06-26T09:05:38Z)
- GRAPE: Heterogeneous Graph Representation Learning for Genetic Perturbation with Coding and Non-Coding Biotype [51.58774936662233]
Building gene regulatory networks (GRNs) is essential to understand and predict the effects of genetic perturbations. In this work, we leverage a pre-trained large language model and a DNA sequence model to extract features from gene descriptions and DNA sequence data. We introduce gene biotype information for the first time in genetic perturbation, simulating the distinct roles of genes with different biotypes in regulating cellular processes.
arXiv Detail & Related papers (2025-05-06T03:35:24Z)
- Heterogeneous network drug-target interaction prediction model based on graph wavelet transform and multi-level contrastive learning [8.154286666697312]
This study proposes a heterogeneous network drug-target interaction prediction framework. It integrates graph neural networks and multi-scale signal processing to construct a model with both efficient prediction and multi-level interpretability. Experimental results show that the framework achieves excellent prediction performance on all datasets.
arXiv Detail & Related papers (2025-04-27T09:29:50Z)
- GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present GENERator, a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters. Trained on an expansive dataset comprising 386B bp of DNA, the GENERator demonstrates state-of-the-art performance across both established and newly proposed benchmarks. It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of enhancer sequences with specific activity profiles.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
- Gene-Metabolite Association Prediction with Interactive Knowledge Transfer Enhanced Graph for Metabolite Production [49.814615043389864]
We propose a new task, Gene-Metabolite Association Prediction based on metabolic graphs.
We present the first benchmark containing 2474 metabolites and 1947 genes of two commonly used microorganisms.
Our proposed methodology outperforms baselines by up to 12.3% across various link prediction frameworks.
arXiv Detail & Related papers (2024-10-24T06:54:27Z)
- Learning to Denoise Biomedical Knowledge Graph for Robust Molecular Interaction Prediction [50.7901190642594]
We propose BioKDN (Biomedical Knowledge Graph Denoising Network) for robust molecular interaction prediction.
BioKDN refines the reliable structure of local subgraphs by denoising noisy links in a learnable manner.
It maintains consistent and robust semantics by smoothing relations around the target interaction.
arXiv Detail & Related papers (2023-12-09T07:08:00Z)
- Tertiary Lymphoid Structures Generation through Graph-based Diffusion [54.37503714313661]
In this work, we leverage state-of-the-art graph-based diffusion models to generate biologically meaningful cell-graphs.
We show that the adopted graph diffusion model is able to accurately learn the distribution of cells in terms of their tertiary lymphoid structures (TLS) content.
arXiv Detail & Related papers (2023-10-10T14:37:17Z)
- Hierarchical Graph Representation Learning for the Prediction of Drug-Target Binding Affinity [7.023929372010717]
We propose a novel hierarchical graph representation learning model for the drug-target binding affinity prediction, namely HGRL-DTA.
In this paper, we adopt a message broadcasting mechanism to integrate the hierarchical representations learned from the global-level affinity graph and the local-level molecular graph. Besides, we design a similarity-based embedding map to solve the cold start problem of inferring representations for unseen drugs and targets.
arXiv Detail & Related papers (2022-03-22T04:50:16Z)
- MOOMIN: Deep Molecular Omics Network for Anti-Cancer Drug Combination Therapy [2.446672595462589]
We propose a multimodal graph neural network that can predict the synergistic effect of drug combinations for cancer treatment.
Our model captures the representation based on the context of drugs at multiple scales based on a drug-protein interaction network and metadata.
We demonstrate that the model makes high-quality predictions over a wide range of cancer cell line tissues.
arXiv Detail & Related papers (2021-10-28T13:10:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.