ExPath: Towards Explaining Targeted Pathways for Biological Knowledge Bases
- URL: http://arxiv.org/abs/2502.18026v1
- Date: Tue, 25 Feb 2025 09:33:15 GMT
- Title: ExPath: Towards Explaining Targeted Pathways for Biological Knowledge Bases
- Authors: Rikuto Kotoge, Ziwei Yang, Zheng Chen, Yushun Dong, Yasuko Matsubara, Jimeng Sun, Yasushi Sakurai
- Abstract summary: We propose a novel pathway inference framework, ExPath, to classify various graphs (bio-networks) in biological databases. ExPath comprises three components: (1) a large protein language model (pLM) that encodes and embeds AA-seqs into graph, overcoming traditional obstacles in processing AA-seq data; (2) PathMamba, a hybrid architecture combining graph neural networks (GNNs) with state-space sequence modeling (Mamba) to capture both local interactions and global pathway-level dependencies; and (3) PathExplainer, a subgraph learning module that identifies functionally critical nodes and edges through train
- Score: 36.89299758497499
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Biological knowledge bases provide systemically functional pathways of cells or organisms in terms of molecular interactions. However, recognizing more targeted pathways, particularly when incorporating wet-lab experimental data, remains challenging and typically requires downstream biological analyses and expertise. In this paper, we frame this challenge as a solvable graph learning and explanation task and propose a novel pathway inference framework, ExPath, that explicitly integrates experimental data, specifically amino acid sequences (AA-seqs), to classify various graphs (bio-networks) in biological databases. The links (representing pathways) that contribute more to classification can be considered as targeted pathways. Technically, ExPath comprises three components: (1) a large protein language model (pLM) that encodes and embeds AA-seqs into the graph, overcoming traditional obstacles in processing AA-seq data, such as BLAST; (2) PathMamba, a hybrid architecture combining graph neural networks (GNNs) with state-space sequence modeling (Mamba) to capture both local interactions and global pathway-level dependencies; and (3) PathExplainer, a subgraph learning module that identifies functionally critical nodes and edges through trainable pathway masks. We also propose ML-oriented biological evaluations and a new metric. Experiments evaluating 301 bio-networks demonstrate that pathways inferred by ExPath maintain biological meaningfulness. We will publicly release the curated data for the 301 bio-networks soon.
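The abstract mentions "trainable pathway masks" but does not spell out how PathExplainer learns them. As a rough illustration of the general idea only, the following is a minimal GNNExplainer-style sketch in plain Python: a soft mask per edge is trained so that a fixed toy classifier keeps its prediction while a sparsity penalty pushes unneeded edges toward zero. The toy model, the function name `learn_edge_mask`, and all parameter values are assumptions made for this example, not the authors' implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def learn_edge_mask(x, readout, edges, sparsity=0.05, lr=1.0, steps=500):
    """Learn one soft mask value in (0, 1) per directed edge.

    Toy model (an assumption for illustration, not ExPath's architecture):
      h[v]  = x[v] + sum over edges (u, v) of mask[(u, v)] * x[u]
      score = sum over v of readout[v] * h[v]
      p     = sigmoid(score)   # probability of the target class
    Loss = -log(p) + sparsity * sum(mask), minimised by gradient descent
    on the logits theta, where mask = sigmoid(theta).
    """
    theta = {e: 0.0 for e in edges}  # every mask starts at 0.5
    base = sum(r * xv for r, xv in zip(readout, x))  # mask-independent part
    for _ in range(steps):
        mask = {e: sigmoid(t) for e, t in theta.items()}
        score = base + sum(mask[(u, v)] * readout[v] * x[u] for (u, v) in edges)
        p = sigmoid(score)
        for (u, v) in edges:
            m = mask[(u, v)]
            # d(-log p)/d(score) = -(1 - p); d(score)/d(mask) = readout[v] * x[u]
            grad_mask = -(1.0 - p) * readout[v] * x[u] + sparsity
            # Chain rule through mask = sigmoid(theta)
            theta[(u, v)] -= lr * grad_mask * m * (1.0 - m)
    return {e: sigmoid(t) for e, t in theta.items()}


# Hypothetical 4-node cycle; only edge (0, 1) feeds strongly into the score.
x = [1.0, 0.5, -1.0, 0.2]
readout = [0.0, 2.0, 0.5, 0.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
masks = learn_edge_mask(x, readout, edges)
ranked = sorted(masks, key=masks.get, reverse=True)  # most important edge first
```

In this toy setting the edges that survive the sparsity penalty play the role of the "targeted pathway"; a real system would replace the hand-written gradient and linear readout with a trained GNN and automatic differentiation.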
Related papers
- BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology [0.9603373981832565]
BioX-CPath is an explainable graph neural network architecture for whole slide image (WSI) classification.
At its core, BioX-CPath introduces a novel Stain-Aware Attention Pooling (SAAP) module that generates biologically meaningful, stain-aware patient embeddings.
arXiv Detail & Related papers (2025-03-26T18:00:22Z) - From Pixels to Histopathology: A Graph-Based Framework for Interpretable Whole Slide Image Analysis [81.19923502845441]
We develop a graph-based framework that constructs WSI graph representations.
We build tissue representations (nodes) that follow biological boundaries rather than arbitrary patches.
In our method's final step, we solve the diagnostic task through a graph attention network.
arXiv Detail & Related papers (2025-03-14T20:15:04Z) - PathVG: A New Benchmark and Dataset for Pathology Visual Grounding [45.21597220882424]
We propose a new benchmark called Pathology Visual Grounding (PathVG), which aims to detect regions based on expressions with different attributes.
In the experimental study, we found that the biggest challenge was the implicit information underlying the pathological expressions.
The proposed method achieves state-of-the-art performance on the PathVG benchmark.
arXiv Detail & Related papers (2025-02-28T09:13:01Z) - BioMaze: Benchmarking and Enhancing Large Language Models for Biological Pathway Reasoning [49.487327661584686]
We introduce BioMaze, a dataset with 5.1K complex pathway problems from real research. Our evaluation of methods such as CoT and graph-augmented reasoning shows that LLMs struggle with pathway reasoning. To address this, we propose PathSeeker, an LLM agent that enhances reasoning through interactive subgraph-based navigation.
arXiv Detail & Related papers (2025-02-23T17:38:10Z) - Progress and Opportunities of Foundation Models in Bioinformatics [77.74411726471439]
Foundation models (FMs) have ushered in a new era in computational biology, especially in the realm of deep learning.
Central to our focus is the application of FMs to specific biological problems, aiming to guide the research community in choosing appropriate FMs for their research needs.
The review analyzes challenges and limitations faced by FMs in biology, such as data noise, model explainability, and potential biases.
arXiv Detail & Related papers (2024-02-06T02:29:17Z) - Graph algorithms for predicting subcellular localization at the pathway level [1.370633147306388]
We develop graph algorithms to predict the localization of all interactions in a biological pathway as an edge-labeling task.
We also perform a case study where we construct biological pathways and predict localizations of human fibroblasts undergoing viral infection.
arXiv Detail & Related papers (2022-12-12T15:49:43Z) - Neural Multi-Hop Reasoning With Logical Rules on Biomedical Knowledge Graphs [10.244651735862627]
We conduct an empirical study based on the real-world task of drug repurposing.
We formulate this task as a link prediction problem where both compounds and diseases correspond to entities in a knowledge graph.
We propose a new method, PoLo, that combines policy-guided walks based on reinforcement learning with logical rules.
arXiv Detail & Related papers (2021-03-18T16:46:11Z) - Heterogeneous Graph based Deep Learning for Biomedical Network Link Prediction [7.628651624423363]
We propose a Graph Pair based Link Prediction model (GPLP) for predicting biomedical network links.
In GPLP, 1-hop subgraphs extracted from the known network interaction matrix are learnt to predict missing links.
Our method demonstrates the potential applications in other biomedical networks.
arXiv Detail & Related papers (2021-01-28T07:35:29Z) - Learning the Implicit Semantic Representation on Graph-Structured Data [57.670106959061634]
Existing representation learning methods in graph convolutional networks are mainly designed by describing the neighborhood of each node as a perceptual whole.
We propose Semantic Graph Convolutional Networks (SGCN), which explore the implicit semantics by learning latent semantic-paths in graphs.
arXiv Detail & Related papers (2021-01-16T16:18:43Z) - Knowledge-Guided Multi-Label Few-Shot Learning for General Image Recognition [75.44233392355711]
KGGR framework exploits prior knowledge of statistical label correlations with deep neural networks.
It first builds a structured knowledge graph to correlate different labels based on statistical label co-occurrence.
Then, it introduces the label semantics to guide learning semantic-specific features.
It exploits a graph propagation network to explore graph node interactions.
arXiv Detail & Related papers (2020-09-20T15:05:29Z) - GCN for HIN via Implicit Utilization of Attention and Meta-paths [104.24467864133942]
Heterogeneous information network (HIN) embedding aims to map the structure and semantic information in a HIN to distributed representations.
We propose a novel neural network method via implicitly utilizing attention and meta-paths.
We first use the multi-layer graph convolutional network (GCN) framework, which performs a discriminative aggregation at each layer.
We then give an effective relaxation and improvement via introducing a new propagation operation which can be separated from aggregation.
arXiv Detail & Related papers (2020-07-06T11:09:40Z) - Inferring Signaling Pathways with Probabilistic Programming [1.8275108630751837]
We implement our method, named Sparse Signaling Pathway Sampling, in Julia using the Gen probabilistic programming language.
We evaluate our algorithm on simulated data and the HPN-DREAM pathway reconstruction challenge.
Our results demonstrate the vast potential for probabilistic programming, and Gen specifically, for biological network inference.
arXiv Detail & Related papers (2020-05-28T14:55:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.