Sparsity is All You Need: Rethinking Biological Pathway-Informed Approaches in Deep Learning
- URL: http://arxiv.org/abs/2505.04300v1
- Date: Wed, 07 May 2025 10:14:31 GMT
- Title: Sparsity is All You Need: Rethinking Biological Pathway-Informed Approaches in Deep Learning
- Authors: Isabella Caranzano, Corrado Pancotti, Cesare Rollo, Flavio Sartori, Pietro Liò, Piero Fariselli, Tiziana Sanavia
- Abstract summary: Biologically-informed neural networks typically leverage pathway annotations to enhance performance in biomedical applications. We conducted a comprehensive analysis of all relevant pathway-based neural network models for predictive tasks. Our findings suggest that pathway annotations may be too noisy or inadequately explored by current methods.
- Score: 12.24146000012622
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Biologically-informed neural networks typically leverage pathway annotations to enhance performance in biomedical applications. We hypothesized that the benefits of pathway integration do not arise from its biological relevance, but rather from the sparsity it introduces. We conducted a comprehensive analysis of all relevant pathway-based neural network models for predictive tasks, critically evaluating each study's contributions. From this review, we curated a subset of methods for which the source code was publicly available. The comparison of the biologically informed state-of-the-art deep learning models and their randomized counterparts showed that models based on randomized information performed as well as biologically informed ones across different metrics and datasets. Notably, in 3 out of the 15 analyzed models, the randomized versions even outperformed their biologically informed counterparts. Moreover, pathway-informed models did not show any clear advantage in interpretability, as randomized models were still able to identify relevant disease biomarkers despite lacking explicit pathway information. Our findings suggest that pathway annotations may be too noisy or inadequately explored by current methods. Therefore, we propose a methodology that can be applied to different domains and can serve as a robust benchmark for systematically comparing novel pathway-informed models against their randomized counterparts. This approach enables researchers to rigorously determine whether observed performance improvements can be attributed to biological insights.
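The comparison between a pathway-informed model and its randomized counterpart can be illustrated with a minimal sketch. This is not the authors' code: the function `randomize_mask`, the toy `pathway_mask`, and the masked linear layer are hypothetical, and shuffling all entries of the mask is only one possible randomization scheme (it preserves the total number of connections, not the size of each pathway).

```python
import numpy as np


def randomize_mask(mask, rng):
    """Shuffle the entries of a binary gene-by-pathway mask.

    The total number of connections (and so the overall sparsity) is
    preserved, but which gene connects to which pathway becomes random.
    """
    flat = mask.ravel().copy()
    rng.shuffle(flat)
    return flat.reshape(mask.shape)


rng = np.random.default_rng(0)

# Hypothetical toy annotation: 6 genes (rows) x 3 pathways (columns).
pathway_mask = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
])
random_mask = randomize_mask(pathway_mask, rng)

# A masked linear layer: weights outside the mask are zeroed out, so both
# variants have the same number of active connections.
weights = rng.normal(size=pathway_mask.shape)
x = rng.normal(size=(4, 6))  # 4 samples, 6 gene-level features
bio_out = x @ (weights * pathway_mask)
rand_out = x @ (weights * random_mask)
```

In a benchmark of this kind, both variants would be trained and evaluated under identical conditions, so that any performance gap can be attributed to the annotation itself rather than to the sparsity it imposes.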
Related papers
- Flow Matching Meets Biology and Life Science: A Survey [65.2146737141455]
Flow matching has emerged as a powerful and efficient alternative to diffusion-based generative modeling. This paper presents the first comprehensive survey of recent developments in flow matching and its applications in biological domains.
arXiv Detail & Related papers (2025-07-23T17:44:29Z) - BioMaze: Benchmarking and Enhancing Large Language Models for Biological Pathway Reasoning [49.487327661584686]
We introduce BioMaze, a dataset with 5.1K complex pathway problems from real research. Our evaluation of methods such as CoT and graph-augmented reasoning shows that LLMs struggle with pathway reasoning. To address this, we propose PathSeeker, an LLM agent that enhances reasoning through interactive subgraph-based navigation.
arXiv Detail & Related papers (2025-02-23T17:38:10Z) - Causal Representation Learning from Multimodal Biomedical Observations [57.00712157758845]
We develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biomedical datasets. A key theoretical contribution is the structural sparsity of causal connections between modalities. Results on a real-world human phenotype dataset are consistent with established biomedical research.
arXiv Detail & Related papers (2024-11-10T16:40:27Z) - Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z) - Neural Echos: Depthwise Convolutional Filters Replicate Biological Receptive Fields [56.69755544814834]
We present evidence suggesting that depthwise convolutional kernels are effectively replicating the biological receptive fields observed in the mammalian retina.
We propose a scheme that draws inspiration from the biological receptive fields.
arXiv Detail & Related papers (2024-01-18T18:06:22Z) - Improving Biomedical Entity Linking with Retrieval-enhanced Learning [53.24726622142558]
$k$NN-BioEL provides a BioEL model with the ability to reference similar instances from the entire training corpus as clues for prediction.
We show that $k$NN-BioEL outperforms state-of-the-art baselines on several datasets.
arXiv Detail & Related papers (2023-12-15T14:04:23Z) - Finding Interpretable Class-Specific Patterns through Efficient Neural Search [43.454121220860564]
We propose a novel, inherently interpretable binary neural network architecture, DiffNaps, that extracts differential patterns from data.
DiffNaps is scalable to hundreds of thousands of features and robust to noise.
We show on synthetic and real-world data, including three biological applications, that, unlike its competitors, DiffNaps consistently yields accurate, succinct, and interpretable class descriptions.
arXiv Detail & Related papers (2023-12-07T14:09:18Z) - Conditionally Invariant Representation Learning for Disentangling Cellular Heterogeneity [25.488181126364186]
This paper presents a novel approach that leverages domain variability to learn representations that are conditionally invariant to unwanted variability or distractors.
We apply our method to grand biological challenges, such as data integration in single-cell genomics.
Specifically, the proposed approach helps to disentangle biological signals from data biases that are unrelated to the target task or the causal explanation of interest.
arXiv Detail & Related papers (2023-07-02T12:52:41Z) - Incorporating Prior Knowledge in Deep Learning Models via Pathway Activity Autoencoders [5.950889585409067]
We propose a novel prior-knowledge-based deep auto-encoding framework, PAAE, for RNA-seq data in cancer.
We show that, despite having access to a smaller set of features, our PAAE and PAVAE models achieve better out-of-set reconstruction results compared to common methodologies.
arXiv Detail & Related papers (2023-06-09T11:12:55Z) - Graph algorithms for predicting subcellular localization at the pathway level [1.370633147306388]
We develop graph algorithms to predict the localization of all interactions in a biological pathway as an edge-labeling task.
We also perform a case study where we construct biological pathways and predict localizations of human fibroblasts undergoing viral infection.
arXiv Detail & Related papers (2022-12-12T15:49:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.