Causal Inference, Biomarker Discovery, Graph Neural Network, Feature Selection
- URL: http://arxiv.org/abs/2511.13295v1
- Date: Mon, 17 Nov 2025 12:16:20 GMT
- Title: Causal Inference, Biomarker Discovery, Graph Neural Network, Feature Selection
- Authors: Chaowang Lan, Jingxin Wu, Yulong Yuan, Chuxun Liu, Huangyi Kang, Caihua Liu,
- Abstract summary: We develop a causal graph neural network (Causal-GNN) method that integrates causal inference with multi-layer graph neural networks (GNNs). Our work provides a robust, efficient, and biologically interpretable tool for biomarker discovery, demonstrating strong potential for broad application across medical disciplines.
- Score: 2.2914260353572513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Biomarker discovery from high-throughput transcriptomic data is crucial for advancing precision medicine. However, existing methods often neglect gene-gene regulatory relationships and lack stability across datasets, leading to conflation of spurious correlations with genuine causal effects. To address these issues, we develop a causal graph neural network (Causal-GNN) method that integrates causal inference with multi-layer graph neural networks (GNNs). The key innovation is the incorporation of causal effect estimation for identifying stable biomarkers, coupled with a GNN-based propensity scoring mechanism that leverages cross-gene regulatory networks. Experimental results demonstrate that our method achieves consistently high predictive accuracy across four distinct datasets and four independent classifiers. Moreover, it enables the identification of more stable biomarkers compared to traditional methods. Our work provides a robust, efficient, and biologically interpretable tool for biomarker discovery, demonstrating strong potential for broad application across medical disciplines.
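The abstract pairs causal effect estimation with a propensity scoring step. As a rough illustration of that general idea only, the sketch below ranks genes by an inverse-probability-weighted (IPW) effect estimate on synthetic data. It is not the authors' method: a plain logistic regression stands in for the GNN-based propensity model, no regulatory network is used, and every name here (`fit_logistic`, `ipw_effect`, `rank_genes`, the median split that defines "treatment", the clipping threshold) is a hypothetical choice made for this example.

```python
import numpy as np


def _with_intercept(X):
    """Append a constant column so the model can learn a bias term."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_logistic(X, t, lr=0.1, steps=500):
    """Gradient-descent logistic regression (assumed stand-in for a GNN propensity model)."""
    Xb = _with_intercept(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * Xb.T @ (p - t) / len(t)
    return w


def propensity(X, w):
    """Estimated probability of treatment given the covariates."""
    return 1.0 / (1.0 + np.exp(-(_with_intercept(X) @ w)))


def ipw_effect(t, y, e, clip=0.05):
    """Normalized IPW estimate of the mean outcome difference, treated minus control."""
    e = np.clip(e, clip, 1.0 - clip)  # avoid extreme weights
    w1 = t / e
    w0 = (1.0 - t) / (1.0 - e)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)


def rank_genes(expr, y):
    """Score each gene by |IPW effect| of above-median expression on the label."""
    n_genes = expr.shape[1]
    scores = np.zeros(n_genes)
    for j in range(n_genes):
        # Hypothetical binarization: "treated" means above-median expression.
        t = (expr[:, j] > np.median(expr[:, j])).astype(float)
        covariates = np.delete(expr, j, axis=1)  # all other genes
        w = fit_logistic(covariates, t)
        scores[j] = abs(ipw_effect(t, y, propensity(covariates, w)))
    return np.argsort(-scores), scores


# Synthetic data: gene 0 drives the label, gene 1 is merely correlated with gene 0.
rng = np.random.default_rng(0)
n = 400
expr = rng.normal(size=(n, 5))
expr[:, 1] = 0.5 * expr[:, 0] + rng.normal(size=n)
y = (expr[:, 0] + 0.3 * rng.normal(size=n) > 0).astype(float)

order, scores = rank_genes(expr, y)
print("genes ranked by estimated effect:", order)
```

In this toy setup the weighting is meant to reduce the score of gene 1, which is associated with the label only through gene 0. How well that works depends on how closely the propensity model matches the data.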
Related papers
- Biology-informed neural networks learn nonlinear representations from omics data to improve genomic prediction and interpretability [0.0]
We extend biologically-informed neural networks (BINNs) for genomic prediction (GP) and selection (GS) in crops.
BINNs overcome limitations by encoding pathway-level inductive biases and leveraging multi-omics data only during training.
With complete domain knowledge for a synthetic metabolomics benchmark, BINN reduces prediction error by 75% relative to conventional neural nets.
arXiv Detail & Related papers (2025-10-16T17:59:38Z)
- E-ABIN: an Explainable module for Anomaly detection in BIological Networks [1.7999333451993955]
E-ABIN is a general-purpose, explainable framework for Anomaly detection in Biological Networks.
It combines classical machine learning and graph-based deep learning techniques within a unified, user-friendly platform.
We demonstrate the utility of E-ABIN through case studies of bladder cancer and coeliac disease.
arXiv Detail & Related papers (2025-06-25T08:25:17Z)
- Interpretable Graph Kolmogorov-Arnold Networks for Multi-Cancer Classification and Biomarker Identification using Multi-Omics Data [36.92842246372894]
Multi-Omics Graph Kolmogorov-Arnold Network (MOGKAN) is a deep learning framework that utilizes messenger-RNA, micro-RNA sequences, and DNA methylation samples.
By integrating multi-omics data with graph-based deep learning, our proposed approach demonstrates robust predictive performance and interpretability.
arXiv Detail & Related papers (2025-03-29T02:14:05Z)
- Identifying Critical Phases for Disease Onset with Sparse Haematological Biomarkers [0.0]
Clinical blood tests are an emerging molecular data source for large-scale biomedical research.
Traditional imputation approaches distort learning signals and bias predictions while lacking biological interpretability.
We propose a novel methodology using Graph Neural Additive Networks (GNAN) to model delta biomarker trajectories.
arXiv Detail & Related papers (2025-03-18T07:29:45Z)
- Causal Representation Learning from Multimodal Biomedical Observations [57.00712157758845]
We develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biomedical datasets.
Key theoretical contribution is the structural sparsity of causal connections between modalities.
Results on a real-world human phenotype dataset are consistent with established biomedical research.
arXiv Detail & Related papers (2024-11-10T16:40:27Z)
- Large-Scale Targeted Cause Discovery via Learning from Simulated Data [66.51307552703685]
We propose a novel machine learning approach for inferring causal variables of a target variable from observations.
We train a neural network using supervised learning on simulated data to infer causality.
Empirical results demonstrate superior performance in identifying causal relationships within large-scale gene regulatory networks.
arXiv Detail & Related papers (2024-08-29T02:21:11Z)
- Discovering robust biomarkers of psychiatric disorders from resting-state functional MRI via graph neural networks: A systematic review [4.799269666410891]
We review how GNN and model explainability techniques have been applied to fMRI datasets for disorder prediction tasks.
We identify 65 studies using GNNs that reported potential fMRI biomarkers for psychiatric disorders.
We suggest establishing new standards that are based on objective evaluation metrics to determine the robustness of potential biomarkers.
arXiv Detail & Related papers (2024-05-01T15:29:55Z)
- Highly Accurate Disease Diagnosis and Highly Reproducible Biomarker Identification with PathFormer [32.26944736442376]
Graph neural networks (GNNs) have been the dominant deep learning model for analyzing graph-structured data.
The root of the challenges is the unique graph structure of biological signaling pathways.
We present a novel GNN model architecture, named PathFormer, which integrates signaling network, priori knowledge and omics data to rank biomarkers and predict disease diagnosis.
arXiv Detail & Related papers (2024-02-11T18:23:54Z)
- Learning to Denoise Biomedical Knowledge Graph for Robust Molecular Interaction Prediction [50.7901190642594]
We propose BioKDN (Biomedical Knowledge Graph Denoising Network) for robust molecular interaction prediction.
BioKDN refines the reliable structure of local subgraphs by denoising noisy links in a learnable manner.
It maintains consistent and robust semantics by smoothing relations around the target interaction.
arXiv Detail & Related papers (2023-12-09T07:08:00Z)
- Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe.
GNNSafe achieves up to 17.0% AUROC improvement over state-of-the-art methods and could serve as a simple yet strong baseline in such an under-developed area.
arXiv Detail & Related papers (2023-02-06T16:38:43Z)
- CausalBench: A Large-scale Benchmark for Network Inference from Single-cell Perturbation Data [61.088705993848606]
We introduce CausalBench, a benchmark suite for evaluating causal inference methods on real-world interventional data.
CausalBench incorporates biologically-motivated performance metrics, including new distribution-based interventional metrics.
arXiv Detail & Related papers (2022-10-31T13:04:07Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- Quantifying the Reproducibility of Graph Neural Networks using Multigraph Brain Data [0.0]
Graph neural networks (GNNs) have witnessed an unprecedented proliferation in tackling several problems in computer vision, computer-aided diagnosis, and related fields.
While prior studies have focused on boosting the model accuracy, quantifying the most discriminative features identified by GNNs is still an intact problem that yields concerns about their reliability in clinical applications in particular.
We propose for the first time, a framework for GNN assessment via the most discriminative features (i.e., biomarkers) shared between different models. To ascertain the soundness of our framework, the assessment embraces variations of different factors such as training strategies and
arXiv Detail & Related papers (2021-09-06T05:31:02Z)
- Data-Driven Logistic Regression Ensembles With Applications in Genomics [0.0]
We introduce a novel approach to high-dimensional binary classification that integrates regularization with ensembling techniques.
In medical genomics applications, our approach identifies critical biomarkers overlooked by competing methods.
arXiv Detail & Related papers (2021-02-17T05:57:26Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.