Related papers: Gene Regulatory Network Inference in the Presence of Selection Bias and Latent Confounders

URL: http://arxiv.org/abs/2501.10124v1
Date: Fri, 17 Jan 2025 11:27:58 GMT
Title: Gene Regulatory Network Inference in the Presence of Selection Bias and Latent Confounders
Authors: Gongxu Luo, Haoyue Dai, Boyang Sun, Loka Li, Biwei Huang, Petar Stojanov, Kun Zhang
Abstract summary: Gene Regulatory Network Inference (GRNI) aims to identify causal relationships among genes using gene expression data. Gene expression is influenced by latent confounders, such as non-coding RNAs, which add complexity to GRNI. We propose GISL (Gene Regulatory Network Inference in the presence of Selection bias and Latent confounders) to infer true regulatory relationships in the presence of selection and confounding issues.
Score: 14.626706466908386
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Gene Regulatory Network Inference (GRNI) aims to identify causal relationships among genes using gene expression data, providing insights into regulatory mechanisms. A significant yet often overlooked challenge is selection bias, a process where only cells meeting specific criteria, such as gene expression thresholds, survive or are observed, distorting the true joint distribution of genes and thus biasing GRNI results. Furthermore, gene expression is influenced by latent confounders, such as non-coding RNAs, which add complexity to GRNI. To address these challenges, we propose GISL (Gene Regulatory Network Inference in the presence of Selection bias and Latent confounders), a novel algorithm to infer true regulatory relationships in the presence of selection and confounding issues. Leveraging data obtained via multiple gene perturbation experiments, we show that the true regulatory relationships, as well as selection processes and latent confounders can be partially identified without strong parametric models and under mild graphical assumptions. Experimental results on both synthetic and real-world single-cell gene expression datasets demonstrate the superiority of GISL over existing methods.
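Illustration: the selection-bias effect described in the abstract can be seen in a small simulation. The sketch below is not the GISL algorithm and does not come from the paper; it assumes two hypothetical genes whose expression levels are independent standard normal variables, and a hypothetical survival rule under which a cell is observed only if the sum of the two levels exceeds a threshold. Under these assumptions the observed cells show a clear negative correlation even though neither gene regulates the other.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_cells=20000, threshold=1.0, seed=0):
    """Compare the gene-gene correlation before and after selection.

    Assumption (illustrative only): genes A and B are independent, and a
    cell is observed only when A + B exceeds `threshold`.
    """
    rng = random.Random(seed)
    cells = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_cells)]

    # Selection step: keep only the cells that meet the survival criterion.
    observed = [(a, b) for a, b in cells if a + b > threshold]

    full_corr = pearson([a for a, _ in cells], [b for _, b in cells])
    selected_corr = pearson([a for a, _ in observed], [b for _, b in observed])
    return full_corr, selected_corr, len(observed)


if __name__ == "__main__":
    full_corr, selected_corr, n_observed = simulate()
    print(f"all cells:      corr = {full_corr:+.3f}")
    print(f"observed cells: corr = {selected_corr:+.3f} (n = {n_observed})")
```

A method that treats the observed cells as a sample from the true joint distribution would read this induced dependence as evidence of a regulatory edge, which is the kind of distortion the abstract attributes to selection.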

Related papers

Unlasting: Unpaired Single-Cell Multi-Perturbation Estimation by Dual Conditional Diffusion Implicit Bridges [68.98973318553983]
We propose a framework based on Dual Diffusion Implicit Bridges (DDIB) to learn the mapping between different data distributions. We integrate gene regulatory network (GRN) information to propagate perturbation signals in a biologically meaningful way. We also incorporate a masking mechanism to predict silent genes, improving the quality of generated profiles.
arXiv Detail & Related papers (2025-06-26T09:05:38Z)
Multi-omic Causal Discovery using Genotypes and Gene Expression [0.0]
We introduce GENESIS, a constraint-based causal algorithm to infer ancestral relationships in transcriptomic data. By integrating genotypes as fixed causal anchors, GENESIS provides a principled "head start" to classical causal discovery algorithms. This framework offers a powerful avenue for uncovering causal pathways in complex traits, with promising applications to functional genomics, drug discovery, and precision medicine.
arXiv Detail & Related papers (2025-05-21T11:52:23Z)
GRAPE: Heterogeneous Graph Representation Learning for Genetic Perturbation with Coding and Non-Coding Biotype [51.58774936662233]
Building gene regulatory networks (GRNs) is essential to understand and predict the effects of genetic perturbations. In this work, we leverage a pre-trained large language model and a DNA sequence model to extract features from gene descriptions and DNA sequence data. We introduce gene biotype information for the first time in genetic perturbation, simulating the distinct roles of genes with different biotypes in regulating cellular processes.
arXiv Detail & Related papers (2025-05-06T03:35:24Z)
Learning to Discover Regulatory Elements for Gene Expression Prediction [59.470991831978516]
Seq2Exp is a Sequence to Expression network designed to discover and extract regulatory elements that drive target gene expression. Our approach captures the causal relationship between epigenomic signals, DNA sequences and their associated regulatory elements.
arXiv Detail & Related papers (2025-02-19T03:25:49Z)
GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present GENERator, a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters. Trained on an expansive dataset comprising 386B bp of DNA, the GENERator demonstrates state-of-the-art performance across both established and newly proposed benchmarks. It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of enhancer sequences with specific activity profiles.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
Cross-Attention Graph Neural Networks for Inferring Gene Regulatory Networks with Skewed Degree Distribution [9.919024883502322]
We propose the Cross-Attention Complex Dual Graph Embedding Model (XATGRN). Our model consistently outperforms existing state-of-the-art methods across various datasets.
arXiv Detail & Related papers (2024-12-18T10:56:40Z)
Gene Regulatory Network Inference from Pre-trained Single-Cell Transcriptomics Transformer with Joint Graph Learning [10.44434676119443]
Inferring gene regulatory networks (GRNs) from single-cell RNA sequencing (scRNA-seq) data is a complex challenge. In this study, we tackle this challenge by leveraging the single-cell BERT-based pre-trained transformer model (scBERT). We introduce a novel joint graph learning approach that combines the rich contextual representations learned by single-cell language models with the structured knowledge encoded in GRNs.
arXiv Detail & Related papers (2024-07-25T16:42:08Z)
Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification [119.13058298388101]
We develop a Biological-knowledge enhanced PathGenomic multi-label Transformer (BPGT) to improve genetic mutation prediction performance. BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules. BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules.
arXiv Detail & Related papers (2024-06-05T06:42:27Z)
GENER: A Parallel Layer Deep Learning Network To Detect Gene-Gene Interactions From Gene Expression Data [0.7660368798066375]
We introduce a parallel-layer deep learning network designed exclusively for the identification of gene-gene relationships using gene expression data. Our model achieved an average AUROC score of 0.834 on the combined BioGRID&DREAM5 dataset, outperforming competing methods in predicting gene-gene interactions.
arXiv Detail & Related papers (2023-10-05T15:45:53Z)
Genetic heterogeneity analysis using genetic algorithm and network science [2.6166087473624318]
Genome-wide association studies (GWAS) can identify disease-susceptible genetic variables. Genetic variables intertwined with genetic effects often exhibit lower effect sizes. This paper introduces a novel feature selection mechanism for GWAS, named Feature Co-selection Network (FCSNet).
arXiv Detail & Related papers (2023-08-12T01:28:26Z)
DynGFN: Towards Bayesian Inference of Gene Regulatory Networks with GFlowNets [81.75973217676986]
Gene regulatory networks (GRNs) describe interactions between genes and their products that control gene expression and cellular function. Existing methods either focus on challenge (1), identifying cyclic structure from dynamics, or on challenge (2), learning complex Bayesian posteriors over DAGs, but not both. In this paper we leverage the fact that it is possible to estimate the "velocity" of gene expression with RNA velocity techniques to develop an approach that addresses both challenges.
arXiv Detail & Related papers (2023-02-08T16:36:40Z)
Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles. It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner. These phenotypes are then analyzed via genome-wide association studies (GWAS), retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z)
CausalBench: A Large-scale Benchmark for Network Inference from Single-cell Perturbation Data [61.088705993848606]
We introduce CausalBench, a benchmark suite for evaluating causal inference methods on real-world interventional data. CausalBench incorporates biologically-motivated performance metrics, including new distribution-based interventional metrics.
arXiv Detail & Related papers (2022-10-31T13:04:07Z)
Granger causal inference on DAGs identifies genomic loci regulating transcription [77.58911272503771]
GrID-Net is a framework based on graph neural networks with lagged message passing for Granger causal inference on DAG-structured systems. Our application is the analysis of single-cell multimodal data to identify genomic loci that mediate the regulation of specific genes.
arXiv Detail & Related papers (2022-10-18T21:15:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.