CausalGeD: Blending Causality and Diffusion for Spatial Gene Expression Generation
- URL: http://arxiv.org/abs/2502.07751v1
- Date: Tue, 11 Feb 2025 18:26:22 GMT
- Title: CausalGeD: Blending Causality and Diffusion for Spatial Gene Expression Generation
- Authors: Rabeya Tus Sadia, Md Atik Ahamed, Qiang Cheng
- Abstract summary: We present CausalGeD, which combines diffusion and autoregressive processes to leverage causal relationships between genes. Across 10 tissue datasets, CausalGeD outperformed state-of-the-art baselines by 5-32% in key metrics.
- Score: 11.664880068737084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The integration of single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics (ST) data is crucial for understanding gene expression in spatial context. Existing methods for such integration have limited performance, with structural similarity often below 60%. We attribute this limitation to the failure to consider causal relationships between genes. We present CausalGeD, which combines diffusion and autoregressive processes to leverage these relationships. By generalizing the Causal Attention Transformer from image generation to gene expression data, our model captures regulatory mechanisms without predefined relationships. Across 10 tissue datasets, CausalGeD outperformed state-of-the-art baselines by 5-32% in key metrics, including Pearson's correlation and structural similarity, advancing both technical and biological insights.
Related papers
- Graph Attention Based Prioritization of Disease Responsible Genes from Multimodal Alzheimer's Network [20.37811669228711]
Prioritizing disease-associated genes is central to understanding complex disorders such as Alzheimer's disease. We propose NETRA, a multimodal graph transformer framework that replaces centrality metrics with attention-driven relevance scoring. A graph transformer assigns NETRA scores that quantify gene relevance in a disease-specific and context-aware manner.
arXiv Detail & Related papers (2026-03-01T06:46:18Z)
- Beyond Independent Genes: Learning Module-Inductive Representations for Gene Perturbation Prediction [48.80217316452559]
scBIG is a module-inductive prediction framework that explicitly models coordinated gene programs. scBIG consistently outperforms state-of-the-art methods, particularly on unseen and perturbation settings.
arXiv Detail & Related papers (2026-02-03T16:43:40Z)
- TRIDENT: A Trimodal Cascade Generative Framework for Drug and RNA-Conditioned Cellular Morphology Synthesis [56.9460577864211]
TRIDENT is a cascade generative framework that synthesizes realistic cellular morphology by conditioning on both the perturbation and the corresponding gene expression profile. TRIDENT significantly outperforms state-of-the-art approaches, achieving up to 7-fold improvement with strong generalization to unseen compounds.
arXiv Detail & Related papers (2025-11-23T04:43:27Z)
- GenAR: Next-Scale Autoregressive Generation for Spatial Gene Expression Prediction [15.143858141542532]
GenAR is a multi-scale autoregressive framework that refines predictions from coarse to fine. GenAR achieves principled state-of-the-art performance, offering potential implications for precision medicine and cost-effective molecular profiling.
arXiv Detail & Related papers (2025-10-05T18:28:21Z)
- Enhanced Single-Cell RNA-seq Embedding through Gene Expression and Data-Driven Gene-Gene Interaction Integration [0.05156484100374057]
We present a novel embedding approach that integrates both gene expression profiles and data-driven gene-gene interactions. By incorporating both expression levels and gene-gene interactions, our approach provides a more comprehensive representation of cellular states.
arXiv Detail & Related papers (2025-09-01T21:19:27Z)
- Modeling Gene Expression Distributional Shifts for Unseen Genetic Perturbations [44.619690829431214]
We train a neural network to predict distributional responses in gene expression following genetic perturbations. Our model predicts gene-level histograms conditioned on perturbations and outperforms baselines in capturing higher-order statistics.
arXiv Detail & Related papers (2025-07-01T06:04:28Z)
- Spatially Gene Expression Prediction using Dual-Scale Contrastive Learning [12.35331063443348]
NH2ST integrates spatial context and both pathology and gene modalities for gene expression prediction. Our model consistently outperforms existing methods, achieving over 20% in PCC metrics.
arXiv Detail & Related papers (2025-06-30T13:18:39Z)
- Unlasting: Unpaired Single-Cell Multi-Perturbation Estimation by Dual Conditional Diffusion Implicit Bridges [68.98973318553983]
We propose a framework based on Dual Diffusion Implicit Bridges (DDIB) to learn the mapping between different data distributions. We integrate gene regulatory network (GRN) information to propagate perturbation signals in a biologically meaningful way. We also incorporate a masking mechanism to predict silent genes, improving the quality of generated profiles.
arXiv Detail & Related papers (2025-06-26T09:05:38Z)
- GRAPE: Heterogeneous Graph Representation Learning for Genetic Perturbation with Coding and Non-Coding Biotype [51.58774936662233]
Building gene regulatory networks (GRN) is essential to understand and predict the effects of genetic perturbations. In this work, we leverage a pre-trained large language model and a DNA sequence model to extract features from gene descriptions and DNA sequence data. We introduce gene biotype information for the first time in genetic perturbation, simulating the distinct roles of genes with different biotypes in regulating cellular processes.
arXiv Detail & Related papers (2025-05-06T03:35:24Z)
- Learning to Discover Regulatory Elements for Gene Expression Prediction [59.470991831978516]
Seq2Exp is a Sequence to Expression network designed to discover and extract regulatory elements that drive target gene expression.
Our approach captures the causal relationship between epigenomic signals, DNA sequences and their associated regulatory elements.
arXiv Detail & Related papers (2025-02-19T03:25:49Z)
- GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters. The model adheres to the central dogma of molecular biology, accurately generating protein-coding sequences. It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of promoter sequences.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
- Boundary-Guided Learning for Gene Expression Prediction in Spatial Transcriptomics [7.763803040383128]
We propose a framework named BG-TRIPLEX, which leverages boundary information extracted from pathological images as guiding features to enhance gene expression prediction.
Our framework consistently outperforms existing methods in terms of Pearson Correlation Coefficient (PCC).
This method highlights the crucial role of boundary features in understanding the complex interactions between WSI and gene expression.
arXiv Detail & Related papers (2024-12-05T11:09:11Z)
- RankByGene: Gene-Guided Histopathology Representation Learning Through Cross-Modal Ranking Consistency [11.813883157319381]
We propose a novel framework that aligns gene and image features using a ranking-based alignment loss.
To further enhance the alignment's stability, we employ self-supervised knowledge distillation with a teacher-student network architecture.
arXiv Detail & Related papers (2024-11-22T17:08:28Z)
- Semantically Rich Local Dataset Generation for Explainable AI in Genomics [0.716879432974126]
Black box deep learning models trained on genomic sequences excel at predicting the outcomes of different gene regulatory mechanisms.
We propose using Genetic Programming to generate datasets by evolving perturbations in sequences that contribute to their semantic diversity.
arXiv Detail & Related papers (2024-07-03T10:31:30Z)
- Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification [119.13058298388101]
We develop a Biological-knowledge enhanced PathGenomic multi-label Transformer to improve genetic mutation prediction performances.
BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules.
BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules.
arXiv Detail & Related papers (2024-06-05T06:42:27Z)
- FGBERT: Function-Driven Pre-trained Gene Language Model for Metagenomics [46.189419603576084]
FGBERT is a novel metagenomic pre-trained model that employs a protein-based gene representation as a context-aware tokenizer. It demonstrates superior performance on metagenomic datasets at four levels, spanning gene, functional, bacterial, and environmental levels.
arXiv Detail & Related papers (2024-02-24T13:13:17Z)
- GENER: A Parallel Layer Deep Learning Network To Detect Gene-Gene Interactions From Gene Expression Data [0.7660368798066375]
We introduce a parallel-layer deep learning network designed exclusively for the identification of gene-gene relationships using gene expression data.
Our model achieved an average AUROC score of 0.834 on the combined BioGRID&DREAM5 dataset, outperforming competing methods in predicting gene-gene interactions.
arXiv Detail & Related papers (2023-10-05T15:45:53Z)
- DynGFN: Towards Bayesian Inference of Gene Regulatory Networks with GFlowNets [81.75973217676986]
Gene regulatory networks (GRN) describe interactions between genes and their products that control gene expression and cellular function.
Existing methods either focus on challenge (1), identifying cyclic structure from dynamics, or on challenge (2) learning complex Bayesian posteriors over DAGs, but not both.
In this paper we leverage the fact that it is possible to estimate the "velocity" of gene expression with RNA velocity techniques to develop an approach that addresses both challenges.
arXiv Detail & Related papers (2023-02-08T16:36:40Z)
- Granger causal inference on DAGs identifies genomic loci regulating transcription [77.58911272503771]
GrID-Net is a framework based on graph neural networks with lagged message passing for Granger causal inference on DAG-structured systems.
Our application is the analysis of single-cell multimodal data to identify genomic loci that mediate the regulation of specific genes.
arXiv Detail & Related papers (2022-10-18T21:15:10Z)
- Gene Function Prediction with Gene Interaction Networks: A Context Graph Kernel Approach [24.234645183601998]
We propose to use a gene's context graph, i.e., the gene interaction network associated with the focal gene, to infer its functions.
In a kernel-based machine-learning framework, we design a context graph kernel to capture the information in context graphs.
arXiv Detail & Related papers (2022-04-22T02:54:01Z)
- Conditional Hybrid GAN for Sequence Generation [56.67961004064029]
We propose a novel conditional hybrid GAN (C-Hybrid-GAN) to solve this issue.
We exploit the Gumbel-Softmax technique to approximate the distribution of discrete-valued sequences.
We demonstrate that the proposed C-Hybrid-GAN outperforms the existing methods in context-conditioned discrete-valued sequence generation.
arXiv Detail & Related papers (2020-09-18T03:52:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.