GeneQuery: A General QA-based Framework for Spatial Gene Expression Predictions from Histology Images
- URL: http://arxiv.org/abs/2411.18391v1
- Date: Wed, 27 Nov 2024 14:33:13 GMT
- Title: GeneQuery: A General QA-based Framework for Spatial Gene Expression Predictions from Histology Images
- Authors: Ying Xiong, Linjing Liu, Yufei Cui, Shangyu Wu, Xue Liu, Antoni B. Chan, Chun Jason Xue
- Abstract summary: Whole-slide hematoxylin and eosin stained histological images are readily accessible and allow for detailed examinations of tissue structure and composition at the microscopic level.
Recent advancements have utilized these histological images to predict spatially resolved gene expression profiles.
GeneQuery aims to solve this gene expression prediction task in a question-answering (QA) manner for better generality and flexibility.
- Score: 41.732831871866516
- Abstract: Gene expression profiling provides profound insights into molecular mechanisms, but its time-consuming and costly nature often presents significant challenges. In contrast, whole-slide hematoxylin and eosin (H&E) stained histological images are readily accessible and allow for detailed examinations of tissue structure and composition at the microscopic level. Recent advancements have utilized these histological images to predict spatially resolved gene expression profiles. However, state-of-the-art works treat gene expression prediction as a multi-output regression problem, where each gene is learned independently with its own weights, failing to capture the shared dependencies and co-expression patterns between genes. Besides, existing works can only predict gene expression values for genes seen during training, limiting their ability to generalize to new, unseen genes. To address the above limitations, this paper presents GeneQuery, which aims to solve this gene expression prediction task in a question-answering (QA) manner for better generality and flexibility. Specifically, GeneQuery takes gene-related texts as queries and whole-slide images as contexts and then predicts the queried gene expression values. With such a transformation, GeneQuery can implicitly estimate the gene distribution by introducing the gene random variable. Besides, the proposed GeneQuery consists of two architecture implementations, i.e., spot-aware GeneQuery for capturing patterns between images and gene-aware GeneQuery for capturing patterns between genes. Comprehensive experiments on spatial transcriptomics datasets show that the proposed GeneQuery outperforms existing state-of-the-art methods on known and unseen genes. More results also demonstrate that GeneQuery can potentially analyze the tissue structure.
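The QA framing in the abstract (gene-related text as the query, image features as the context, one expression value as the answer) can be illustrated with a minimal sketch. This is not the authors' architecture: the single cross-attention step, the random embeddings, and every name here (`answer_gene_query`, `params`, `w_q`, `w_k`, `w_v`, `w_out`) are assumptions made for illustration only. The point it shows is that all genes share the same weights, so a gene absent from training can still be queried by supplying its text embedding.

```python
import math
import random


def matvec(vec, mat):
    """Multiply a row vector (length n) by an n x m matrix stored as a list of rows."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def answer_gene_query(gene_text_emb, patch_feats, params):
    """Hypothetical QA-style predictor: attend from one gene query to image patches."""
    q = matvec(gene_text_emb, params["w_q"])                 # query from gene text
    keys = [matvec(p, params["w_k"]) for p in patch_feats]    # keys from image context
    values = [matvec(p, params["w_v"]) for p in patch_feats]  # values from image context
    scale = math.sqrt(len(q))
    weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
    pooled = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(q))]
    # Shared linear readout: the same weights serve every gene.
    return sum(p * w for p, w in zip(pooled, params["w_out"]))


def random_matrix(rng, rows, cols):
    """Stand-in for learned weights or encoder outputs (random, for illustration)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
text_dim, img_dim, hidden = 6, 8, 4
params = {
    "w_q": random_matrix(rng, text_dim, hidden),
    "w_k": random_matrix(rng, img_dim, hidden),
    "w_v": random_matrix(rng, img_dim, hidden),
    "w_out": [rng.gauss(0.0, 1.0) for _ in range(hidden)],
}

patch_feats = random_matrix(rng, 10, img_dim)       # 10 image patches as context
seen_genes = random_matrix(rng, 3, text_dim)        # text embeddings of training genes
unseen_gene = random_matrix(rng, 1, text_dim)[0]    # a gene not seen in training

# The unseen gene is handled exactly like the seen ones: no per-gene output head.
predictions = [answer_gene_query(g, patch_feats, params) for g in seen_genes + [unseen_gene]]
print(len(predictions))
```

In a multi-output regression head, by contrast, the output layer has one weight vector per training gene, so there is no slot to fill for a new gene; here the gene identity enters only through the query embedding.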
Related papers
- Learning to Discover Regulatory Elements for Gene Expression Prediction [59.470991831978516]
Seq2Exp is a Sequence to Expression network designed to discover and extract regulatory elements that drive target gene expression.
Our approach captures the causal relationship between epigenomic signals, DNA sequences and their associated regulatory elements.
arXiv Detail & Related papers (2025-02-19T03:25:49Z)
- Cross-Attention Graph Neural Networks for Inferring Gene Regulatory Networks with Skewed Degree Distribution [9.919024883502322]
This paper proposes the Cross-Attention Complex Dual Graph Embedding Model (XATGRN).
Our model consistently outperforms existing state-of-the-art methods across various datasets.
arXiv Detail & Related papers (2024-12-18T10:56:40Z)
- Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification [119.13058298388101]
We develop a Biological-knowledge enhanced PathGenomic multi-label Transformer (BPGT) to improve genetic mutation prediction performance.
BPGT first establishes a novel gene encoder that constructs gene priors by two carefully designed modules.
BPGT then designs a label decoder that finally performs genetic mutation prediction by two tailored modules.
arXiv Detail & Related papers (2024-06-05T06:42:27Z)
- VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling [60.91599380893732]
VQDNA is a general-purpose framework that renovates genome tokenization from the perspective of genome vocabulary learning.
By leveraging vector-quantized codebooks as learnable vocabulary, VQDNA can adaptively tokenize genomes into pattern-aware embeddings.
arXiv Detail & Related papers (2024-05-13T20:15:03Z)
- ProtiGeno: a prokaryotic short gene finder using protein language models [1.2354076490479513]
Current gene finders are highly sensitive in finding long genes, but their sensitivity decreases noticeably in finding shorter genes.
We develop a deep learning-based method called ProtiGeno, specifically targeting short prokaryotic genes.
In systematic large-scale experiments on 4,288 prokaryotic genomes, we demonstrate that ProtiGeno predicts short coding and noncoding genes with higher accuracy and recall than the current state-of-the-art gene finders.
arXiv Detail & Related papers (2023-07-19T16:46:42Z)
- Epigenetics Algorithms: Self-Reinforcement-Attention mechanism to regulate chromosomes expression [0.0]
This paper proposes a new epigenetics algorithm that mimics the epigenetics phenomenon known as methylation.
The novelty of our epigenetics algorithms lies primarily in taking advantage of attention mechanisms and deep learning, which fits well with the genes/silencing concept.
arXiv Detail & Related papers (2023-03-15T21:33:21Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles.
It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner.
These phenotypes are then analyzed via genome-wide association studies (GWAS), retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z)
- Gene Function Prediction with Gene Interaction Networks: A Context Graph Kernel Approach [24.234645183601998]
We propose to use a gene's context graph, i.e., the gene interaction network associated with the focal gene, to infer its functions.
In a kernel-based machine-learning framework, we design a context graph kernel to capture the information in context graphs.
arXiv Detail & Related papers (2022-04-22T02:54:01Z)
- SimpleChrome: Encoding of Combinatorial Effects for Predicting Gene Expression [8.326669256957352]
We present SimpleChrome, a deep learning model that learns the histone modification representations of genes.
The features learned from the model allow us to better understand the latent effects of cross-gene interactions and direct gene regulation on the target gene expression.
arXiv Detail & Related papers (2020-12-15T23:30:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.