Distance-Preserving Generative Modeling of Spatial Transcriptomics
- URL: http://arxiv.org/abs/2408.00911v1
- Date: Thu, 1 Aug 2024 21:04:27 GMT
- Title: Distance-Preserving Generative Modeling of Spatial Transcriptomics
- Authors: Wenbin Zhou, Jin-Hong Du
- Abstract summary: We introduce a class of distance-preserving generative models for spatial transcriptomics.
We use the provided spatial information to regularize the learned representation space of gene expressions to have a similar pair-wise distance structure.
Our framework grants compatibility with any variational-inference-based generative models for gene expression modeling.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spatial transcriptomics data is invaluable for understanding the spatial organization of gene expression in tissues. There have been consistent efforts in studying how to effectively utilize the associated spatial information for refining gene expression modeling. We introduce a class of distance-preserving generative models for spatial transcriptomics, which utilizes the provided spatial information to regularize the learned representation space of gene expressions to have a similar pair-wise distance structure. This helps the latent space to capture meaningful encodings of genes in spatial proximity. We carry out theoretical analysis over a tractable loss function for this purpose and formalize the overall learning objective as a regularized evidence lower bound. Our framework grants compatibility with any variational-inference-based generative models for gene expression modeling. Empirically, we validate our proposed method on the mouse brain tissues Visium dataset and observe improved performance with variational autoencoders and scVI used as backbone models.
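The abstract describes adding a distance-preserving regularizer to an evidence lower bound. The snippet below is a minimal sketch of that idea, not the paper's actual loss: the mean-squared mismatch between pairwise distances, the function names, and the `weight` parameter are all assumptions made for illustration, and the negative ELBO is passed in as a plain number that a backbone model (for example a VAE) would normally supply.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distance matrix between the rows of x, shape (n, n)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def distance_mismatch(latent, coords):
    """Hypothetical penalty: mean squared gap between latent-space and
    spatial pairwise distances. The paper's own loss may differ."""
    d_latent = pairwise_distances(latent)
    d_space = pairwise_distances(coords)
    return float(np.mean((d_latent - d_space) ** 2))


def regularized_objective(neg_elbo, latent, coords, weight=1.0):
    """Negative ELBO plus a weighted distance penalty (to be minimized).
    `weight` is an assumed hyperparameter, not taken from the source."""
    return neg_elbo + weight * distance_mismatch(latent, coords)


# Toy data: three spots on a 2-D slide.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

# A rotated copy keeps every pairwise distance, so the penalty vanishes.
rotation = np.array([[0.0, -1.0], [1.0, 0.0]])
isometric_latent = coords @ rotation.T

# A collapsed latent space loses all spatial structure and is penalized.
collapsed_latent = np.zeros_like(coords)

print(distance_mismatch(isometric_latent, coords))
print(distance_mismatch(collapsed_latent, coords))
```

The penalty depends only on pairwise distances, so any rotation or translation of the latent space scores the same. That matches the abstract's stated goal of a similar pair-wise distance structure rather than identical coordinates.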
Related papers
- Re-Visible Dual-Domain Self-Supervised Deep Unfolding Network for MRI Reconstruction [48.30341580103962]
We propose a novel re-visible dual-domain self-supervised deep unfolding network to address these issues.
We design a deep unfolding network based on Chambolle and Pock Proximal Point Algorithm (DUN-CP-PPA) to achieve end-to-end reconstruction.
Experiments conducted on the fastMRI and IXI datasets demonstrate that our method significantly outperforms state-of-the-art approaches in terms of reconstruction performance.
arXiv Detail & Related papers (2025-01-07T12:29:32Z) - RankByGene: Gene-Guided Histopathology Representation Learning Through Cross-Modal Ranking Consistency [11.813883157319381]
We propose a novel framework that aligns gene and image features using a ranking-based alignment loss.
To further enhance the alignment's stability, we employ self-supervised knowledge distillation with a teacher-student network architecture.
arXiv Detail & Related papers (2024-11-22T17:08:28Z) - What makes for good morphology representations for spatial omics? [1.4298574812790055]
We introduce a framework for categorizing spatial omics-morphology combination methods.
By translation we mean finding morphological features that spatially correlate with gene expression patterns.
By integration we mean finding morphological features that spatially complement gene expression patterns.
arXiv Detail & Related papers (2024-07-30T08:52:51Z) - Revisiting Adaptive Cellular Recognition Under Domain Shifts: A Contextual Correspondence View [49.03501451546763]
We identify the importance of implicit correspondences across biological contexts for exploiting domain-invariant pathological composition.
We propose self-adaptive dynamic distillation to secure instance-aware trade-offs across different model constituents.
arXiv Detail & Related papers (2024-07-14T04:41:16Z) - Semantically Rich Local Dataset Generation for Explainable AI in Genomics [0.716879432974126]
Black box deep learning models trained on genomic sequences excel at predicting the outcomes of different gene regulatory mechanisms.
We propose using Genetic Programming to generate datasets by evolving perturbations in sequences that contribute to their semantic diversity.
arXiv Detail & Related papers (2024-07-03T10:31:30Z) - VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling [60.91599380893732]
VQDNA is a general-purpose framework that renovates genome tokenization from the perspective of genome vocabulary learning.
By leveraging vector-quantized codebooks as learnable vocabulary, VQDNA can adaptively tokenize genomes into pattern-aware embeddings.
arXiv Detail & Related papers (2024-05-13T20:15:03Z) - stMCDI: Masked Conditional Diffusion Model with Graph Neural Network for Spatial Transcriptomics Data Imputation [8.211887623977214]
We introduce stMCDI, a novel conditional diffusion model for spatial transcriptomics data imputation.
It employs a denoising network trained using randomly masked data portions as guidance, with the unmasked data serving as conditions.
The results obtained from spatial transcriptomics datasets elucidate the performance of our methods relative to existing approaches.
arXiv Detail & Related papers (2024-03-16T09:06:38Z) - Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z) - VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z) - Linking data separation, visual separation, and classifier performance
using pseudo-labeling by contrastive learning [125.99533416395765]
We argue that the performance of the final classifier depends on the data separation present in the latent space and visual separation present in the projection.
We demonstrate our results by the classification of five real-world challenging image datasets of human intestinal parasites with only 1% supervised samples.
arXiv Detail & Related papers (2023-02-06T10:01:38Z) - Generation of non-stationary stochastic fields using Generative
Adversarial Networks with limited training data [0.0]
In this work, we investigate the problem of training Generative Adversarial Networks (GANs) models against a dataset of geological channelized patterns.
The developed training method allowed for effective learning of the correlation between the spatial conditions.
Our models were able to generate geologically-plausible realizations beyond the training samples with a strong correlation with the target maps.
arXiv Detail & Related papers (2022-05-11T13:09:47Z) - Self-Supervised Graph Representation Learning for Neuronal Morphologies [75.38832711445421]
We present GraphDINO, a data-driven approach to learn low-dimensional representations of 3D neuronal morphologies from unlabeled datasets.
We show, in two different species and across multiple brain areas, that this method yields morphological cell type clusterings on par with manual feature-based classification by experts.
Our method could potentially enable data-driven discovery of novel morphological features and cell types in large-scale datasets.
arXiv Detail & Related papers (2021-12-23T12:17:47Z) - Cyclic Graph Attentive Match Encoder (CGAME): A Novel Neural Network For OD Estimation [8.398623478484248]
Origin-Destination (OD) estimation plays an important role in traffic management and traffic simulation in the era of Intelligent Transportation Systems (ITS).
Previous model-based methods face an under-determined problem and therefore depend heavily on additional assumptions and extra data.
We propose Cyclic Graph Attentive Matching (C-GAME), based on a novel Graph Matcher with a double-layer attention mechanism.
arXiv Detail & Related papers (2021-11-26T08:57:21Z) - Spatial machine-learning model diagnostics: a model-agnostic distance-based approach [91.62936410696409]
This contribution proposes spatial prediction error profiles (SPEPs) and spatial variable importance profiles (SVIPs) as novel model-agnostic assessment and interpretation tools.
The SPEPs and SVIPs of geostatistical methods, linear models, random forest, and hybrid algorithms show striking differences and also relevant similarities.
The novel diagnostic tools enrich the toolkit of spatial data science, and may improve ML model interpretation, selection, and design.
arXiv Detail & Related papers (2021-11-13T01:50:36Z) - All You Need is Color: Image based Spatial Gene Expression Prediction using Neural Stain Learning [11.9045433112067]
We propose a "stain-aware" machine learning approach for prediction of spatial transcriptomic gene expression profiles.
We have found that the gene expression predictions from the proposed approach show higher correlations with true expression values obtained through sequencing.
arXiv Detail & Related papers (2021-08-23T23:43:38Z) - Reprogramming Language Models for Molecular Representation Learning [65.00999660425731]
We propose Representation Reprogramming via Dictionary Learning (R2DL) for adversarially reprogramming pretrained language models for molecular learning tasks.
The adversarial program learns a linear transformation between a dense source model input space (language data) and a sparse target model input space (e.g., chemical and biological molecule data) using a k-SVD solver.
R2DL achieves the baseline established by state of the art toxicity prediction models trained on domain-specific data and outperforms the baseline in a limited training-data setting.
arXiv Detail & Related papers (2020-12-07T05:50:27Z) - Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders [63.46738617561255]
We consider the problem of sparsifying the discrete latent space of a trained conditional variational autoencoder.
We use evidential theory to identify the latent classes that receive direct evidence from a particular input condition and filter out those that do not.
Experiments on diverse tasks, such as image generation and human behavior prediction, demonstrate the effectiveness of our proposed technique.
arXiv Detail & Related papers (2020-10-19T01:27:21Z) - Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso [102.84915019938413]
Non-invasive techniques like magnetoencephalography (MEG) or electroencephalography (EEG) hold promise for imaging brain activity.
However, the problem of source localization, or source imaging, poses a high-dimensional statistical inference challenge.
We propose an ensemble of clustered desparsified multi-task Lasso (ecd-MTLasso) to deal with this problem.
arXiv Detail & Related papers (2020-09-29T21:17:16Z) - Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)