Distance-Preserving Spatial Representations in Genomic Data
- URL: http://arxiv.org/abs/2408.00911v2
- Date: Sat, 04 Jan 2025 16:49:17 GMT
- Title: Distance-Preserving Spatial Representations in Genomic Data
- Authors: Wenbin Zhou, Jin-Hong Du
- Abstract summary: The spatial context of single-cell gene expression data is crucial for many downstream analyses, yet often remains inaccessible due to practical and technical limitations.
We propose a generic representation learning and transfer learning framework dp-VAE, capable of reconstructing the spatial coordinates associated with the provided gene expression data.
- Abstract: The spatial context of single-cell gene expression data is crucial for many downstream analyses, yet often remains inaccessible due to practical and technical limitations, restricting the utility of such datasets. In this paper, we propose a generic representation learning and transfer learning framework dp-VAE, capable of reconstructing the spatial coordinates associated with the provided gene expression data. Central to our approach is a distance-preserving regularizer integrated into the loss function during training, ensuring the model effectively captures and utilizes spatial context signals from reference datasets. During the inference stage, the produced latent representation of the model can be used to reconstruct or impute the spatial context of the provided gene expression by solving a constrained optimization problem. We also explore the theoretical connections between distance-preserving loss, distortion, and the bi-Lipschitz condition within generative models. Finally, we demonstrate the effectiveness of dp-VAE in different tasks involving training robustness, out-of-sample evaluation, and transfer learning inference applications by testing it over 27 publicly available datasets. This underscores its applicability to a wide range of genomics studies that were previously hindered by the absence of spatial data.
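As a rough illustration of the central idea, and not the authors' implementation, the following PyTorch-style sketch adds a distance-preserving penalty to a standard VAE objective: pairwise distances between latent codes in a batch are pushed toward the pairwise spatial distances of the reference coordinates. The function names and the weight lambda_dp are assumptions made for this sketch.

```python
# Minimal sketch (not the dp-VAE code): a VAE loss with a distance-preserving
# regularizer, assuming each training batch carries both gene expression and
# reference spatial coordinates.
import torch
import torch.nn.functional as F

def distance_preserving_penalty(z, coords):
    """Penalize mismatch between pairwise latent distances and pairwise
    spatial distances within a batch (illustrative only)."""
    d_latent = torch.cdist(z, z)           # (B, B) distances in latent space
    d_space = torch.cdist(coords, coords)  # (B, B) distances in physical space
    return F.mse_loss(d_latent, d_space)

def dp_vae_loss(x, x_recon, mu, logvar, z, coords, lambda_dp=1.0):
    """Standard VAE reconstruction + KL terms plus the distance-preserving penalty."""
    recon = F.mse_loss(x_recon, x, reduction="mean")
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl + lambda_dp * distance_preserving_penalty(z, coords)
```

At inference time the paper recovers spatial coordinates from the latent representation by solving a constrained optimization problem; the sketch above only covers the training-time regularizer.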
Related papers
- Re-Visible Dual-Domain Self-Supervised Deep Unfolding Network for MRI Reconstruction [48.30341580103962]
We propose a novel re-visible dual-domain self-supervised deep unfolding network to address these issues.
We design a deep unfolding network based on Chambolle and Pock Proximal Point Algorithm (DUN-CP-PPA) to achieve end-to-end reconstruction.
Experiments conducted on the fastMRI and IXI datasets demonstrate that our method significantly outperforms state-of-the-art approaches in terms of reconstruction performance.
arXiv Detail & Related papers (2025-01-07T12:29:32Z)
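The deep unfolding entry above unrolls the Chambolle and Pock primal-dual algorithm into learned layers. As a hedged reference point, here is a NumPy sketch of the classical (non-learned) iteration for min_x 0.5*||Kx - b||^2 + lam*||x||_1; it is not the DUN-CP-PPA network, and the problem instance is chosen only for illustration.

```python
# Generic Chambolle-Pock primal-dual iteration (classical, not learned),
# here for min_x 0.5*||K x - b||^2 + lam*||x||_1.  A deep unfolding network
# such as DUN-CP-PPA replaces fixed operators and step sizes with learned layers.
import numpy as np

def chambolle_pock(K, b, lam, n_iter=200):
    sigma = tau = 1.0 / np.linalg.norm(K, 2)  # step sizes with sigma*tau*||K||^2 <= 1
    x = np.zeros(K.shape[1])
    x_bar = x.copy()
    y = np.zeros(K.shape[0])
    for _ in range(n_iter):
        # dual step: prox of sigma*F* for F(y) = 0.5*||y - b||^2
        y = (y + sigma * (K @ x_bar) - sigma * b) / (1.0 + sigma)
        # primal step: prox of tau*lam*||.||_1 (soft-thresholding)
        x_new = x - tau * (K.T @ y)
        x_new = np.sign(x_new) * np.maximum(np.abs(x_new) - tau * lam, 0.0)
        # over-relaxation
        x_bar = 2.0 * x_new - x
        x = x_new
    return x
```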
- stMCDI: Masked Conditional Diffusion Model with Graph Neural Network for Spatial Transcriptomics Data Imputation [8.211887623977214]
We introduce stMCDI, a novel conditional diffusion model for spatial transcriptomics data imputation.
It employs a denoising network trained using randomly masked data portions as guidance, with the unmasked data serving as conditions.
Results on spatial transcriptomics datasets demonstrate the performance of our method relative to existing approaches.
arXiv Detail & Related papers (2024-03-16T09:06:38Z)
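The stMCDI entry above trains a denoising network with randomly masked portions of the data as targets and the unmasked portions as conditions. Below is a minimal NumPy sketch of such a masking split; it is an assumption about how the split could look, not the stMCDI code.

```python
# Minimal sketch (not the stMCDI code): randomly mask a fraction of a
# gene-expression matrix so the masked entries become imputation targets
# and the unmasked entries serve as the conditioning input.
import numpy as np

def random_mask_split(expr, mask_frac=0.3, rng=None):
    """Return (conditioning matrix, target matrix, boolean mask)."""
    rng = np.random.default_rng(rng)
    mask = rng.random(expr.shape) < mask_frac  # True = held out / to impute
    condition = np.where(mask, 0.0, expr)      # masked entries zeroed out
    target = np.where(mask, expr, 0.0)         # supervision only on masked entries
    return condition, target, mask
```

During training, the denoising network would be asked to reconstruct the target values at the masked positions given the conditioning matrix.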
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
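The distributional-reduction entry above is built around the Gromov-Wasserstein problem, which compares two spaces through their internal distance matrices. A minimal sketch using the POT library follows; the choice of POT, the data shapes, and the uniform weights are assumptions, not the paper's own solver.

```python
# Minimal sketch: compute a Gromov-Wasserstein coupling between a
# high-dimensional dataset and a small set of "reduced" points, using the
# POT library (https://pythonot.github.io).  Illustrative only.
import numpy as np
import ot  # POT: Python Optimal Transport

X = np.random.rand(100, 50)   # high-dimensional samples
Y = np.random.rand(10, 2)     # low-dimensional / reduced support

C1 = ot.dist(X, X)            # intra-space distance matrices
C2 = ot.dist(Y, Y)
p = ot.unif(X.shape[0])       # uniform weights on each space
q = ot.unif(Y.shape[0])

# Coupling that matches the two distance structures as closely as possible
T = ot.gromov.gromov_wasserstein(C1, C2, p, q, loss_fun="square_loss")
```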
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
Because the generator is nonlinear, the latent space provides an unsatisfactory projection of the data space, resulting in poor representation learning.
We show that geodesics, and their accurate computation, can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
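The VTAE entry above argues that geodesics on the learned manifold, rather than straight-line latent distances, improve deep generative models. The paper's own construction is not reproduced here; a common graph-based approximation (k-nearest-neighbour graph plus shortest paths, as in Isomap) is sketched below as a stand-in.

```python
# Minimal sketch: approximate geodesic distances between latent codes with
# a k-nearest-neighbour graph and shortest paths (Isomap-style), a common
# stand-in for true manifold geodesics.  Not the VTAE construction.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def approximate_geodesics(latent_codes, n_neighbors=10):
    """Return a matrix of graph-geodesic distances between latent codes."""
    graph = kneighbors_graph(latent_codes, n_neighbors, mode="distance")
    return shortest_path(graph, method="D", directed=False)
```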
- Linking data separation, visual separation, and classifier performance using pseudo-labeling by contrastive learning [125.99533416395765]
We argue that the performance of the final classifier depends on the data separation present in the latent space and visual separation present in the projection.
We demonstrate our results on the classification of five challenging real-world image datasets of human intestinal parasites using only 1% supervised samples.
arXiv Detail & Related papers (2023-02-06T10:01:38Z)
- Generation of non-stationary stochastic fields using Generative Adversarial Networks with limited training data [0.0]
In this work, we investigate the problem of training Generative Adversarial Networks (GANs) models against a dataset of geological channelized patterns.
The developed training method enables effective learning of correlations between spatial conditions.
Our models were able to generate geologically plausible realizations beyond the training samples, with a strong correlation with the target maps.
arXiv Detail & Related papers (2022-05-11T13:09:47Z)
- Cyclic Graph Attentive Match Encoder (CGAME): A Novel Neural Network For OD Estimation [8.398623478484248]
Origin-Destination (OD) estimation plays an important role in traffic management and traffic simulation in the era of Intelligent Transportation Systems (ITS).
Previous model-based approaches face an under-determined problem and therefore depend on additional assumptions and extra data.
We propose Cyclic Graph Attentive Matching (C-GAME) based on a novel Graph Matcher with double-layer attention mechanism.
arXiv Detail & Related papers (2021-11-26T08:57:21Z)
- Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders [63.46738617561255]
We consider the problem of sparsifying the discrete latent space of a trained conditional variational autoencoder.
We use evidential theory to identify the latent classes that receive direct evidence from a particular input condition and filter out those that do not.
Experiments on diverse tasks, such as image generation and human behavior prediction, demonstrate the effectiveness of our proposed technique.
arXiv Detail & Related papers (2020-10-19T01:27:21Z)
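The evidential-sparsification entry above filters out discrete latent classes that receive no direct evidence from the input condition. The following is a highly simplified stand-in for that filtering step, not the paper's evidential (Dempster-Shafer) construction; the thresholding rule and renormalization are assumptions.

```python
# Highly simplified sketch of the filtering idea: keep only the discrete
# latent classes that receive positive evidence from the input condition and
# renormalize the categorical posterior over the remaining classes.
# A stand-in, not the paper's Dempster-Shafer construction.
import numpy as np

def filter_latent_classes(class_probs, evidence, eps=1e-12):
    """class_probs: categorical distribution over discrete latent classes.
    evidence: nonnegative per-class evidence for the current condition."""
    keep = evidence > 0.0                       # classes with direct evidence
    sparse = np.where(keep, class_probs, 0.0)   # zero out unsupported classes
    return sparse / (sparse.sum() + eps)        # renormalize
```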
- Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso [102.84915019938413]
Magnetoencephalography (MEG) and electroencephalography (EEG) offer the promise of imaging brain activity non-invasively.
The problem of source localization, or source imaging, poses however a high-dimensional statistical inference challenge.
We propose an ensemble of desparsified multi-task Lasso (ecd-MTLasso) to deal with this problem.
arXiv Detail & Related papers (2020-09-29T21:17:16Z)
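The ecd-MTLasso entry above builds on the multi-task Lasso, whose coefficient rows are jointly sparse across time points. As a baseline illustration only (not the desparsified ensemble estimator), here is scikit-learn's MultiTaskLasso on simulated MEG/EEG-like data; all shapes and the regularization strength are assumptions.

```python
# Baseline illustration only: a plain multi-task Lasso (scikit-learn) on
# simulated MEG/EEG-like data, where rows of the coefficient matrix are
# jointly sparse across time points.  The desparsified ensemble estimator
# (ecd-MTLasso) adds a debiasing/statistical-control step not reproduced here.
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
n_sensors, n_sources, n_times = 60, 200, 20

G = rng.standard_normal((n_sensors, n_sources))      # forward (gain) matrix
W_true = np.zeros((n_sources, n_times))
W_true[:5] = rng.standard_normal((5, n_times))       # a few active sources
M = G @ W_true + 0.1 * rng.standard_normal((n_sensors, n_times))  # measurements

model = MultiTaskLasso(alpha=0.1).fit(G, M)
active = np.flatnonzero(np.any(model.coef_.T, axis=1))  # estimated active sources
print("estimated active sources:", active)
```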