The Deep Generative Decoder: MAP estimation of representations improves modeling of single-cell RNA data
- URL: http://arxiv.org/abs/2110.06672v3
- Date: Wed, 12 Jul 2023 14:13:24 GMT
- Title: The Deep Generative Decoder: MAP estimation of representations improves modeling of single-cell RNA data
- Authors: Viktoria Schuster and Anders Krogh
- Abstract summary: We present a simple generative model that computes model parameters and representations directly via maximum a posteriori (MAP) estimation.
The advantages of this approach are its simplicity and its capability to provide representations of much smaller dimensionality than a comparable VAE.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning low-dimensional representations of single-cell transcriptomics has
become instrumental to its downstream analysis. The state of the art is
currently represented by neural network models such as variational autoencoders
(VAEs) which use a variational approximation of the likelihood for inference.
We here present the Deep Generative Decoder (DGD), a simple generative model
that computes model parameters and representations directly via maximum a
posteriori (MAP) estimation. Unlike VAEs, which typically use a fixed Gaussian
distribution because of the complexity of adding other types, the DGD handles
complex parameterized latent distributions naturally. We first show its general
functionality on a commonly used benchmark set, Fashion-MNIST.
Secondly, we apply the model to multiple single-cell data sets. Here the DGD
learns low-dimensional, meaningful and well-structured latent representations
with sub-clustering beyond the provided labels. The advantages of this approach
are its simplicity and its capability to provide representations of much
smaller dimensionality than a comparable VAE.
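To make the MAP scheme concrete, here is a minimal, hypothetical PyTorch sketch (not the authors' implementation): each sample's representation is a free parameter optimized jointly with the decoder weights to maximize log p(x|z) + log p(z). A standard normal prior and a Gaussian likelihood stand in for the paper's parameterized latent distribution and count-data likelihoods; all names, sizes, and data are placeholders.

```python
import torch
import torch.nn as nn

# Hypothetical DGD-style sketch: per-sample representations are free parameters
# optimized by MAP jointly with the decoder. A standard normal prior stands in
# for the paper's more flexible parameterized latent distribution.
n_samples, data_dim, latent_dim = 1000, 50, 2
x = torch.randn(n_samples, data_dim)                   # placeholder data

decoder = nn.Sequential(
    nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, data_dim)
)
z = nn.Parameter(torch.zeros(n_samples, latent_dim))   # one representation per sample

opt = torch.optim.Adam([{"params": decoder.parameters()}, {"params": [z]}], lr=1e-2)
for step in range(500):
    opt.zero_grad()
    recon = decoder(z)
    log_lik = -((recon - x) ** 2).sum()                # Gaussian log-likelihood (up to const.)
    log_prior = -0.5 * (z ** 2).sum()                  # log N(z; 0, I) (up to const.)
    loss = -(log_lik + log_prior)                      # negative joint log posterior
    loss.backward()
    opt.step()
```

In such decoder-only models, a representation for a new sample would typically be obtained the same way: freezing the decoder and running the MAP optimization over that sample's z alone.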
Related papers
- Scalable Amortized GPLVMs for Single Cell Transcriptomics Data [9.010523724015398]
Dimensionality reduction is crucial for analyzing large-scale single-cell RNA-seq data.
We introduce an improved model, an amortized variational Bayesian GPLVM (BGPLVM).
BGPLVM is tailored for single-cell RNA-seq with specialized encoder, kernel, and likelihood designs.
arXiv Detail & Related papers (2024-05-06T21:54:38Z)
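For intuition on the model above, here is a minimal, non-amortized GPLVM sketch in PyTorch: latent coordinates are optimized directly against the Gaussian-process marginal likelihood of the data. The paper's amortized BGPLVM adds an encoder, inducing points, and scRNA-seq-specific kernel and likelihood designs not shown here; all values are toy placeholders.

```python
import torch

# Hypothetical minimal (non-amortized) GPLVM: optimize latent coordinates Z to
# maximize the GP marginal likelihood of the observed matrix Y.
def rbf_kernel(Z, lengthscale=1.0, variance=1.0):
    d2 = torch.cdist(Z, Z).pow(2)
    return variance * torch.exp(-0.5 * d2 / lengthscale ** 2)

n, data_dim, latent_dim = 200, 30, 2
Y = torch.randn(n, data_dim)                           # placeholder expression matrix
Z = torch.nn.Parameter(0.1 * torch.randn(n, latent_dim))
noise = 0.1

opt = torch.optim.Adam([Z], lr=1e-2)
for step in range(200):
    opt.zero_grad()
    K = rbf_kernel(Z) + noise * torch.eye(n)
    L = torch.linalg.cholesky(K)
    alpha = torch.cholesky_solve(Y, L)                 # K^{-1} Y
    # Negative GP log marginal likelihood, summed over output dimensions
    nll = 0.5 * (Y * alpha).sum() + data_dim * torch.log(torch.diag(L)).sum()
    nll.backward()
    opt.step()
```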
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
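As a rough illustration of the latent quantization idea above, the sketch below snaps each latent dimension to its nearest value in a small learnable per-dimension codebook, using a straight-through estimator for gradients. The class and its parameters are hypothetical; the paper's full method includes additional losses and regularization.

```python
import torch
import torch.nn as nn

# Hypothetical per-dimension latent quantizer with a straight-through estimator.
class LatentQuantizer(nn.Module):
    def __init__(self, latent_dim=8, values_per_dim=10):
        super().__init__()
        # One small, learnable scalar codebook per latent dimension
        self.codebooks = nn.Parameter(torch.linspace(-1, 1, values_per_dim)
                                      .repeat(latent_dim, 1))

    def forward(self, z):                              # z: (batch, latent_dim)
        # Distance of each latent scalar to its dimension's code values
        dists = (z.unsqueeze(-1) - self.codebooks.unsqueeze(0)).abs()
        idx = dists.argmin(dim=-1)                     # nearest code per dimension
        zq = torch.gather(self.codebooks.expand(z.shape[0], -1, -1),
                          2, idx.unsqueeze(-1)).squeeze(-1)
        return z + (zq - z).detach()                   # straight-through estimator

zq = LatentQuantizer()(torch.randn(4, 8))              # toy usage
```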
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space provides an unsatisfactory projection of the data space, which results in poor representation learning.
We show that computing geodesics accurately in the latent space can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
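A common way to approximate latent-space geodesics for a generative model, sketched below under broad assumptions, is to parameterize a path between two latent points and minimize the discrete energy of the decoded curve. This illustrates the generic idea only; VTAE's transformer architecture and exact procedure differ.

```python
import torch
import torch.nn as nn

# Hypothetical geodesic sketch: optimize interior path points so the decoded
# curve between two latent endpoints has minimal discrete energy.
decoder = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 784))

z_start, z_end = torch.randn(2), torch.randn(2)
# Interior points of the path, initialized on the straight line
t = torch.linspace(0, 1, 12)[1:-1].unsqueeze(1)
path = nn.Parameter(z_start * (1 - t) + z_end * t)

opt = torch.optim.Adam([path], lr=1e-2)
for step in range(300):
    opt.zero_grad()
    pts = torch.cat([z_start.unsqueeze(0), path, z_end.unsqueeze(0)])
    decoded = decoder(pts)
    energy = (decoded[1:] - decoded[:-1]).pow(2).sum()  # discrete curve energy
    energy.backward()
    opt.step()
```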
- RENs: Relevance Encoding Networks [0.0]
This paper proposes relevance encoding networks (RENs): a novel probabilistic VAE-based framework that uses the automatic relevance determination (ARD) prior in the latent space to learn the data-specific bottleneck dimensionality.
We show that the proposed model learns the relevant latent bottleneck dimensionality without compromising the representation and generation quality of the samples.
arXiv Detail & Related papers (2022-05-25T21:53:48Z)
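The ARD idea above can be sketched as a learnable per-dimension prior variance in the VAE's KL term: dimensions whose learned variance collapses contribute little and can be pruned. The function below is a hypothetical illustration; RENs' exact parameterization may differ.

```python
import torch

# Hypothetical ARD-style prior for a VAE: per-dimension prior variances alpha
# are learned, so irrelevant latent dimensions can be identified and pruned.
def kl_ard(mu, logvar, log_alpha):
    # KL( N(mu, sigma^2) || N(0, alpha) ), summed over latent dimensions
    var, alpha = logvar.exp(), log_alpha.exp()
    return 0.5 * (log_alpha - logvar + (var + mu ** 2) / alpha - 1.0).sum(dim=-1)

latent_dim = 16
log_alpha = torch.nn.Parameter(torch.zeros(latent_dim))  # learned relevances
mu, logvar = torch.zeros(4, latent_dim), torch.zeros(4, latent_dim)
print(kl_ard(mu, logvar, log_alpha))                      # zeros at this init
```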
- uGLAD: Sparse graph recovery by optimizing deep unrolled networks [11.48281545083889]
We present a novel technique to perform sparse graph recovery by optimizing deep unrolled networks.
Our model, uGLAD, builds upon and extends the state-of-the-art model GLAD to the unsupervised setting.
We evaluate model results on synthetic Gaussian data, non-Gaussian data generated from Gene Regulatory Networks, and present a case study in anaerobic digestion.
arXiv Detail & Related papers (2022-05-23T20:20:27Z)
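Deep unrolling, as used above, turns the iterations of an optimization algorithm into network layers with learnable hyperparameters. The sketch below unrolls a proximal-gradient recursion for graphical-lasso-style sparse precision recovery with learned step sizes and thresholds; it is a generic stand-in, not GLAD/uGLAD's actual update equations.

```python
import torch
import torch.nn as nn

# Hypothetical unrolled proximal-gradient "network" for sparse precision
# recovery: each layer is one iteration with its own learnable hyperparameters.
class UnrolledGlasso(nn.Module):
    def __init__(self, n_iters=10):
        super().__init__()
        self.steps = nn.Parameter(torch.full((n_iters,), 0.05))   # learned step sizes
        self.thresh = nn.Parameter(torch.full((n_iters,), 0.01))  # learned thresholds

    def forward(self, S):                              # S: empirical covariance (d, d)
        theta = torch.eye(S.shape[0])
        for eta, lam in zip(self.steps, self.thresh):
            grad = S - torch.linalg.inv(theta)         # grad of -logdet(theta) + tr(S theta)
            theta = theta - eta * grad
            # Soft-threshold off-diagonal entries to promote sparsity
            off = torch.sign(theta) * torch.clamp(theta.abs() - lam, min=0.0)
            theta = off + torch.diag(torch.diag(theta) - torch.diag(off))
        return theta

S = torch.cov(torch.randn(200, 5).T)                   # toy empirical covariance
theta_hat = UnrolledGlasso()(S)
```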
- Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
arXiv Detail & Related papers (2022-01-08T00:47:50Z)
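The core trick above can be sketched in a few lines: if a structured model's n-by-n transition matrix is constrained to a rank-r factorization U V, each step of a forward-style recursion drops from O(n^2) to O(n r). Everything below is a toy illustration.

```python
import torch

# Hypothetical low-rank sketch: replace a large transition matrix T (n x n)
# with U @ V (n x r, r x n), so each recursion step is two thin mat-vecs.
n, r, steps = 4096, 32, 10
U, V = torch.rand(n, r), torch.rand(r, n)
emission = torch.rand(steps, n)                        # toy per-step emission scores

alpha = torch.full((n,), 1.0 / n)                      # uniform initial distribution
for t in range(steps):
    alpha = (alpha @ U) @ V                            # low-rank "matrix-vector" product
    alpha = alpha * emission[t]
    alpha = alpha / alpha.sum()                        # normalize to avoid underflow
```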
- Exponentially Tilted Gaussian Prior for Variational Autoencoder [3.52359746858894]
Recent studies show that probabilistic generative models can perform poorly at detecting out-of-distribution inputs.
We propose the exponentially tilted Gaussian prior distribution for the Variational Autoencoder (VAE).
We show that our model produces high-quality image samples that are crisper than those of a standard Gaussian VAE.
arXiv Detail & Related papers (2021-11-30T18:28:19Z)
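Assuming a prior of the form p(z) ∝ exp(tau * ||z||) N(z; 0, I), the VAE's KL term for the model above decomposes into the standard Gaussian KL minus tau * E_q[||z||], plus a constant log-normalizer that does not affect gradients. The sketch below estimates the expectation by Monte Carlo; the paper's exact formulation may differ.

```python
import torch

# Hypothetical KL term for a tilted prior p(z) ∝ exp(tau * ||z||) N(z; 0, I):
# standard Gaussian KL minus tau * E_q[||z||] (up to the constant log Z_tau).
def tilted_kl(mu, logvar, tau=2.0, n_mc=64):
    std = (0.5 * logvar).exp()
    kl_std = 0.5 * (mu ** 2 + logvar.exp() - 1.0 - logvar).sum(dim=-1)
    z = mu.unsqueeze(0) + std.unsqueeze(0) * torch.randn(n_mc, *mu.shape)
    exp_norm = z.norm(dim=-1).mean(dim=0)              # Monte Carlo E_q[||z||]
    return kl_std - tau * exp_norm

mu, logvar = torch.zeros(8, 16), torch.zeros(8, 16)
print(tilted_kl(mu, logvar))
```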
- Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
- Multivariate Data Explanation by Jumping Emerging Patterns Visualization [78.6363825307044]
We present VAX (multiVariate dAta eXplanation), a new VA method to support the identification and visual interpretation of patterns in multivariate data sets.
Unlike existing similar approaches, VAX uses the concept of Jumping Emerging Patterns to identify and aggregate several diversified patterns, producing explanations through logic combinations of data variables.
arXiv Detail & Related papers (2021-06-21T13:49:44Z)
- Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized Linear Latent Variable models (GLLVMs) generalize Gaussian factor models to non-Gaussian responses.
Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets.
We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
arXiv Detail & Related papers (2020-10-06T04:28:19Z)
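The generalized-matrix-factorization idea above can be sketched as fitting a Poisson factor model X ~ Poisson(exp(U V^T)) by direct gradient descent on the negative log-likelihood; the paper's penalized quasi-likelihood algorithm is a more scalable approach to this kind of objective. All sizes and data below are toy placeholders.

```python
import torch

# Hypothetical Poisson GLLVM fit by gradient descent: X ~ Poisson(exp(U @ V^T)).
n, p, k = 500, 100, 5
X = torch.poisson(torch.ones(n, p) * 2.0)              # toy count matrix

U = torch.nn.Parameter(0.01 * torch.randn(n, k))       # latent scores
V = torch.nn.Parameter(0.01 * torch.randn(p, k))       # loadings

opt = torch.optim.Adam([U, V], lr=5e-2)
for step in range(300):
    opt.zero_grad()
    eta = U @ V.T                                      # linear predictor (log link)
    nll = (torch.exp(eta) - X * eta).sum()             # Poisson NLL up to a constant
    nll.backward()
    opt.step()
```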
- tvGP-VAE: Tensor-variate Gaussian Process Prior Variational Autoencoder [0.0]
tvGP-VAE is able to explicitly model correlation via the use of kernel functions.
We show that the choice of which correlation structures to explicitly represent in the latent space has a significant impact on model performance.
arXiv Detail & Related papers (2020-06-08T17:59:13Z)