Toward the Identifiability of Comparative Deep Generative Models
- URL: http://arxiv.org/abs/2401.15903v1
- Date: Mon, 29 Jan 2024 06:10:54 GMT
- Title: Toward the Identifiability of Comparative Deep Generative Models
- Authors: Romain Lopez, Jan-Christian Huetter, Ehsan Hajiramezanali, Jonathan Pritchard, and Aviv Regev
- Abstract summary: We propose a theory of identifiability for comparative Deep Generative Models (DGMs).
We show that, while these models lack identifiability across a general class of mixing functions, they surprisingly become identifiable when the mixing function is piecewise affine.
We also investigate the impact of model misspecification, and empirically show that previously proposed regularization techniques for fitting comparative DGMs help with identifiability when the number of latent variables is not known in advance.
- Score: 7.5479347719819865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Generative Models (DGMs) are versatile tools for learning data
representations while adequately incorporating domain knowledge such as the
specification of conditional probability distributions. Recently proposed DGMs
tackle the important task of comparing data sets from different sources. One
such example is the setting of contrastive analysis that focuses on describing
patterns that are enriched in a target data set compared to a background data
set. The practical deployment of these models often assumes that DGMs naturally
infer interpretable and modular latent representations, which is known to be an
issue in practice. Consequently, existing methods often rely on ad-hoc
regularization schemes, albeit without theoretical grounding. Here, we
propose a theory of identifiability for comparative DGMs by extending recent
advances in the field of non-linear independent component analysis. We show
that, while these models lack identifiability across a general class of mixing
functions, they surprisingly become identifiable when the mixing function is
piecewise affine (e.g., parameterized by a ReLU neural network). We also
investigate the impact of model misspecification, and empirically show that
previously proposed regularization techniques for fitting comparative DGMs help
with identifiability when the number of latent variables is not known in
advance. Finally, we introduce a novel methodology for fitting comparative DGMs
that improves the treatment of multiple data sources via multi-objective
optimization and that helps adjust the regularization hyperparameter in
an interpretable manner, using constrained optimization. We empirically
validate our theory and new methodology using simulated data as well as a
recent data set of genetic perturbations in cells profiled via single-cell RNA
sequencing.
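To make the setting concrete, below is a minimal sketch of a contrastive-analysis VAE in PyTorch, assuming background latents z shared by both data sets and salient latents s that are switched off for the background data; the decoder is a ReLU network, i.e., a piecewise-affine mixing function of the kind covered by the identifiability result. All names and architectural choices here are illustrative assumptions, not the authors' implementation, and the paper's multi-objective and constrained-optimization machinery is omitted.

```python
# Illustrative sketch of a contrastive-analysis VAE (not the authors' code).
# Background latents z are shared by both data sets; salient latents s are
# active only for the target data. The ReLU decoder is a piecewise-affine
# mixing function, the class covered by the identifiability result.
import torch
import torch.nn as nn

class ContrastiveVAE(nn.Module):
    def __init__(self, x_dim, z_dim, s_dim, hidden=128):
        super().__init__()
        self.z_dim, self.s_dim = z_dim, s_dim
        self.encoder = nn.Sequential(
            nn.Linear(x_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (z_dim + s_dim)),  # means and log-variances
        )
        self.decoder = nn.Sequential(                # piecewise-affine mixing
            nn.Linear(z_dim + s_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, x_dim),
        )

    def elbo(self, x, is_target):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        latent = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        z, s = latent.split([self.z_dim, self.s_dim], dim=-1)
        if not is_target:
            s = torch.zeros_like(s)  # salient latents off for background data
        recon = self.decoder(torch.cat([z, s], dim=-1))
        rec = ((x - recon) ** 2).sum(-1)  # Gaussian likelihood, up to constants
        kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(-1)
        return -(rec + kl).mean()
```

In the simplest training loop the background and target ELBOs would just be summed; the methodology above instead balances the two data sources via multi-objective optimization and tunes the regularization weight through a constrained formulation.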
Related papers
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based Test-Time Adaptation (MITA), which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
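As a purely hypothetical sketch of what energy-based mutual adaptation from opposing directions can look like: the snippet below first nudges the test inputs toward lower energy under the current model, then updates the model on the adapted inputs. The energy function, step sizes, and update order are our own assumptions, not MITA's actual objective.

```python
# Hypothetical sketch of energy-based mutual adaptation at test time.
# This illustrates the general idea only; it is not MITA's algorithm.
import torch

def mutual_adaptation_step(model, energy_fn, x, lr_model=1e-4, lr_data=0.1):
    # (1) Adapt the data toward the model: one gradient step that lowers
    #     the energy the current model assigns to the test inputs.
    x_adapted = x.detach().requires_grad_(True)
    grad_x, = torch.autograd.grad(energy_fn(model, x_adapted).sum(), x_adapted)
    x_adapted = (x_adapted - lr_data * grad_x).detach()

    # (2) Adapt the model toward the data from the opposite direction.
    opt = torch.optim.SGD(model.parameters(), lr=lr_model)
    opt.zero_grad()
    energy_fn(model, x_adapted).mean().backward()
    opt.step()
    return x_adapted

# A common energy for a classifier: negative log-sum-exp of the logits.
classifier_energy = lambda m, x: -torch.logsumexp(m(x), dim=-1)
```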
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) approximates the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches in various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
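For conditionally independent observations, the tall-data posterior factorizes as p(θ | x_1..n) ∝ p(θ) ∏_i p(x_i | θ), so its score decomposes into per-observation posterior scores minus (n − 1) copies of the prior score; a diffusion sampler applies this composition, approximately, at every noise level. A minimal sketch, with posterior_score and prior_score standing in for learned score networks (hypothetical callables, not the paper's API):

```python
import numpy as np

def tall_data_score(theta, observations, posterior_score, prior_score):
    # grad log p(theta | x_1..n)
    #   = sum_i grad log p(theta | x_i) - (n - 1) * grad log p(theta)
    n = len(observations)
    total = sum(posterior_score(theta, x) for x in observations)
    return total - (n - 1) * prior_score(theta)
```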
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Bayesian tomography using polynomial chaos expansion and deep generative networks [0.0]
We present a strategy combining the excellent reconstruction performance of a variational autoencoder (VAE) with the accuracy of PCA-PCE surrogate modeling.
Within the MCMC process, the parametrization of the VAE is leveraged for prior exploration and sample proposals.
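A minimal sketch of MCMC run in the latent space of a VAE, assuming decode maps latents to model parameters and surrogate_loglik is a cheap PCE-style approximation of the forward model; both names are illustrative, not the paper's API.

```python
import numpy as np

def latent_mcmc(decode, surrogate_loglik, n_steps=10_000, z_dim=20, step=0.1):
    rng = np.random.default_rng(0)
    z = rng.standard_normal(z_dim)
    logp = surrogate_loglik(decode(z)) - 0.5 * z @ z   # standard-normal VAE prior
    samples = []
    for _ in range(n_steps):
        z_new = z + step * rng.standard_normal(z_dim)  # proposal in latent space
        logp_new = surrogate_loglik(decode(z_new)) - 0.5 * z_new @ z_new
        if np.log(rng.uniform()) < logp_new - logp:    # Metropolis accept/reject
            z, logp = z_new, logp_new
        samples.append(decode(z))
    return np.array(samples)
```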
arXiv Detail & Related papers (2023-07-09T16:44:37Z)
- Deep Generative Modeling on Limited Data with Regularization by Nontransferable Pre-trained Models [32.52492468276371]
We propose the regularized deep generative model (Reg-DGM) to reduce the variance of generative modeling with limited data.
Reg-DGM uses a pre-trained model to optimize a weighted sum of a certain divergence and the expectation of an energy function.
Empirically, with various pre-trained feature extractors and a data-dependent energy function, Reg-DGM consistently improves the generation performance of strong DGMs with limited data.
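The objective described above can be written as a weighted sum of a divergence term and the expected energy of model samples under a pre-trained feature extractor. A schematic sketch with placeholder names (not the paper's API):

```python
def reg_dgm_loss(divergence_term, model_samples, feature_extractor, energy, lam=0.1):
    # divergence_term: the generative model's usual loss (e.g. adversarial or
    # likelihood-based) computed on the limited training data.
    # The regularizer is the expected energy of model samples in the feature
    # space of a pre-trained, possibly nontransferable, network.
    reg = energy(feature_extractor(model_samples)).mean()
    return divergence_term + lam * reg
```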
arXiv Detail & Related papers (2022-08-30T10:28:50Z)
- ER: Equivariance Regularizer for Knowledge Graph Completion [107.51609402963072]
We propose a new regularizer, namely, the Equivariance Regularizer (ER).
ER can enhance the generalization ability of the model by employing the semantic equivariance between the head and tail entities.
The experimental results indicate a clear and substantial improvement over the state-of-the-art relation prediction methods.
arXiv Detail & Related papers (2022-06-24T08:18:05Z)
- Scalable Regularised Joint Mixture Models [2.0686407686198263]
In many applications, data can be heterogeneous in the sense of spanning latent groups with different underlying distributions.
We propose an approach for heterogeneous data that allows joint learning of (i) explicit multivariate feature distributions, (ii) high-dimensional regression models and (iii) latent group labels.
The approach is demonstrably effective in high dimensions, combining data reduction for computational efficiency with a re-weighting scheme that retains key signals even when the number of features is large.
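An EM-style sketch of the generic alternation behind such a joint mixture: each latent group k carries both a feature distribution p(x | k) and a regression model p(y | x, k), and responsibilities come from the joint likelihood. This shows only the basic scheme, not the paper's scalable, regularised estimator or its re-weighting.

```python
import numpy as np

def e_step(X, y, comps):
    # comps[k]: dict(pi, mu, var, w, sigma2) = diagonal Gaussian features
    # plus a linear-Gaussian regression head for group k.
    log_r = []
    for c in comps:
        feat = -0.5 * (((X - c["mu"]) ** 2) / c["var"]
                       + np.log(2 * np.pi * c["var"])).sum(1)
        resid = y - X @ c["w"]
        reg = -0.5 * (resid ** 2 / c["sigma2"] + np.log(2 * np.pi * c["sigma2"]))
        log_r.append(np.log(c["pi"]) + feat + reg)   # joint p(x|k) p(y|x,k) pi_k
    log_r = np.stack(log_r, axis=1)
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    return r / r.sum(axis=1, keepdims=True)

def m_step(X, y, r, ridge=1e-2):
    comps = []
    for k in range(r.shape[1]):
        w_k = r[:, k]
        mu = (w_k[:, None] * X).sum(0) / w_k.sum()
        var = (w_k[:, None] * (X - mu) ** 2).sum(0) / w_k.sum() + 1e-6
        A = X.T @ (w_k[:, None] * X) + ridge * np.eye(X.shape[1])
        w = np.linalg.solve(A, X.T @ (w_k * y))      # weighted ridge regression
        sigma2 = (w_k * (y - X @ w) ** 2).sum() / w_k.sum() + 1e-6
        comps.append(dict(pi=w_k.mean(), mu=mu, var=var, w=w, sigma2=sigma2))
    return comps
```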
arXiv Detail & Related papers (2022-05-03T13:38:58Z)
- BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery [97.79015388276483]
A structural equation model (SEM) is an effective framework to reason over causal relationships represented via a directed acyclic graph (DAG).
Recent advances enabled effective maximum-likelihood point estimation of DAGs from observational data.
We propose BCD Nets, a variational framework for estimating a distribution over DAGs characterizing a linear-Gaussian SEM.
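For reference, the linear-Gaussian SEM in question writes each variable as a linear function of its parents plus Gaussian noise, x = Bᵀx + ε, where the sparsity pattern of B is a DAG. BCD Nets place a variational distribution over the SEM parameters; the sketch below shows only sampling and the exact log-likelihood for a fixed B (for a DAG, det(I − B) = 1, so no Jacobian term appears).

```python
import numpy as np

def sample_sem(B, noise_std, n, seed=0):
    # x = B^T x + eps  =>  x = (I - B^T)^{-1} eps; rows of X are samples.
    d = B.shape[0]
    eps = np.random.default_rng(seed).standard_normal((n, d)) * noise_std
    return eps @ np.linalg.inv(np.eye(d) - B)

def sem_loglik(X, B, noise_std):
    resid = X - X @ B                     # recover eps = x - B^T x
    return -0.5 * ((resid / noise_std) ** 2
                   + np.log(2 * np.pi * noise_std ** 2)).sum()
```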
arXiv Detail & Related papers (2021-12-06T03:35:21Z)
- Optimal regularizations for data generation with probabilistic graphical models [0.0]
Empirically, well-chosen regularization schemes dramatically improve the quality of the inferred models.
We consider the particular case of L2 and L1 regularizations in the Maximum A Posteriori (MAP) inference of generative pairwise graphical models.
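Concretely, the MAP estimate maximizes the log-likelihood minus L2 and L1 penalties on the pairwise couplings. A sketch for the Gaussian case, where the couplings form a precision matrix J and S is the empirical covariance (the Gaussian choice is ours, for concreteness):

```python
import numpy as np

def map_objective(J, S, l1=0.0, l2=0.0):
    # Average Gaussian log-likelihood, up to constants: 0.5 * (logdet J - tr(S J)).
    _, logdet = np.linalg.slogdet(J)
    loglik = 0.5 * (logdet - np.trace(S @ J))
    off = J - np.diag(np.diag(J))         # penalize off-diagonal couplings only
    return loglik - l2 * (off ** 2).sum() - l1 * np.abs(off).sum()
```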
arXiv Detail & Related papers (2021-12-02T14:45:16Z)
- Nonparametric Functional Analysis of Generalized Linear Models Under Nonlinear Constraints [0.0]
This article introduces a novel nonparametric methodology for Generalized Linear Models.
It combines the strengths of the binary regression and latent variable formulations for categorical data.
It extends and generalizes recently published parametric versions of the methodology.
arXiv Detail & Related papers (2021-10-11T04:49:59Z)
- Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics [61.49826776409194]
We analyze a corpus of models made publicly available for a contest to predict the generalization accuracy of neural network (NN) models.
We identify what amounts to a Simpson's paradox: "scale" metrics perform well overall but poorly on subpartitions of the data.
We present two novel shape metrics, one data-independent, and the other data-dependent, which can predict trends in the test accuracy of a series of NNs.
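To illustrate the distinction (with metric choices that are ours, not necessarily the paper's): a scale metric such as the Frobenius norm tracks the overall magnitude of a weight matrix, while a shape metric such as a power-law exponent fitted to the eigenvalue spectrum is invariant to rescaling.

```python
import numpy as np

def scale_metric(W):
    return np.linalg.norm(W)                       # grows with the scale of W

def shape_metric(W, tail=0.5):
    # Power-law exponent of the top eigenvalues of W^T W (Hill-type estimator);
    # unchanged under W -> c * W, hence a "shape" rather than "scale" quantity.
    lam = np.sort(np.linalg.eigvalsh(W.T @ W))[::-1]
    lam = lam[: max(2, int(tail * len(lam)))]
    lam_min = lam[-1]
    return 1 + (len(lam) - 1) / np.sum(np.log(lam[:-1] / lam_min))
```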
arXiv Detail & Related papers (2021-06-01T19:19:49Z)
- Understanding Overparameterization in Generative Adversarial Networks [56.57403335510056]
Generative Adversarial Networks (GANs) are trained by solving non-concave min-max optimization problems.
Existing theory highlights the importance of gradient-based training converging to globally optimal solutions.
We show that in an overparameterized GAN with a one-layer neural network generator and a linear discriminator, gradient descent-ascent (GDA) converges to a global saddle point of the underlying non-concave min-max problem.
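A minimal sketch of the gradient descent-ascent (GDA) dynamics in question: the generator parameters descend on the min-max objective while the discriminator parameters ascend; with a one-layer generator and linear discriminator this is the regime where the convergence result applies.

```python
import torch

def gda_step(f, theta, phi, lr=1e-3):
    # theta: generator parameters (minimize f); phi: discriminator (maximize f).
    g_theta, g_phi = torch.autograd.grad(f(theta, phi), [theta, phi])
    with torch.no_grad():
        theta -= lr * g_theta   # gradient descent for the generator
        phi += lr * g_phi       # gradient ascent for the discriminator
    return theta, phi
```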
arXiv Detail & Related papers (2021-04-12T16:23:37Z)