Related papers: Deep Generative Models through the Lens of the Manifold Hypothesis: A Survey and New Connections

URL: http://arxiv.org/abs/2404.02954v1
Date: Wed, 3 Apr 2024 18:00:00 GMT
Title: Deep Generative Models through the Lens of the Manifold Hypothesis: A Survey and New Connections
Authors: Gabriel Loaiza-Ganem, Brendan Leigh Ross, Rasa Hosseinzadeh, Anthony L. Caterini, Jesse C. Cresswell
Abstract summary: We show that numerical instability of high-dimensional likelihoods is unavoidable when modelling low-dimensional data. We then show that DGMs on learned representations of autoencoders can be interpreted as approximately minimizing Wasserstein distance.
Score: 15.191007332508198
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In recent years there has been increased interest in understanding the interplay between deep generative models (DGMs) and the manifold hypothesis. Research in this area focuses on understanding the reasons why commonly-used DGMs succeed or fail at learning distributions supported on unknown low-dimensional manifolds, as well as developing new models explicitly designed to account for manifold-supported data. This manifold lens provides both clarity as to why some DGMs (e.g. diffusion models and some generative adversarial networks) empirically surpass others (e.g. likelihood-based models such as variational autoencoders, normalizing flows, or energy-based models) at sample generation, and guidance for devising more performant DGMs. We carry out the first survey of DGMs viewed through this lens, making two novel contributions along the way. First, we formally establish that numerical instability of high-dimensional likelihoods is unavoidable when modelling low-dimensional data. We then show that DGMs on learned representations of autoencoders can be interpreted as approximately minimizing Wasserstein distance: this result, which applies to latent diffusion models, helps justify their outstanding empirical results. The manifold lens provides a rich perspective from which to understand DGMs, which we aim to make more accessible and widespread.

Related papers

Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions [6.9408143976091745]
Denoising Diffusion Probabilistic Models (DDPM) are powerful state-of-the-art methods used to generate synthetic data from high-dimensional data distributions. We study DDPMs under the manifold hypothesis and prove that they achieve rates independent of the ambient dimension in terms of learning the score. In terms of sampling, we obtain rates independent of the ambient dimension w.r.t. the Kullback-Leibler divergence, and $O(\sqrt{D})$ w.r.t. the Wasserstein distance.
arXiv Detail & Related papers (2024-09-27T14:57:18Z)
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception [66.88792390480343]
We propose DEEM, a simple but effective approach that utilizes the generative feedback of diffusion models to align the semantic distributions of the image encoder. DEEM exhibits enhanced robustness and a superior capacity to alleviate model hallucinations while utilizing fewer trainable parameters, less pre-training data, and a smaller base model size.
arXiv Detail & Related papers (2024-05-24T05:46:04Z)
Beyond DAGs: A Latent Partial Causal Model for Multimodal Learning [80.44084021062105]
We propose a novel latent partial causal model for multimodal data, featuring two latent coupled variables, connected by an undirected edge, to represent the transfer of knowledge across modalities. Under specific statistical assumptions, we establish an identifiability result, demonstrating that representations learned by multimodal contrastive learning correspond to the latent coupled variables up to a trivial transformation. Experiments show that a pre-trained CLIP model embodies disentangled representations, enabling few-shot learning and improving domain generalization across diverse real-world datasets.
arXiv Detail & Related papers (2024-02-09T07:18:06Z)
Understanding Deep Generative Models with Generalized Empirical Likelihoods [3.7978679293562587]
We show how to combine techniques from Maximum Mean Discrepancy and Generalized Empirical Likelihood to create distribution tests that retain per-sample interpretability. We find that such tests predict the degree of mode dropping and mode imbalance up to 60% better than metrics such as improved precision/recall.
arXiv Detail & Related papers (2023-06-16T11:33:47Z)
Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models [77.83923746319498]
We propose a framework called Diff-Instruct to instruct the training of arbitrary generative models. We show that Diff-Instruct results in state-of-the-art single-step diffusion-based models. Experiments on refining GAN models show that Diff-Instruct can consistently improve the pre-trained generators of GAN models.
arXiv Detail & Related papers (2023-05-29T04:22:57Z)
Emerging Synergies in Causality and Deep Generative Models: A Survey [35.62192474181619]
Deep generative models (DGMs) have proven adept in capturing complex data distributions but often fall short in generalization and interpretability. Causality offers a structured lens to comprehend the mechanisms driving data generation and highlights the causal-effect dynamics inherent in these processes. We elucidate the integration of causal principles within DGMs, investigate causal identification using DGMs, and navigate an emerging research frontier of causality in large-scale generative models.
arXiv Detail & Related papers (2023-01-29T04:10:12Z)
Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance [95.12230117950232]
We show that a common latent space emerges from two diffusion models trained independently on related domains. Applying CycleDiffusion to text-to-image diffusion models, we show that large-scale text-to-image diffusion models can be used as zero-shot image-to-image editors.
arXiv Detail & Related papers (2022-10-11T15:53:52Z)
A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models. They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space. This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)
Riemannian Score-Based Generative Modeling [56.20669989459281]
Score-based generative models (SGMs) demonstrate remarkable empirical performance. Current SGMs make the underlying assumption that the data is supported on a Euclidean manifold with flat geometry. This prevents the use of these models for applications in robotics, geoscience or protein modeling.
arXiv Detail & Related papers (2022-02-06T11:57:39Z)
Understanding Overparameterization in Generative Adversarial Networks [56.57403335510056]
Generative Adversarial Networks (GANs) are trained by solving non-convex concave min-max optimization problems. Theory has shown the importance of overparameterization for the convergence of gradient descent (GD) to globally optimal solutions. We show that in an overparameterized GAN with a $1$-layer neural network generator and a linear discriminator, gradient descent-ascent (GDA) converges to a global saddle point of the underlying non-convex concave min-max problem.
arXiv Detail & Related papers (2021-04-12T16:23:37Z)
An Introduction to Deep Generative Modeling [8.909115457491522]
Deep generative models (DGM) are neural networks with many hidden layers trained to approximate complicated, high-dimensional probability distributions. We provide an introduction to DGMs and a framework for modeling the three most popular approaches. Our goal is to enable and motivate the reader to contribute to this proliferating research area.
arXiv Detail & Related papers (2021-03-09T02:19:06Z)
Flow-based Generative Models for Learning Manifold to Manifold Mappings [39.60406116984869]
We introduce three kinds of invertible layers for manifold-valued data, which are analogous to their functionality in flow-based generative models. We show promising results where we can reliably and accurately reconstruct brain images of a field of orientation distribution functions.
arXiv Detail & Related papers (2020-12-18T02:19:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.