Latent Traversals in Generative Models as Potential Flows
- URL: http://arxiv.org/abs/2304.12944v2
- Date: Sat, 1 Jul 2023 11:21:34 GMT
- Title: Latent Traversals in Generative Models as Potential Flows
- Authors: Yue Song, T. Anderson Keller, Nicu Sebe, Max Welling
- Abstract summary: We propose to model latent structures with a learned dynamic potential landscape.
Inspired by physics, optimal transport, and neuroscience, these potential landscapes are learned as physically realistic partial differential equations.
Our method achieves trajectories that are both qualitatively and quantitatively more disentangled than those of state-of-the-art baselines.
- Score: 113.4232528843775
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Despite the significant recent progress in deep generative models, the
underlying structure of their latent spaces is still poorly understood, thereby
making the task of performing semantically meaningful latent traversals an open
research challenge. Most prior work has aimed to solve this challenge by
modeling latent structures linearly, and finding corresponding linear
directions which result in 'disentangled' generations. In this work, we instead
propose to model latent structures with a learned dynamic potential landscape,
thereby performing latent traversals as the flow of samples down the
landscape's gradient. Inspired by physics, optimal transport, and neuroscience,
these potential landscapes are learned as physically realistic partial
differential equations, thereby allowing them to flexibly vary over both space
and time. To achieve disentanglement, multiple potentials are learned
simultaneously, and are constrained by a classifier to be distinct and
semantically self-consistent. Experimentally, we demonstrate that our method
achieves both more qualitatively and quantitatively disentangled trajectories
than state-of-the-art baselines. Further, we demonstrate that our method can be
integrated as a regularization term during training, thereby acting as an
inductive bias towards the learning of structured representations, ultimately
improving model likelihood on similarly structured data.
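To make the core mechanism concrete, here is a minimal, hedged sketch (not the authors' implementation) of a latent traversal realized as gradient flow on a learned potential: a small MLP `Potential` stands in for one learned potential u(z, t), and a latent code is integrated down its gradient with explicit Euler steps. The architecture, step size, and step count are illustrative assumptions; in the paper, several such potentials are trained jointly under a classifier constraint that keeps them distinct and semantically self-consistent.

```python
import torch
import torch.nn as nn

class Potential(nn.Module):
    """Scalar potential u(z, t) over latent codes (illustrative architecture)."""
    def __init__(self, latent_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 1, hidden), nn.Softplus(),
            nn.Linear(hidden, hidden), nn.Softplus(),
            nn.Linear(hidden, 1),
        )

    def forward(self, z: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Concatenate time onto the latent so the landscape can vary over time.
        t_col = t.expand(z.shape[0], 1)
        return self.net(torch.cat([z, t_col], dim=-1)).squeeze(-1)

def traverse(z0: torch.Tensor, potential: Potential, n_steps: int = 20, dt: float = 0.1):
    """Latent traversal as gradient flow: z_{k+1} = z_k - dt * grad_z u(z_k, t_k)."""
    z, path = z0.clone(), [z0.clone()]
    for k in range(n_steps):
        z = z.detach().requires_grad_(True)
        u = potential(z, torch.full((1,), k * dt)).sum()
        (grad_z,) = torch.autograd.grad(u, z)
        z = (z - dt * grad_z).detach()      # follow the landscape's gradient downhill
        path.append(z.clone())
    return path  # decode each code in `path` with the generator to render the traversal
```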
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- State-space models can learn in-context by gradient descent [1.3087858009942543]
This study demonstrates that state-space model architectures can perform gradient-based learning and use it for in-context learning.
We prove that a single structured state-space model layer, augmented with local self-attention, can reproduce the outputs of an implicit linear model.
The theoretical construction elucidates the role of local self-attention and multiplicative interactions in recurrent architectures as the key ingredients for enabling the expressive power typical of foundation models.
arXiv Detail & Related papers (2024-10-15T15:22:38Z)
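As a hedged aside on the entry above: "in-context learning by gradient descent" usually means that a sequence layer reproduces the predictions of an implicit linear model updated by gradient descent on the context pairs. The toy sketch below shows only that implicit-model view (one gradient step on in-context examples, then a query prediction); it is not the paper's state-space construction, and all names are illustrative.

```python
import torch

def in_context_gd_prediction(X_ctx, y_ctx, x_query, lr=0.1):
    """One gradient-descent step of an implicit linear model on the context pairs.

    X_ctx: (N, d) context inputs, y_ctx: (N,) context targets, x_query: (d,) query.
    A sequence layer that reproduces this map is said to learn in context by GD.
    """
    w = torch.zeros(X_ctx.shape[1])                 # implicit linear model, w0 = 0
    residual = X_ctx @ w - y_ctx                    # prediction errors on the context
    grad = X_ctx.T @ residual / len(y_ctx)          # gradient of 0.5 * mean squared error
    w = w - lr * grad                               # one gradient-descent step
    return x_query @ w                              # query prediction after the update
```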
- Neural Persistence Dynamics [8.197801260302642]
We consider the problem of learning the dynamics in the topology of time-evolving point clouds.
Our proposed model - $\textit{Neural Persistence Dynamics}$ - substantially outperforms the state-of-the-art across a diverse set of parameter regression tasks.
arXiv Detail & Related papers (2024-05-24T17:20:18Z)
- Flow Factorized Representation Learning [109.51947536586677]
We introduce a generative model which specifies a distinct set of latent probability paths that define different input transformations.
We show that our model achieves higher likelihoods on standard representation learning benchmarks while simultaneously being closer to approximately equivariant models.
arXiv Detail & Related papers (2023-09-22T20:15:37Z)
- Exploring Model Transferability through the Lens of Potential Energy [78.60851825944212]
Transfer learning has become crucial in computer vision tasks due to the vast availability of pre-trained deep learning models.
Existing methods for measuring the transferability of pre-trained models rely on statistical correlations between encoded static features and task labels.
We present an insightful physics-inspired approach named PED to address these challenges.
arXiv Detail & Related papers (2023-08-29T07:15:57Z)
- Exploring Compositional Visual Generation with Latent Classifier Guidance [19.48538300223431]
We train latent diffusion models and auxiliary latent classifiers to facilitate non-linear navigation of latent representation generation.
We show that such conditional generation achieved by latent classifier guidance provably maximizes a lower bound of the conditional log probability during training.
We show that this paradigm based on latent classifier guidance is agnostic to pre-trained generative models, and present competitive results for both image generation and sequential manipulation of real and synthetic images.
arXiv Detail & Related papers (2023-04-25T03:02:58Z)
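As a rough, hedged illustration of the latent classifier guidance summarized in the entry above (not the paper's exact algorithm): each reverse-diffusion step in latent space is nudged by the gradient of an auxiliary latent classifier's log-probability for a target class. `denoise_step`, `classifier`, and `guidance_scale` below are hypothetical stand-ins.

```python
import torch

def guided_latent_step(z_t, t, target_class, denoise_step, classifier, guidance_scale=2.0):
    """One reverse-diffusion step in latent space, biased toward `target_class`.

    `denoise_step(z_t, t)` is assumed to return the unguided next latent;
    `classifier(z_t, t)` is assumed to return class logits for the noisy latent.
    """
    z_t = z_t.detach().requires_grad_(True)
    log_p = torch.log_softmax(classifier(z_t, t), dim=-1)[:, target_class].sum()
    (grad,) = torch.autograd.grad(log_p, z_t)       # direction that raises p(y | z_t)
    z_prev = denoise_step(z_t.detach(), t)          # unguided reverse step
    return z_prev + guidance_scale * grad           # shift the latent toward the target class
```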
- Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both tractable variational learning algorithm and effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
- Planning with Diffusion for Flexible Behavior Synthesis [125.24438991142573]
We consider what it would look like to fold as much of the trajectory optimization pipeline as possible into the modeling problem.
The core of our technical approach lies in a diffusion probabilistic model that plans by iteratively denoising trajectories.
arXiv Detail & Related papers (2022-05-20T07:02:03Z)
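To make "planning by iteratively denoising trajectories" concrete, here is a minimal, hedged sketch: the whole trajectory tensor is initialized as noise and repeatedly refined by a learned denoiser, with the first and last states clamped to the current state and the goal as conditioning. The `denoiser` interface, noise re-injection, and schedule are assumptions for illustration, not the paper's implementation.

```python
import torch

def plan_by_denoising(denoiser, start, goal, horizon, dim, n_steps=50):
    """Sample a plan by iteratively denoising a whole trajectory tensor.

    `denoiser(traj, t)` is assumed to return a cleaner (horizon, dim) trajectory;
    `start` and `goal` are (dim,) state vectors used as endpoint conditioning.
    """
    traj = torch.randn(horizon, dim)                    # start from pure noise
    for t in reversed(range(n_steps)):
        traj[0], traj[-1] = start, goal                 # clamp endpoints as conditioning
        traj = denoiser(traj, torch.tensor([t]))        # one refinement step
        if t > 0:
            traj = traj + 0.1 * torch.randn_like(traj)  # re-inject a little noise
    traj[0], traj[-1] = start, goal
    return traj                                         # execute the plan's first action
```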