Related papers: Isometric Representation Learning for Disentangled Latent Space of Diffusion Models

URL: http://arxiv.org/abs/2407.11451v1
Date: Tue, 16 Jul 2024 07:36:01 GMT
Title: Isometric Representation Learning for Disentangled Latent Space of Diffusion Models
Authors: Jaehoon Hahm, Junho Lee, Sunghyun Kim, Joonseok Lee
Abstract summary: We present Isometric Diffusion, equipping a diffusion model with a geometric regularizer to guide the model to learn a geometrically sound latent space of the training data manifold. This approach allows diffusion models to learn a more disentangled latent space, which enables smoother interpolation, more accurate inversion, and more precise control over attributes directly in the latent space.
Score: 17.64488229224982
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The latent space of diffusion models remains largely unexplored, despite their great success and potential in the field of generative modeling. In fact, the latent space of existing diffusion models is entangled, with a distorted mapping from latent space to image space. To tackle this problem, we present Isometric Diffusion, equipping a diffusion model with a geometric regularizer to guide the model to learn a geometrically sound latent space of the training data manifold. This approach allows diffusion models to learn a more disentangled latent space, which enables smoother interpolation, more accurate inversion, and more precise control over attributes directly in the latent space. Our extensive experiments, consisting of image interpolations, image inversions, and linear editing, show the effectiveness of our method.

Related papers

Image Interpolation with Score-based Riemannian Metrics of Diffusion Models [9.514940899499752]
This paper introduces a novel framework that treats the data space of pre-trained diffusion models as a Riemannian manifold. Experiments with MNIST and Stable Diffusion show that this geometry-aware approach yields images that are more realistic, less noisy, and more faithful to prompts than existing methods.
arXiv Detail & Related papers (2025-04-28T22:04:20Z)
Continuous Diffusion Model for Language Modeling [57.396578974401734]
Existing continuous diffusion models for discrete data have limited performance compared to discrete approaches. We propose a continuous diffusion model for language modeling that incorporates the geometry of the underlying categorical distribution.
arXiv Detail & Related papers (2025-02-17T08:54:29Z)
Exploring the latent space of diffusion models directly through singular value decomposition [31.900933527692846]
We propose a novel image editing framework that is capable of learning arbitrary attributes from one pair of latent codes destined by text prompts in Diffusion Models. We will release our codes soon to foster further research and applications in this area.
arXiv Detail & Related papers (2025-02-04T11:04:36Z)
Geometric Trajectory Diffusion Models [58.853975433383326]
Generative models have shown great promise in generating 3D geometric systems. Existing approaches only operate on static structures, neglecting the fact that physical systems are always dynamic in nature. We propose geometric trajectory diffusion models (GeoTDM), the first diffusion model for modeling the temporal distribution of 3D geometric trajectories.
arXiv Detail & Related papers (2024-10-16T20:36:41Z)
Towards diffusion models for large-scale sea-ice modelling [0.4498088099418789]
We tailor latent diffusion models to sea-ice physics with a censored Gaussian distribution in data space to generate data that follows the physical bounds of the modelled variables. Our latent diffusion models reach scores similar to those of the diffusion model trained in data space, but the latent mapping smooths the generated fields. For large-scale Earth system modelling, latent diffusion models can have many advantages compared to diffusion in data space if the significant barrier of smoothing can be resolved.
arXiv Detail & Related papers (2024-06-26T15:11:15Z)
Interpreting the Weight Space of Customized Diffusion Models [79.14866339932199]
We show that the weight space of fine-tuned diffusion models can behave as an interpretable meta-latent space producing new models.
arXiv Detail & Related papers (2024-06-13T17:59:56Z)
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models [82.8261101680427]
Smooth latent spaces ensure that a perturbation on an input latent corresponds to a steady change in the output image. This property proves beneficial in downstream tasks, including image interpolation, inversion, and editing. We propose Smooth Diffusion, a new category of diffusion models that can be simultaneously high-performing and smooth.
arXiv Detail & Related papers (2023-12-07T16:26:23Z)
Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method enables us to scale to high-dimensional tasks on nontrivial manifolds. We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high-dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z)
Geometric Neural Diffusion Processes [55.891428654434634]
We extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimension modelling. We show that with these conditions, the generative functional model admits the same symmetry.
arXiv Detail & Related papers (2023-07-11T16:51:38Z)
SPIRiT-Diffusion: Self-Consistency Driven Diffusion Model for Accelerated MRI [14.545736786515837]
We introduce SPIRiT-Diffusion, a diffusion model for k-space inspired by the iterative self-consistent SPIRiT method. We evaluate the proposed SPIRiT-Diffusion method using a 3D joint intracranial and carotid vessel wall imaging dataset.
arXiv Detail & Related papers (2023-04-11T08:43:52Z)
DAG: Depth-Aware Guidance with Denoising Diffusion Probabilistic Models [23.70476220346754]
We propose a novel guidance approach for diffusion models that uses estimated depth information derived from the rich intermediate representations of diffusion models. Experiments and extensive ablation studies demonstrate the effectiveness of our method in guiding the diffusion models toward geometrically plausible image generation.
arXiv Detail & Related papers (2022-12-17T12:47:19Z)
Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance [95.12230117950232]
We show that a common latent space emerges from two diffusion models trained independently on related domains. Applying CycleDiffusion to text-to-image diffusion models, we show that large-scale text-to-image diffusion models can be used as zero-shot image-to-image editors.
arXiv Detail & Related papers (2022-10-11T15:53:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.