Interpreting the Weight Space of Customized Diffusion Models
- URL: http://arxiv.org/abs/2406.09413v3
- Date: Fri, 22 Nov 2024 05:12:30 GMT
- Title: Interpreting the Weight Space of Customized Diffusion Models
- Authors: Amil Dravid, Yossi Gandelsman, Kuan-Chieh Wang, Rameen Abdal, Gordon Wetzstein, Alexei A. Efros, Kfir Aberman
- Abstract summary: We show that the weight space of fine-tuned diffusion models can behave as an interpretable meta-latent space producing new models.
- Score: 79.14866339932199
- Abstract: We investigate the space of weights spanned by a large collection of customized diffusion models. We populate this space by creating a dataset of over 60,000 models, each of which is a base model fine-tuned to insert a different person's visual identity. We model the underlying manifold of these weights as a subspace, which we term weights2weights. We demonstrate three immediate applications of this space that result in new diffusion models -- sampling, editing, and inversion. First, sampling a set of weights from this space results in a new model encoding a novel identity. Next, we find linear directions in this space corresponding to semantic edits of the identity (e.g., adding a beard), resulting in a new model with the original identity edited. Finally, we show that inverting a single image into this space encodes a realistic identity into a model, even if the input image is out of distribution (e.g., a painting). We further find that these linear properties of the diffusion model weight space extend to other visual concepts. Our results indicate that the weight space of fine-tuned diffusion models can behave as an interpretable meta-latent space producing new models.
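The three operations described in the abstract (sampling, linear editing, and subspace projection) can be sketched with a simple PCA-based linear subspace over flattened weight vectors. This is an illustrative sketch under stated assumptions, not the paper's implementation: all array names, dimensions, and the attribute labels are stand-ins, and real fine-tuned weights would come from a collection of customized diffusion models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a collection of fine-tuned model weights, each flattened
# to a vector (the paper builds such a dataset from 60,000+ models).
n_models, dim = 1000, 512
W = rng.normal(size=(n_models, dim))

# Model the weight manifold as a linear subspace via PCA around the mean.
mean = W.mean(axis=0)
U, S, Vt = np.linalg.svd(W - mean, full_matrices=False)
k = 100                       # subspace dimensionality (a hyperparameter)
basis = Vt[:k]                # principal directions of the weight space

# Sampling: draw coefficients matched to the data's spread along each axis;
# the result plays the role of a new model encoding a novel identity.
std = S[:k] / np.sqrt(n_models - 1)
coeffs = rng.normal(size=k) * std
new_weights = mean + coeffs @ basis

# Editing: a linear direction estimated from hypothetically labeled models
# (e.g., mean of "beard" models minus mean of "no beard" models).
labels = rng.integers(0, 2, size=n_models)   # hypothetical attribute labels
direction = W[labels == 1].mean(axis=0) - W[labels == 0].mean(axis=0)
edited = new_weights + 1.5 * direction       # larger scale = stronger edit

# Inversion (simplified): project an out-of-subspace weight vector onto
# the subspace, i.e., the closest point expressible in the basis.
query = rng.normal(size=dim)
projected = mean + ((query - mean) @ basis.T) @ basis
```

In the paper, inversion is performed from a single input image rather than from a weight vector, but the projection step above conveys the core idea of constraining a solution to the learned weight subspace.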
Related papers
- Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective [52.778766190479374]
Latent-based image generative models have achieved notable success in image generation tasks.
Despite sharing the same latent space, autoregressive models significantly lag behind LDMs and MIMs in image generation.
We propose a simple but effective discrete image tokenizer to stabilize the latent space for image generative modeling.
arXiv Detail & Related papers (2024-10-16T12:13:17Z)
- Universal New Physics Latent Space [0.0]
We develop a machine learning method for mapping data originating from both Standard Model processes and various theories beyond the Standard Model into a unified representation (latent) space.
We apply our method to three examples of new physics at the LHC of increasing complexity, showing that models can be clustered according to their LHC phenomenology.
arXiv Detail & Related papers (2024-07-29T18:00:00Z)
- Isometric Representation Learning for Disentangled Latent Space of Diffusion Models [17.64488229224982]
We present Isometric Diffusion, equipping a diffusion model with a geometric regularizer to guide the model to learn a geometrically sound latent space of the training data manifold.
This approach allows diffusion models to learn a more disentangled latent space, which enables smoother and more accurate inversion, and more control over attributes directly in the latent space.
arXiv Detail & Related papers (2024-07-16T07:36:01Z)
- Geometric Neural Diffusion Processes [55.891428654434634]
We extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimensional modelling.
We show that with these conditions, the generative functional model admits the same symmetry.
arXiv Detail & Related papers (2023-07-11T16:51:38Z)
- SinDiffusion: Learning a Diffusion Model from a Single Natural Image [159.4285444680301]
We present SinDiffusion, leveraging denoising diffusion models to capture the internal distribution of patches from a single natural image.
It is based on two core designs. First, SinDiffusion is trained with a single model at a single scale instead of multiple models with progressive growing of scales.
Second, we identify that a patch-level receptive field of the diffusion network is crucial and effective for capturing the image's patch statistics.
arXiv Detail & Related papers (2022-11-22T18:00:03Z)
- Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance [95.12230117950232]
We show that a common latent space emerges from two diffusion models trained independently on related domains.
Applying CycleDiffusion to text-to-image diffusion models, we show that large-scale text-to-image diffusion models can be used as zero-shot image-to-image editors.
arXiv Detail & Related papers (2022-10-11T15:53:52Z)
- OCD: Learning to Overfit with Conditional Diffusion Models [95.1828574518325]
We present a dynamic model in which the weights are conditioned on an input sample x.
We learn to match those weights that would be obtained by finetuning a base model on x and its label y.
arXiv Detail & Related papers (2022-10-02T09:42:47Z)
- Tensor-based Subspace Factorization for StyleGAN [1.1470070927586016]
$\tau$GAN is a tensor-based method for modeling the latent space of generative models.
We validate our approach on StyleGAN trained on FFHQ using BU-3DFE as a structured facial expression database.
arXiv Detail & Related papers (2021-11-08T15:11:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.