Automated Learning of Semantic Embedding Representations for Diffusion Models
- URL: http://arxiv.org/abs/2505.05732v1
- Date: Fri, 09 May 2025 02:10:46 GMT
- Title: Automated Learning of Semantic Embedding Representations for Diffusion Models
- Authors: Limai Jiang, Yunpeng Cai
- Abstract summary: We employ a multi-level denoising autoencoder framework to expand the representation capacity of denoising diffusion models. Our work shows that DDMs are not only suitable for generative tasks, but also potentially advantageous for general-purpose deep learning applications.
- Score: 1.688134675717698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative models capture the true distribution of data, yielding semantically rich representations. Denoising diffusion models (DDMs) exhibit superior generative capabilities, but efficient representation learning with them is still lacking. In this work, we employ a multi-level denoising autoencoder framework to expand the representation capacity of DDMs: it introduces sequentially consistent Diffusion Transformers and an additional timestep-dependent encoder that acquires embedding representations along the denoising Markov chain through self-conditional diffusion learning. Intuitively, the encoder, conditioned on the entire diffusion process, compresses high-dimensional data into directional vectors in latent space under different noise levels, facilitating the learning of image embeddings across all timesteps. To verify the semantic adequacy of the embeddings obtained this way, we conduct extensive experiments on various datasets, demonstrating that optimally learned embeddings from DDMs surpass state-of-the-art self-supervised representation learning methods in most cases, achieving remarkable discriminative semantic representation quality. Our work shows that DDMs are not only suitable for generative tasks, but also potentially advantageous for general-purpose deep learning applications.
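As a rough illustration of the training scheme the abstract describes, the following minimal PyTorch sketch pairs a timestep-dependent encoder with a noise-predicting denoiser that consumes the encoder's embedding (the "self-conditional" coupling). All module names, sizes, and the MLP stand-ins for the Diffusion Transformers are assumptions made here for brevity, not the authors' implementation.

```python
# Illustrative sketch only: a timestep-dependent encoder maps the noisy sample
# x_t to an embedding z_t, and the denoiser is conditioned on z_t while being
# trained with the standard epsilon-prediction (DSM-equivalent) loss.
import torch
import torch.nn as nn

T = 1000                                      # diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)         # standard linear noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0) # \bar{alpha}_t

class TimestepEncoder(nn.Module):
    """Timestep-dependent encoder: (x_t, t) -> embedding z_t."""
    def __init__(self, data_dim, embed_dim, T=T):
        super().__init__()
        self.t_emb = nn.Embedding(T, embed_dim)
        self.net = nn.Sequential(
            nn.Linear(data_dim + embed_dim, 256), nn.SiLU(),
            nn.Linear(256, embed_dim))
    def forward(self, x_t, t):
        return self.net(torch.cat([x_t, self.t_emb(t)], dim=-1))

class Denoiser(nn.Module):
    """Noise predictor conditioned on (t, z_t); a small MLP stands in here
    for the paper's sequentially consistent Diffusion Transformer."""
    def __init__(self, data_dim, embed_dim, T=T):
        super().__init__()
        self.t_emb = nn.Embedding(T, embed_dim)
        self.net = nn.Sequential(
            nn.Linear(data_dim + 2 * embed_dim, 256), nn.SiLU(),
            nn.Linear(256, data_dim))
    def forward(self, x_t, t, z):
        return self.net(torch.cat([x_t, self.t_emb(t), z], dim=-1))

data_dim, embed_dim = 32, 64                  # toy sizes, not the paper's
encoder = TimestepEncoder(data_dim, embed_dim)
denoiser = Denoiser(data_dim, embed_dim)
opt = torch.optim.Adam(list(encoder.parameters()) + list(denoiser.parameters()), lr=1e-4)

x0 = torch.randn(16, data_dim)                # stand-in batch of clean data
t = torch.randint(0, T, (16,))
eps = torch.randn_like(x0)
ab = alpha_bar[t].unsqueeze(-1)
x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps  # forward noising q(x_t | x_0)

z = encoder(x_t, t)                           # embedding learned jointly with denoising
loss = ((denoiser(x_t, t, z) - eps) ** 2).mean()
opt.zero_grad(); loss.backward(); opt.step()
```

Because z_t is the only extra input to the denoiser, gradients from the denoising loss push it to carry information useful at every noise level, which is what makes the per-timestep embeddings usable as representations.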
Related papers
- On Designing Diffusion Autoencoders for Efficient Generation and Representation Learning [14.707830064594056]
Diffusion autoencoders (DAs) use an input-dependent latent variable to capture representations alongside the diffusion process. Better generative modelling is the primary goal of another class of diffusion models -- those that learn their forward (noising) process.
arXiv Detail & Related papers (2025-05-30T18:14:09Z)
- DDAE++: Enhancing Diffusion Models Towards Unified Generative and Discriminative Learning [53.27049077100897]
Generative pre-training has been shown to yield discriminative representations, paving the way towards unified visual generation and understanding. This work introduces self-conditioning, a mechanism that internally leverages the rich semantics inherent in the denoising network to guide its own decoding layers. The results are compelling: our method boosts both generation FID and recognition accuracy with 1% computational overhead and generalizes across diverse diffusion architectures.
arXiv Detail & Related papers (2025-05-16T08:47:16Z)
- Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax [73.03684002513218]
We enhance Deep InfoMax (DIM) to enable automatic matching of learned representations to a selected prior distribution. We show that such a modification allows for learning uniformly and normally distributed representations. The results indicate a moderate trade-off between performance on the downstream tasks and the quality of distribution matching (DM).
arXiv Detail & Related papers (2024-10-09T15:40:04Z)
- Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding [90.77521413857448]
Deep generative models are anchored in three core capabilities -- generating new instances, reconstructing inputs, and learning compact representations.
We introduce Generalized Encoding-Decoding Diffusion Probabilistic Models (EDDPMs).
EDDPMs generalize the Gaussian noising-denoising in standard diffusion by introducing parameterized encoding-decoding.
Experiments on text, proteins, and images demonstrate the flexibility to handle diverse data and tasks.
arXiv Detail & Related papers (2024-02-29T10:08:57Z)
- Denoising Diffusion Autoencoders are Unified Self-supervised Learners [58.194184241363175]
This paper shows that the networks in diffusion models, namely denoising diffusion autoencoders (DDAE), are unified self-supervised learners.
DDAE has already learned strongly linearly separable representations within its intermediate layers without auxiliary encoders.
Our diffusion-based approach achieves 95.9% and 50.0% linear evaluation accuracies on CIFAR-10 and Tiny-ImageNet, respectively (a minimal linear-probe sketch appears after this list).
arXiv Detail & Related papers (2023-03-17T04:20:47Z)
- Representation Learning with Diffusion Models [0.0]
Diffusion models (DMs) have achieved state-of-the-art results for image synthesis tasks as well as density estimation.
We introduce a framework for learning such representations with diffusion models (LRDM).
In particular, the DM and the representation encoder are trained jointly in order to learn rich representations specific to the generative denoising process.
arXiv Detail & Related papers (2022-10-20T07:26:47Z)
- f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation [56.04628143914542]
Diffusion models (DMs) have recently emerged as state-of-the-art tools for generative modeling in various domains.
We propose f-DM, a generalized family of DMs which allows progressive signal transformation.
We apply f-DM in image generation tasks with a range of functions, including down-sampling, blurring, and learned transformations.
arXiv Detail & Related papers (2022-10-10T18:49:25Z)
- Diffusion-Based Representation Learning [65.55681678004038]
We augment the denoising score matching framework to enable representation learning without any supervised signal.
In contrast, the introduced diffusion-based representation learning relies on a new formulation of the denoising score matching objective (the standard DSM objective is recalled after this list for reference).
Using the same approach, we propose to learn an infinite-dimensional latent code that achieves improvements over state-of-the-art models on semi-supervised image classification.
arXiv Detail & Related papers (2021-05-29T09:26:02Z)
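The "linear evaluation" protocol cited in the DDAE entry above is standard across these papers: freeze the pre-trained network, extract features, and fit only a linear classifier on top. Below is a minimal sketch; the random tensors in the usage lines are placeholders, since a real probe would tap a chosen intermediate layer and timestep of the frozen denoiser.

```python
# Minimal linear-probe sketch for frozen features (illustrative only; not the
# DDAE paper's exact extraction layer, timestep, or training schedule).
import torch
import torch.nn as nn

def linear_probe(f_train, y_train, f_test, y_test, num_classes, epochs=50):
    """Fit a single linear layer on frozen features; return test accuracy."""
    probe = nn.Linear(f_train.shape[1], num_classes)
    opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(probe(f_train), y_train).backward()
        opt.step()
    with torch.no_grad():
        return (probe(f_test).argmax(1) == y_test).float().mean().item()

# Toy usage with placeholder features (stand-ins for frozen denoiser activations).
f_tr, y_tr = torch.randn(512, 64), torch.randint(0, 10, (512,))
f_te, y_te = torch.randn(128, 64), torch.randint(0, 10, (128,))
print(linear_probe(f_tr, y_tr, f_te, y_te, num_classes=10))
```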
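For reference against the last entry's "new formulation", the standard denoising score matching (DSM) objective in common DDPM notation is the following; this is textbook material, not the paper's reformulation.

```latex
% Standard DSM objective, with x_t = \sqrt{\bar{\alpha}_t} x_0 + \sqrt{1-\bar{\alpha}_t}\,\varepsilon:
\mathcal{L}_{\mathrm{DSM}}
  = \mathbb{E}_{t,\,x_0,\,\varepsilon}\!\left[
      \lambda(t)\,
      \bigl\| s_\theta(x_t, t) - \nabla_{x_t}\log q(x_t \mid x_0) \bigr\|_2^2
    \right],
\qquad
\nabla_{x_t}\log q(x_t \mid x_0) = -\,\frac{\varepsilon}{\sqrt{1-\bar{\alpha}_t}} .
```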