Unsupervised Representation Learning from Pre-trained Diffusion
Probabilistic Models
- URL: http://arxiv.org/abs/2212.12990v1
- Date: Mon, 26 Dec 2022 02:37:38 GMT
- Title: Unsupervised Representation Learning from Pre-trained Diffusion
Probabilistic Models
- Authors: Zijian Zhang, Zhou Zhao, Zhijie Lin
- Abstract summary: Diffusion Probabilistic Models (DPMs) have shown a powerful capacity for generating high-quality image samples.
Diffusion autoencoders (Diff-AE) have been proposed to explore DPMs for representation learning via autoencoding.
We propose Pre-trained DPM AutoEncoding (PDAE) to adapt existing pre-trained DPMs into decoders for image reconstruction.
- Score: 83.75414370493289
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion Probabilistic Models (DPMs) have shown a powerful capacity for
generating high-quality image samples. Recently, diffusion autoencoders
(Diff-AE) have been proposed to explore DPMs for representation learning via
autoencoding. Their key idea is to jointly train an encoder for discovering
meaningful representations from images and a conditional DPM as the decoder for
reconstructing images. Considering that training DPMs from scratch takes a
long time and numerous pre-trained DPMs already exist, we propose
\textbf{P}re-trained \textbf{D}PM \textbf{A}uto\textbf{E}ncoding
(\textbf{PDAE}), a general method to adapt existing pre-trained DPMs into
decoders for image reconstruction, with better training efficiency and
performance than Diff-AE. Specifically, we find that pre-trained DPMs fail to
reconstruct an image from its latent variables because of the information loss
of the forward process, which causes a gap between their predicted posterior
mean and the true one. From this perspective, the
classifier-guided sampling method can be explained as computing an extra mean
shift to fill the gap, reconstructing the lost class information in samples.
These imply that the gap corresponds to the lost information of the image, and
we can reconstruct the image by filling the gap. Drawing inspiration from this,
we employ a trainable model to predict a mean shift according to the encoded
representation and train it to fill as much of the gap as possible. In this way,
the encoder is forced to learn as much information as possible from images to
help the filling. By reusing part of the network of pre-trained DPMs and redesigning
the weighting scheme of diffusion loss, PDAE can learn meaningful
representations from images efficiently. Extensive experiments demonstrate the
effectiveness, efficiency and flexibility of PDAE.
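The gap-filling idea in the abstract can be illustrated with a toy 1-D Gaussian example. This is a hedged sketch under simplifying assumptions, not the paper's actual networks: the "encoder" is taken to be the identity (a perfect representation of x0), and the mean-shift model is a least-squares linear map rather than a neural network.

```python
import numpy as np

rng = np.random.default_rng(0)
a_bar = 0.5                        # cumulative noise level at some step t
x0 = rng.normal(size=2048)         # toy "images" (1-D data, unit variance)
eps = rng.normal(size=2048)
x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps   # forward process

# Best unconditional denoiser: E[x0 | x_t] (closed form for Gaussian data).
# It can only recover the average, not the per-sample detail destroyed by noise.
x0_uncond = np.sqrt(a_bar) * x_t

# The "gap": information about x0 that the forward process lost.
gap = x0 - x0_uncond

# PDAE-style idea: a model conditioned on a representation z of x0 predicts a
# shift that fills the gap. Here z = x0 (perfect encoder) and the shift model
# is the least-squares linear map from (x_t, z) to the gap.
A = np.stack([x_t, x0], axis=1)
coef, *_ = np.linalg.lstsq(A, gap, rcond=None)
shift = A @ coef

err_before = np.mean(gap ** 2)            # unconditional reconstruction error
err_after = np.mean((gap - shift) ** 2)   # error after filling with the shift
print(err_before, err_after)
```

Because the representation here carries all of x0, the learned shift closes the gap almost exactly; in the paper, the encoder must compress the image, so training the shift to fill the gap is what forces it to learn informative representations.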
Related papers
- NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement [4.841365627573421]
A crucial preprocessing step is essential to eliminate noise while preserving text and key features of documents.
We propose NAF-DPM, a novel generative framework based on a diffusion probabilistic model (DPM) designed to restore the original quality of degraded documents.
arXiv Detail & Related papers (2024-04-08T16:52:21Z)
- MirrorDiffusion: Stabilizing Diffusion Process in Zero-shot Image Translation by Prompts Redescription and Beyond [57.14128305383768]
We propose a prompt redescription strategy to realize a mirror effect between the source and reconstructed image in the diffusion model (MirrorDiffusion).
MirrorDiffusion achieves superior performance over the state-of-the-art methods on zero-shot image translation benchmarks.
arXiv Detail & Related papers (2024-01-06T14:12:16Z)
- ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment.
The scarcity of annotated data limits the effectiveness and generalization of existing methods.
We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z)
- Diffusion Model as Representation Learner [86.09969334071478]
Diffusion Probabilistic Models (DPMs) have recently demonstrated impressive results on various generative tasks.
We propose a novel knowledge transfer method that leverages the knowledge acquired by DPMs for recognition tasks.
arXiv Detail & Related papers (2023-08-21T00:38:39Z)
- Masked Images Are Counterfactual Samples for Robust Fine-tuning [77.82348472169335]
Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness.
We propose a novel fine-tuning method, which uses masked images as counterfactual samples that help improve the robustness of the fine-tuning model.
arXiv Detail & Related papers (2023-03-06T11:51:28Z)
- Unleashing Text-to-Image Diffusion Models for Visual Perception [84.41514649568094]
VPD (Visual Perception with a pre-trained diffusion model) is a new framework that exploits the semantic information of a pre-trained text-to-image diffusion model in visual perception tasks.
We show that a pre-trained diffusion model can be adapted to downstream visual perception tasks faster using the proposed VPD.
arXiv Detail & Related papers (2023-03-03T18:59:47Z)
- Representation Learning with Diffusion Models [0.0]
Diffusion models (DMs) have achieved state-of-the-art results for image synthesis tasks as well as density estimation.
We introduce a framework for learning such representations with diffusion models (LRDM).
In particular, the DM and the representation encoder are trained jointly in order to learn rich representations specific to the generative denoising process.
arXiv Detail & Related papers (2022-10-20T07:26:47Z)
- DDPM-CD: Denoising Diffusion Probabilistic Models as Feature Extractors for Change Detection [31.125812018296127]
We introduce a novel approach for change detection by pre-training a Denoising Diffusion Probabilistic Model (DDPM).
DDPM learns the training data distribution by gradually converting training images into a Gaussian distribution using a Markov chain.
During inference (i.e., sampling), they can generate a diverse set of samples closer to the training distribution.
Experiments conducted on the LEVIR-CD, WHU-CD, DSIFN-CD, and CDD datasets demonstrate that the proposed DDPM-CD method significantly outperforms the existing change detection methods in terms of F1 score, I
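The forward Markov chain described in this entry can be sketched in a few lines. This is a hedged toy illustration with 1-D "pixels" and a linear beta schedule (a common choice, assumed here), not the DDPM-CD implementation:

```python
import numpy as np

# Forward (noising) chain: x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * noise.
# After enough steps, x_T is close to a standard Gaussian regardless of x_0.
rng = np.random.default_rng(1)
betas = np.linspace(1e-4, 0.02, 1000)    # linear noise schedule

x = rng.uniform(-1.0, 1.0, size=4096)    # toy "image" pixels in [-1, 1]
for beta in betas:
    x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * rng.normal(size=x.size)

# x now has mean near 0 and standard deviation near 1.
print(x.mean(), x.std())
```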
arXiv Detail & Related papers (2022-06-23T17:58:29Z) - Diffusion Autoencoders: Toward a Meaningful and Decodable Representation [1.471992435706872]
Diffusion models (DPMs) have achieved remarkable quality in image generation that rivals GANs'.
Unlike GANs, DPMs use a set of latent variables that lack semantic meaning and cannot serve as a useful representation for other tasks.
This paper explores the possibility of using DPMs for representation learning and seeks to extract a meaningful and decodable representation of an input image via autoencoding.
arXiv Detail & Related papers (2021-11-30T18:24:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.