Exploring Diffusion Time-steps for Unsupervised Representation Learning
- URL: http://arxiv.org/abs/2401.11430v1
- Date: Sun, 21 Jan 2024 08:35:25 GMT
- Title: Exploring Diffusion Time-steps for Unsupervised Representation Learning
- Authors: Zhongqi Yue, Jiankun Wang, Qianru Sun, Lei Ji, Eric I-Chao Chang,
Hanwang Zhang
- Abstract summary: We build a theoretical framework that connects the diffusion time-steps and the hidden attributes.
On the CelebA, FFHQ, and Bedroom datasets, the learned feature significantly improves attribute classification.
- Score: 72.43246871893936
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Representation learning is all about discovering the hidden modular
attributes that generate the data faithfully. We explore the potential of the
Denoising Diffusion Probabilistic Model (DM) for unsupervised learning of
modular attributes. We build a theoretical framework that connects the
diffusion time-steps and the hidden attributes, which serves as an effective
inductive bias for unsupervised learning. Specifically, the forward diffusion
process incrementally adds Gaussian noise to samples at each time-step, which
essentially collapses different samples into similar ones by losing attributes,
e.g., fine-grained attributes such as texture are lost with less noise added
(i.e., at early time-steps), while coarse-grained ones such as shape are lost
with more noise (i.e., at late time-steps). To disentangle the modular
attributes, at each time-step t we learn a t-specific feature to compensate
for the newly lost attribute, and the set of all 1,...,t-specific features,
corresponding to the cumulative set of lost attributes, is trained to make up
for the reconstruction error of a pre-trained DM at time-step t. On the CelebA,
FFHQ, and Bedroom datasets, the learned feature significantly improves
attribute classification and enables faithful counterfactual generation, e.g.,
interpolating only one specified attribute between two images, validating the
disentanglement quality. Code is at https://github.com/yue-zhongqi/diti.
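The per-time-step objective described above can be made concrete with a short sketch. The following is a minimal PyTorch sketch under stated assumptions, not the authors' released implementation: `encoder`, `decoder`, `dm`, and `diti_loss` are hypothetical names (with `dm` a frozen pre-trained noise-prediction DM), a standard linear DDPM noise schedule is assumed, and one feature slot per time-step is used for clarity (a practical implementation would likely group time-steps into a small number of stages).

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # assumed linear DDPM schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal retention

def q_sample(x0, t, eps):
    """Forward diffusion: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    abar = alphas_bar[t].view(-1, 1, 1, 1)
    return abar.sqrt() * x0 + (1.0 - abar).sqrt() * eps

def diti_loss(dm, encoder, decoder, x0):
    """Train t-specific features to make up for the frozen DM's
    reconstruction error at a random time-step t (hypothetical names)."""
    B = x0.size(0)
    t = torch.randint(0, T, (B,))
    eps = torch.randn_like(x0)
    x_t = q_sample(x0, t, eps)
    z_all = encoder(x0)                                # (B, T, dim); slot i holds the attribute newly lost at step i
    mask = (torch.arange(T) <= t.view(-1, 1)).float()  # keep only the 1..t-specific slots
    z_cum = z_all * mask.unsqueeze(-1)                 # cumulative set of lost attributes
    with torch.no_grad():
        eps_hat = dm(x_t, t)                           # pre-trained DM stays frozen
    correction = decoder(x_t, t, z_cum)                # feature-conditioned residual
    return ((eps - (eps_hat + correction)) ** 2).mean()
```

Under this sketch, the counterfactual generation mentioned in the abstract would amount to swapping or interpolating only the slots of `z_all` that encode one specified attribute between two images before decoding, leaving all other slots untouched.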
Related papers
- Multiple Descents in Unsupervised Learning: The Role of Noise, Domain Shift and Anomalies [14.399035468023161]
We study the presence of double descent in unsupervised learning, an area that has received little attention and is not yet fully understood.
We use synthetic and real data and identify model-wise, epoch-wise, and sample-wise double descent for various applications.
arXiv Detail & Related papers (2024-06-17T16:24:23Z)
- Few-shot Learner Parameterization by Diffusion Time-steps [133.98320335394004]
Few-shot learning is still challenging when using large multi-modal foundation models.
We propose the Time-step Few-shot (TiF) learner to make up for lost attributes.
The TiF learner significantly outperforms OpenCLIP and its adapters on a variety of fine-grained and customized few-shot learning tasks.
arXiv Detail & Related papers (2024-03-05T04:38:13Z)
- Exploiting Semantic Attributes for Transductive Zero-Shot Learning [97.61371730534258]
Zero-shot learning aims to recognize unseen classes by generalizing the relation between visual features and semantic attributes learned from the seen classes.
We present a novel transductive ZSL method that produces semantic attributes of the unseen data and imposes them on the generative process.
Experiments on five standard benchmarks show that our method yields state-of-the-art results for zero-shot learning.
arXiv Detail & Related papers (2023-03-17T09:09:48Z)
- Attribute Graphs Underlying Molecular Generative Models: Path to Learning with Limited Data [42.517927809224275]
We provide an algorithm that relies on perturbation experiments on the latent codes of a pre-trained generative autoencoder to uncover an attribute graph.
We show that one can fit an effective graphical model that captures a structural equation model between latent codes.
Using a generative autoencoder pre-trained on a large dataset of small molecules, we demonstrate that the graphical model can be used to predict a specific property.
arXiv Detail & Related papers (2022-07-14T19:20:30Z)
- Adaptive Memory Networks with Self-supervised Learning for Unsupervised Anomaly Detection [54.76993389109327]
Unsupervised anomaly detection aims to build models that detect unseen anomalies by training only on normal data.
We propose a novel approach called Adaptive Memory Network with Self-supervised Learning (AMSL) to address these challenges.
AMSL incorporates a self-supervised learning module to learn general normal patterns and an adaptive memory fusion module to learn rich feature representations.
arXiv Detail & Related papers (2022-01-03T03:40:21Z)
- Mitigating Generation Shifts for Generalized Zero-Shot Learning [52.98182124310114]
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize both seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow (GSMFlow) framework for learning unseen data synthesis efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
arXiv Detail & Related papers (2021-07-07T11:43:59Z)
- Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model the data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)