Diffusion Model with Perceptual Loss
- URL: http://arxiv.org/abs/2401.00110v5
- Date: Wed, 6 Mar 2024 20:13:53 GMT
- Title: Diffusion Model with Perceptual Loss
- Authors: Shanchuan Lin, Xiao Yang
- Abstract summary: Diffusion models trained with mean squared error loss tend to generate unrealistic samples.
We show that the effectiveness of classifier-free guidance partly originates from it being a form of implicit perceptual guidance.
We propose a novel self-perceptual objective that results in diffusion models capable of generating more realistic samples.
- Score: 4.67483805599143
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models trained with mean squared error loss tend to generate
unrealistic samples. Current state-of-the-art models rely on classifier-free
guidance to improve sample quality, yet its surprising effectiveness is not
fully understood. In this paper, we show that the effectiveness of
classifier-free guidance partly originates from it being a form of implicit
perceptual guidance. As a result, we can directly incorporate perceptual loss
in diffusion training to improve sample quality. Since the score matching
objective used in diffusion training strongly resembles the denoising
autoencoder objective used in unsupervised training of perceptual networks, the
diffusion model itself is a perceptual network and can be used to generate
meaningful perceptual loss. We propose a novel self-perceptual objective that
results in diffusion models capable of generating more realistic samples. For
conditional generation, our method only improves sample quality without
entanglement with the conditional input and therefore does not sacrifice sample
diversity. Our method can also improve sample quality for unconditional
generation, which was not possible with classifier-free guidance before.
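The abstract does not spell out the training procedure, so the following PyTorch sketch is just one plausible reading of a self-perceptual objective: a frozen copy of the denoiser acts as the perceptual network, and the loss compares its hidden features on the ground truth versus the online model's prediction. The toy architecture, the forward-noising form, and the re-noising step are all assumptions.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDenoiser(nn.Module):
    """Toy stand-in for a diffusion network; `hidden` exposes features."""
    def __init__(self, dim=16):
        super().__init__()
        self.inp = nn.Linear(dim + 1, 64)
        self.out = nn.Linear(64, dim)

    def hidden(self, x_t, t):
        # Intermediate activations reused as "perceptual" features.
        return torch.relu(self.inp(torch.cat([x_t, t[:, None]], dim=-1)))

    def forward(self, x_t, t):
        return self.out(self.hidden(x_t, t))  # predicts x0

def self_perceptual_loss(net, frozen, x0):
    b = x0.shape[0]
    t = torch.rand(b)
    x_t = x0 + torch.randn_like(x0) * t[:, None]   # toy forward noising
    x0_hat = net(x_t, t)                           # online model's prediction
    # Re-noise target and prediction with the SAME fresh noise and timestep,
    # then compare the frozen model's hidden features instead of raw pixels.
    t2 = torch.rand(b)
    eps = torch.randn_like(x0)
    f_real = frozen.hidden(x0 + eps * t2[:, None], t2)
    f_fake = frozen.hidden(x0_hat + eps * t2[:, None], t2)
    return F.mse_loss(f_fake, f_real)

net = TinyDenoiser()
frozen = copy.deepcopy(net).eval().requires_grad_(False)  # the model judges itself
loss = self_perceptual_loss(net, frozen, torch.randn(8, 16))
loss.backward()
```

The point mirrored from the abstract is that no external perceptual network (e.g. VGG or LPIPS) is required; the diffusion model scores its own predictions.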
Related papers
- Your Diffusion Model is Secretly a Noise Classifier and Benefits from Contrastive Training [20.492630610281658]
Diffusion models learn to denoise data and the trained denoiser is then used to generate new samples from the data distribution.
We introduce a new self-supervised training objective that differentiates the levels of noise added to a sample.
We show by diverse experiments that the proposed contrastive diffusion training is effective for both sequential and parallel settings.
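The summary leaves the objective abstract; below is a hypothetical InfoNCE-style sketch of what "differentiating the levels of noise added to a sample" could look like. The embedding network, per-level prototypes, and temperature are illustrative assumptions, not the paper's construction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def noise_level_contrastive_loss(embed, level_protos, x0, sigmas, temp=0.1):
    # One noisy view of x0 per noise level in `sigmas`.
    views = torch.stack([x0 + s * torch.randn_like(x0) for s in sigmas])
    z = F.normalize(embed(views), dim=-1)        # (K, d) view embeddings
    p = F.normalize(level_protos, dim=-1)        # (K, d) per-level prototypes
    logits = z @ p.T / temp                      # similarity of views to levels
    # Each view's positive is its own noise level; all others are negatives.
    return F.cross_entropy(logits, torch.arange(len(sigmas)))

embed = nn.Linear(16, 8)
protos = nn.Parameter(torch.randn(4, 8))
loss = noise_level_contrastive_loss(embed, protos, torch.randn(16), [0.1, 0.3, 0.6, 1.0])
loss.backward()
```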
arXiv Detail & Related papers (2024-07-12T03:03:50Z) - Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data [74.2507346810066]
Ambient diffusion is a recently proposed framework for training diffusion models using corrupted data.
We present the first framework for training diffusion models that provably sample from the uncorrupted distribution given only noisy training data.
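For background, the Tweedie's formula in the title relates the posterior mean of the clean sample to the score of the noisy marginal; for Gaussian corruption $x_t = x_0 + \sigma_t \varepsilon$ it reads:

```latex
\mathbb{E}[x_0 \mid x_t] = x_t + \sigma_t^2 \, \nabla_{x_t} \log p(x_t)
```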
arXiv Detail & Related papers (2024-03-20T14:22:12Z) - Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control [54.132297393662654]
Diffusion models excel at capturing complex data distributions, such as those of natural images and proteins.
While diffusion models are trained to represent the distribution of the training dataset, we are often more concerned with other properties, such as the aesthetic quality of the generated images.
We present theoretical and empirical evidence that demonstrates our framework is capable of efficiently generating diverse samples with high genuine rewards.
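A standard way to formalize such fine-tuning (the paper's exact entropy-regularized control formulation may differ) is reward maximization with a divergence penalty toward the pretrained model $p_{\mathrm{pre}}$, which is what discourages reward over-optimization and preserves sample diversity:

```latex
\max_{\theta}\; \mathbb{E}_{x \sim p_{\theta}}[r(x)] - \alpha \, \mathrm{KL}\big(p_{\theta} \,\|\, p_{\mathrm{pre}}\big)
```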
arXiv Detail & Related papers (2024-02-23T08:54:42Z) - Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [58.20016784231991]
Diffusion models operate over a sequence of timesteps, rather than the instantaneous input-output relationships assumed in prior data-attribution settings.
We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that samples' loss gradient norms are highly dependent on the timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
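For reference, TracIn-style influence of a training sample $z$ on a test sample $z'$ sums gradient dot products over checkpoints $\theta_k$ with learning rates $\eta_k$; the gradient-norm re-normalization shown for ReTrac is an illustrative guess motivated by the summary, not the paper's exact estimator:

```latex
\mathrm{TracIn}(z, z') = \sum_{k} \eta_k \, \nabla_{\theta}\mathcal{L}(\theta_k, z) \cdot \nabla_{\theta}\mathcal{L}(\theta_k, z'), \qquad
\mathrm{ReTrac}(z, z') = \sum_{k} \eta_k \, \frac{\nabla_{\theta}\mathcal{L}(\theta_k, z)}{\lVert \nabla_{\theta}\mathcal{L}(\theta_k, z) \rVert} \cdot \nabla_{\theta}\mathcal{L}(\theta_k, z')
```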
arXiv Detail & Related papers (2024-01-17T07:58:18Z) - Fair Sampling in Diffusion Models through Switching Mechanism [4.990206466948269]
We propose a fairness-aware sampling method, called the attribute switching mechanism, for diffusion models.
We mathematically prove and experimentally demonstrate the effectiveness of the proposed method on two key aspects.
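The summary gives no mechanism details; a minimal sketch of one plausible attribute-switching sampler follows, where `denoise_step`, the transition step `tau`, and the attribute pair are hypothetical names:

```python
import torch

def attribute_switching_sample(denoise_step, x_T, attrs, tau, T=1000):
    """Run the reverse process with attribute attrs[0] during the early
    (high-noise) steps, then switch to attrs[1] at step `tau`, so samples
    for both groups share the same coarse structure."""
    x = x_T
    for t in reversed(range(T)):
        attr = attrs[0] if t >= tau else attrs[1]   # the switch
        x = denoise_step(x, t, attr)
    return x

# Toy usage with a dummy reverse step:
step = lambda x, t, a: 0.99 * x + 0.01 * a
out = attribute_switching_sample(step, torch.randn(4, 8), attrs=(0.0, 1.0), tau=500)
```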
arXiv Detail & Related papers (2024-01-06T06:55:26Z) - Bridging the Gap: Addressing Discrepancies in Diffusion Model Training for Classifier-Free Guidance [1.6804613362826175]
Diffusion models have emerged as a pivotal advancement in generative models.
In this paper we aim to underscore a discrepancy between conventional training methods and the desired conditional sampling behavior.
We introduce an updated loss function that better aligns training objectives with sampling behaviors.
arXiv Detail & Related papers (2023-11-02T02:03:12Z) - Your Diffusion Model is Secretly a Zero-Shot Classifier [90.40799216880342]
We show that density estimates from large-scale text-to-image diffusion models can be leveraged to perform zero-shot classification.
Our generative approach to classification attains strong results on a variety of benchmarks.
Our results are a step toward using generative over discriminative models for downstream tasks.
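A sketch of how such density-estimate classification can work: noise the input, ask the conditional denoiser to predict the noise under each candidate class, and pick the class with the lowest expected error. The `eps_model` interface and the DDPM-style schedule are assumptions.

```python
import torch

def diffusion_classify(eps_model, x0, class_embs, n_trials=32, T=1000):
    # DDPM-style linear beta schedule (assumed).
    alphas_bar = torch.cumprod(1 - torch.linspace(1e-4, 0.02, T), dim=0)
    errs = torch.zeros(len(class_embs))
    for _ in range(n_trials):
        t = torch.randint(0, T, (1,)).item()
        eps = torch.randn_like(x0)
        x_t = alphas_bar[t].sqrt() * x0 + (1 - alphas_bar[t]).sqrt() * eps
        # Reuse the same (t, eps) across classes so errors are comparable.
        for i, c in enumerate(class_embs):
            errs[i] += ((eps_model(x_t, t, c) - eps) ** 2).mean()
    return int(errs.argmin())   # lowest denoising error ~ most likely class

# Toy usage with a dummy conditional noise predictor:
model = lambda x_t, t, c: x_t * 0.0 + c
label = diffusion_classify(model, torch.randn(16), class_embs=[-1.0, 0.0, 1.0])
```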
arXiv Detail & Related papers (2023-03-28T17:59:56Z) - StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation [20.262426487434393]
We present a regeneration approach where an estimate given by a predictive model is provided as a guide for further diffusion.
We show that the proposed approach uses the predictive model to remove vocalizing and breathing artifacts while producing very high-quality samples.
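A hypothetical sketch of the regeneration idea as summarized: the predictive estimate seeds the process, is partially re-noised, and the remaining reverse-diffusion steps resynthesize detail. Function names, the noise level `tau`, and the step count are assumptions.

```python
import torch

def stochastic_regeneration(predictive, reverse_step, y, tau=0.5, n_steps=50):
    d = predictive(y)                      # initial predictive estimate D(y)
    x = d + tau * torch.randn_like(d)      # re-noise it to intermediate level tau
    for t in torch.linspace(tau, 0.0, n_steps):
        x = reverse_step(x, t, y)          # remaining reverse diffusion, guided by y
    return x

# Toy usage with dummy components:
enhance = lambda y: 0.9 * y
step = lambda x, t, y: x - 0.01 * (x - y)
clean = stochastic_regeneration(enhance, step, torch.randn(1, 16000))
```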
arXiv Detail & Related papers (2022-12-22T16:35:42Z) - Classifier-Free Diffusion Guidance [17.355749359987648]
Classifier guidance is a recently introduced method to trade off mode coverage and sample fidelity in conditional diffusion models.
We show that guidance can indeed be performed by a pure generative model without such a classifier.
We combine the resulting conditional and unconditional score estimates to attain a trade-off between sample quality and diversity.
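Concretely, the combination described here is commonly written as an extrapolation of the conditional and unconditional noise estimates with a guidance weight $w$, where larger $w$ trades diversity for fidelity:

```latex
\tilde{\varepsilon}_{\theta}(x_t, c) = (1 + w)\, \varepsilon_{\theta}(x_t, c) - w\, \varepsilon_{\theta}(x_t)
```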
arXiv Detail & Related papers (2022-07-26T01:42:07Z) - How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z) - Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.
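One common way to write an NDA-augmented GAN objective mixes NDA samples $\bar{x}$ into the fake distribution seen by the discriminator; the mixture weight $\lambda$ and its exact placement are assumptions:

```latex
\min_{G}\max_{D}\; \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)]
+ \lambda\, \mathbb{E}_{z}[\log(1 - D(G(z)))]
+ (1 - \lambda)\, \mathbb{E}_{\bar{x} \sim \bar{p}}[\log(1 - D(\bar{x}))]
```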
arXiv Detail & Related papers (2021-02-09T20:28:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.