Diffusion Model with Perceptual Loss
- URL: http://arxiv.org/abs/2401.00110v7
- Date: Sat, 08 Mar 2025 03:14:40 GMT
- Title: Diffusion Model with Perceptual Loss
- Authors: Shanchuan Lin, Xiao Yang
- Abstract summary: We show that the choice of the loss objective is the underlying reason raw diffusion models fail to generate desirable samples. We train a diffusion model with a novel self-perceptual loss objective and obtain much more realistic samples without the need for guidance.
- Score: 3.9571411466709847
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models without guidance generate very unrealistic samples. Guidance is used ubiquitously, and previous research has attributed its effect to low-temperature sampling that improves quality by trading off diversity. However, this perspective is incomplete. Our research shows that the choice of the loss objective is the underlying reason raw diffusion models fail to generate desirable samples. In this paper, (1) our analysis shows that the loss objective plays an important role in shaping the learned distribution, and that the MSE loss derived from theory rests on assumptions that do not hold for real data; (2) we explain the effectiveness of guidance methods from a new perspective of perceptual supervision; (3) we validate our hypothesis by training a diffusion model with a novel self-perceptual loss objective and obtaining much more realistic samples without the need for guidance. We hope our work paves the way for future explorations of the diffusion loss objective.
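As a rough illustration of the proposed direction, here is a minimal, hedged sketch of a self-perceptual objective: rather than a pixel-space MSE on the predicted noise, a frozen copy of the diffusion model scores the prediction in its own feature space. The toy denoiser, the `features` hook, and the re-noising scheme are illustrative assumptions, not the paper's architecture or exact recipe.

```python
# Hedged sketch: re-noise the model's x0 prediction and the true x0 at a fresh
# timestep, run both through a frozen copy of the model, and match hidden
# features instead of raw pixels. All interfaces here are assumptions.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDenoiser(nn.Module):
    """Toy epsilon-predictor; `features` exposes a hidden activation."""
    def __init__(self, ch=3, hidden=32):
        super().__init__()
        self.inp = nn.Conv2d(ch, hidden, 3, padding=1)
        self.mid = nn.Conv2d(hidden, hidden, 3, padding=1)
        self.out = nn.Conv2d(hidden, ch, 3, padding=1)

    def features(self, x, t):
        # Hidden activations double as the "perceptual" feature space.
        return F.silu(self.mid(F.silu(self.inp(x))))

    def forward(self, x, t):
        return self.out(self.features(x, t))

def self_perceptual_loss(model, frozen, x0, alphas_cumprod):
    b = x0.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (b,))
    a = alphas_cumprod[t].view(b, 1, 1, 1)
    noise = torch.randn_like(x0)
    xt = a.sqrt() * x0 + (1 - a).sqrt() * noise           # forward diffusion
    eps_hat = model(xt, t)                                # predicted noise
    x0_hat = (xt - (1 - a).sqrt() * eps_hat) / a.sqrt()   # implied clean sample

    # Self-perceptual supervision: compare frozen-model features of the
    # re-noised prediction against those of the re-noised ground truth.
    t2 = torch.randint(0, len(alphas_cumprod), (b,))
    a2 = alphas_cumprod[t2].view(b, 1, 1, 1)
    noise2 = torch.randn_like(x0)
    with torch.no_grad():
        f_target = frozen.features(a2.sqrt() * x0 + (1 - a2).sqrt() * noise2, t2)
    f_pred = frozen.features(a2.sqrt() * x0_hat + (1 - a2).sqrt() * noise2, t2)
    return F.mse_loss(f_pred, f_target)

model = TinyDenoiser()
frozen = copy.deepcopy(model).requires_grad_(False)
alphas_cumprod = torch.linspace(0.999, 0.01, 1000)        # assumed noise schedule
loss = self_perceptual_loss(model, frozen, torch.randn(4, 3, 16, 16), alphas_cumprod)
loss.backward()
```

The point the sketch carries is that the supervision signal lives in a learned feature space rather than raw pixel space, which is how perceptual supervision can stand in for external guidance.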
Related papers
- Causal Lifting of Neural Representations: Zero-Shot Generalization for Causal Inferences [56.23412698865433]
We focus on causal inferences for a target experiment with unlabeled factual outcomes, which are estimated by a predictive model fine-tuned on a similar, labeled experiment.
First, we show that factual outcome estimation via Empirical Risk Minimization (ERM) may fail to yield valid causal inferences on the target population.
We propose Deconfounded Empirical Risk Minimization (DERM), a new simple learning procedure minimizing the risk over a fictitious target population.
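As a loose illustration only (the paper's construction of the fictitious target population is more principled), an importance-weighted ERM has this shape; the weight values below are stand-ins for estimated density ratios.

```python
# Hedged sketch of reweighted ERM in the spirit of DERM: minimize the risk over
# a fictitious target population by importance-weighting source examples.
import torch
import torch.nn.functional as F

def weighted_erm_loss(model, x, y, weights):
    # Per-example loss, reweighted toward the fictitious target population.
    per_example = F.cross_entropy(model(x), y, reduction="none")
    return (weights * per_example).mean()

model = torch.nn.Linear(8, 2)
x, y = torch.randn(32, 8), torch.randint(0, 2, (32,))
weights = torch.rand(32)            # stand-in for estimated density ratios
weights = weights / weights.mean()  # keep the weighted risk on a comparable scale
weighted_erm_loss(model, x, y, weights).backward()
```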
arXiv Detail & Related papers (2025-02-10T10:52:17Z)
- Provable Sample-Efficient Transfer Learning Conditional Diffusion Models via Representation Learning [27.7568230759712]
We take the first step towards understanding the sample efficiency of transfer learning conditional diffusion models through the lens of representation learning.
Our analysis shows that with a well-learned representation from source tasks, the sample complexity of target tasks can be reduced substantially.
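A hedged sketch of the transfer setup such an analysis concerns: the representation learned on source tasks is frozen, and only a small conditional head is trained on the target task, which is where the sample-complexity savings would come from. Names and sizes are illustrative.

```python
# Hedged sketch: frozen source-task representation + lightweight target head.
import torch
import torch.nn as nn

shared_encoder = nn.Sequential(nn.Linear(16, 64), nn.SiLU())  # learned on source tasks
shared_encoder.requires_grad_(False)                          # frozen for the target task
target_head = nn.Linear(64 + 1, 16)  # small head: features + timestep -> score

def target_score(x, t):
    h = shared_encoder(x)
    return target_head(torch.cat([h, t[:, None].float()], dim=-1))

x, t = torch.randn(8, 16), torch.randint(0, 1000, (8,))
out = target_score(x, t)  # only `target_head` receives gradients during training
```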
arXiv Detail & Related papers (2025-02-06T20:39:03Z)
- Towards Understanding Extrapolation: a Causal Lens [53.15488984371969]
We provide a theoretical understanding of when extrapolation is possible and offer principled methods to achieve it.
Under this formulation, we cast the extrapolation problem into a latent-variable identification problem.
Our theory reveals the intricate interplay between the smoothness of the underlying manifold and the properties of the distribution shift.
arXiv Detail & Related papers (2025-01-15T21:29:29Z)
- Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Model [22.39558434131574]
Existing data attribution methods for diffusion models typically quantify the contribution of a training sample.
We argue that using the diffusion loss directly cannot represent such a contribution accurately, because of how the diffusion loss is calculated.
We instead compare predicted distributions directly via an attribution score to analyse the importance of training samples.
arXiv Detail & Related papers (2024-10-24T10:58:17Z)
- Diffusion Models in Low-Level Vision: A Survey [82.77962165415153]
Diffusion model-based solutions have become widely acclaimed for their ability to produce samples of superior quality and diversity.
We present three generic diffusion modeling frameworks and explore their correlations with other deep generative models.
We summarize extended diffusion models applied in other tasks, including medical, remote sensing, and video scenarios.
arXiv Detail & Related papers (2024-06-17T01:49:27Z)
- Learning Latent Graph Structures and their Uncertainty [63.95971478893842]
Graph Neural Networks (GNNs) use relational information as an inductive bias to enhance the model's accuracy.
As task-relevant relations might be unknown, graph structure learning approaches have been proposed to learn them while solving the downstream prediction task.
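A minimal sketch of the general idea, assuming a Concrete/Bernoulli relaxation (the paper's exact parameterization may differ): edge probabilities are learned jointly with the downstream task, and the learned probabilities double as an uncertainty estimate over the structure.

```python
# Hedged sketch: learn a latent graph jointly with a downstream predictor.
import torch
import torch.nn as nn

n, d = 6, 4
edge_logits = nn.Parameter(torch.zeros(n, n))   # learned structure parameters
lin = nn.Linear(d, d)

def forward(x, tau=0.5):
    # Concrete relaxation keeps the sampled adjacency differentiable.
    u = torch.rand(n, n).clamp(1e-6, 1 - 1e-6)
    logistic = torch.log(u) - torch.log(1 - u)
    adj = torch.sigmoid((edge_logits + logistic) / tau)
    return adj @ lin(x)                         # one message-passing step

x = torch.randn(n, d)
loss = forward(x).pow(2).mean()   # stand-in for a downstream prediction loss
loss.backward()                   # gradients reach both `lin` and `edge_logits`
edge_probs = torch.sigmoid(edge_logits)  # doubles as an uncertainty estimate
```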
arXiv Detail & Related papers (2024-05-30T10:49:22Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection is a learning paradigm that allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
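One way to picture density-ratio-based rejection (the ratio estimator below is an untrained stand-in; the paper derives the idealized distribution rather than assuming it):

```python
# Hedged sketch: abstain when the estimated ratio r(x) = q*(x) / p(x) between
# an idealized distribution and the data distribution is too small.
import torch

def predict_with_rejection(classifier, ratio_estimator, x, threshold=0.5):
    r = ratio_estimator(x).squeeze(-1)   # estimated q*(x) / p(x)
    preds = classifier(x).argmax(-1)
    preds[r < threshold] = -1            # -1 encodes "abstain"
    return preds

classifier = torch.nn.Linear(8, 3)
ratio_estimator = torch.nn.Sequential(torch.nn.Linear(8, 1), torch.nn.Softplus())
print(predict_with_rejection(classifier, ratio_estimator, torch.randn(16, 8)))
```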
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control [54.132297393662654]
Diffusion models excel at capturing complex data distributions, such as those of natural images and proteins.
While diffusion models are trained to represent the distribution of the training dataset, we are often more concerned with other properties, such as the aesthetic quality of the generated images.
We present theoretical and empirical evidence that demonstrates our framework is capable of efficiently generating diverse samples with high genuine rewards.
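A hedged sketch of the high-level shape of such fine-tuning: a reward term steers generation while a divergence penalty keeps the model near the pretrained one. The MSE surrogate for the divergence and `reward_fn` are assumptions, not the paper's entropy-regularized control formulation.

```python
# Hedged sketch: reward fine-tuning with a penalty toward the pretrained model.
import copy
import torch
import torch.nn.functional as F

def finetune_loss(model, pretrained, reward_fn, xt, beta=0.1):
    eps_new = model(xt)
    with torch.no_grad():
        eps_ref = pretrained(xt)
    # The penalty discourages drifting far from the pretrained model, which is
    # what preserves diversity while the reward term raises sample quality.
    divergence = F.mse_loss(eps_new, eps_ref)
    return -reward_fn(xt - eps_new).mean() + beta * divergence

model = torch.nn.Linear(8, 8)
pretrained = copy.deepcopy(model).requires_grad_(False)
reward_fn = lambda x: -x.pow(2).sum(-1)   # hypothetical reward: prefer small norm
finetune_loss(model, pretrained, reward_fn, torch.randn(4, 8)).backward()
```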
arXiv Detail & Related papers (2024-02-23T08:54:42Z)
- Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps rather than the instantaneous input-output relationships of earlier settings.
We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that the loss-gradient norms of samples depend strongly on the timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
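A minimal sketch of TracIn-style influence with the re-normalization idea described above; the model and diffusion loss are toy stand-ins, and the exact ReTrac normalization may differ.

```python
# Hedged sketch: first-order influence, optionally re-normalized so training
# samples drawn at high-gradient-norm timesteps do not dominate retrieval.
import torch

def flat_grad(loss, params):
    return torch.cat([g.reshape(-1) for g in torch.autograd.grad(loss, params)])

def influence(model, train_loss, test_loss, renormalize=True):
    params = [p for p in model.parameters() if p.requires_grad]
    g_train, g_test = flat_grad(train_loss, params), flat_grad(test_loss, params)
    score = g_train @ g_test             # TracIn-style gradient dot product
    if renormalize:
        score = score / (g_train.norm() + 1e-8)  # ReTrac-style correction
    return score

model = torch.nn.Linear(4, 4)
diffusion_loss = lambda x: (model(x) - torch.randn_like(x)).pow(2).mean()
print(influence(model, diffusion_loss(torch.randn(2, 4)),
                diffusion_loss(torch.randn(2, 4))))
```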
arXiv Detail & Related papers (2024-01-17T07:58:18Z)
- Unmasking Bias in Diffusion Model Training [40.90066994983719]
Denoising diffusion models have emerged as a dominant approach for image generation.
They still suffer from slow convergence in training and color shift issues in sampling.
In this paper, we identify that these obstacles can be largely attributed to bias and suboptimality inherent in the default training paradigm.
arXiv Detail & Related papers (2023-10-12T16:04:41Z)
- GSURE-Based Diffusion Model Training with Corrupted Data [35.56267114494076]
We propose a novel training technique for generative diffusion models based only on corrupted data.
We demonstrate our technique on face images as well as Magnetic Resonance Imaging (MRI).
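For intuition, a sketch of the plain-SURE special case (Gaussian noise) that GSURE generalizes: the loss is an unbiased estimate of the clean-signal MSE computed from corrupted observations alone. The Hutchinson divergence probe and all hyperparameters are assumptions.

```python
# Hedged sketch: SURE loss for denoising with only noisy data available.
import torch

def sure_loss(denoiser, y, sigma, eps=1e-3):
    fy = denoiser(y)
    # Monte Carlo divergence estimate: b^T (f(y + eps*b) - f(y)) / eps
    b = torch.randn_like(y)
    div = (b * (denoiser(y + eps * b) - fy)).sum() / eps
    # Unbiased estimate of E||f(y) - x||^2 without access to the clean x.
    return (fy - y).pow(2).sum() + 2 * sigma**2 * div - y.numel() * sigma**2

denoiser = torch.nn.Linear(8, 8)
y = torch.randn(16, 8)                  # corrupted observations only
sure_loss(denoiser, y, sigma=0.1).backward()
```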
arXiv Detail & Related papers (2023-05-22T15:27:20Z)
- Diffusion Models are Minimax Optimal Distribution Estimators [49.47503258639454]
We provide the first rigorous analysis on approximation and generalization abilities of diffusion modeling.
We show that when the true density function belongs to the Besov space and the empirical score matching loss is properly minimized, the generated data distribution achieves the nearly minimax optimal estimation rates.
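For reference, the classical nonparametric minimax rate that such a result matches up to logarithmic factors has the form below; the precise norms, the admissible range of (s, p, q), and the constants are specified in the paper.

```latex
% Nonparametric minimax rate (up to log factors) nearly attained by the
% diffusion-based estimator \hat{p}_n when the true density p_0 \in B^s_{p,q}:
\inf_{\hat{p}_n} \sup_{p_0 \in B^s_{p,q}}
  \mathbb{E}\, \| \hat{p}_n - p_0 \|_{\mathrm{TV}} \;\asymp\; n^{-s/(2s+d)}
```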
arXiv Detail & Related papers (2023-03-03T11:31:55Z)
- How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
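A hedged sketch of the bridging idea: with a short diffusion time T, x_T is not yet close to the Gaussian prior, so an auxiliary model first maps prior noise toward the distribution of x_T, after which the standard reverse process runs. Everything below is an illustrative stand-in.

```python
# Hedged sketch: auxiliary bridge from the prior to x_T, then reverse diffusion.
import torch

aux = torch.nn.Linear(8, 8)             # auxiliary bridge g: z ~ N(0, I) -> ~x_T

def sample(reverse_step, num_steps=10):
    x = aux(torch.randn(4, 8))          # bridge the prior / x_T mismatch
    for t in reversed(range(num_steps)):
        x = reverse_step(x, t)          # standard reverse diffusion from x_T
    return x

reverse_step = lambda x, t: 0.99 * x + 0.01 * torch.randn_like(x)  # stand-in
print(sample(reverse_step).shape)
```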
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
- Practical Insights of Repairing Model Problems on Image Classification [3.2932371462787513]
Additional training of a deep learning model can have negative effects on the results, turning an initially positive sample into a negative one (degradation).
In this talk, we will present implications derived from a comparison of methods for reducing degradation.
The results imply that practitioners should continually reassess which repair method is best, taking into account dataset availability and the life cycle of the AI system.
arXiv Detail & Related papers (2022-05-14T19:28:55Z)
- Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance [70.31427277842239]
We introduce a novel debiasing method called confidence regularization.
It discourages models from exploiting biases while enabling them to receive enough incentive to learn from all the training examples.
We evaluate our method on three NLU tasks and show that, in contrast to its predecessors, it improves the performance on out-of-distribution datasets.
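A hedged sketch of the mechanism as summarized: teacher probabilities are smoothed more aggressively where a bias model is confident, and the student distills from the smoothed targets. Per-example temperature scaling is an assumed smoothing mechanism, not necessarily the paper's exact one.

```python
# Hedged sketch: confidence regularization via bias-scaled self-distillation.
import torch
import torch.nn.functional as F

def confidence_regularized_loss(student_logits, teacher_logits, bias_conf):
    # Higher bias-model confidence -> stronger smoothing of the teacher targets,
    # so the student cannot simply reproduce the bias, yet still gets a signal.
    temperature = 1.0 + bias_conf.unsqueeze(-1)
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(F.log_softmax(student_logits, dim=-1), soft_targets,
                    reduction="batchmean")

student_logits = torch.randn(16, 3, requires_grad=True)
teacher_logits = torch.randn(16, 3)
bias_conf = torch.rand(16)              # bias model's confidence per example
confidence_regularized_loss(student_logits, teacher_logits, bias_conf).backward()
```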
arXiv Detail & Related papers (2020-05-01T11:22:55Z)