Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention
- URL: http://arxiv.org/abs/2408.00760v2
- Date: Tue, 1 Oct 2024 01:04:58 GMT
- Title: Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention
- Authors: Susung Hong
- Abstract summary: Conditional diffusion models have shown remarkable success in visual content generation.
Recent attempts to extend guidance to unconditional models have relied on heuristic techniques, resulting in suboptimal generation quality.
We propose Smoothed Energy Guidance (SEG), a novel training- and condition-free approach to enhance image generation.
- Score: 0.7770029179741429
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conditional diffusion models have shown remarkable success in visual content generation, producing high-quality samples across various domains, largely due to classifier-free guidance (CFG). Recent attempts to extend guidance to unconditional models have relied on heuristic techniques, resulting in suboptimal generation quality and unintended effects. In this work, we propose Smoothed Energy Guidance (SEG), a novel training- and condition-free approach that leverages the energy-based perspective of the self-attention mechanism to enhance image generation. By defining the energy of self-attention, we introduce a method to reduce the curvature of the energy landscape of attention and use the output as the unconditional prediction. Practically, we control the curvature of the energy landscape by adjusting the Gaussian kernel parameter while keeping the guidance scale parameter fixed. Additionally, we present a query blurring method that is equivalent to blurring the entire attention weights without incurring quadratic complexity in the number of tokens. In our experiments, SEG achieves a Pareto improvement in both quality and the reduction of side effects. The code is available at https://github.com/SusungHong/SEG-SDXL.
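The query-blurring claim in the abstract can be checked with a small numerical sketch. It assumes that the "attention weights" being blurred are the pre-softmax scores Q·Kᵀ, and it blurs along a 1D token axis with edge padding rather than over a 2D image grid. The helper names (`gaussian_kernel`, `blur_rows`, `matmul_transposed`) are hypothetical and are not taken from the SEG-SDXL repository. Under those assumptions, the blur is linear, so blurring the n × d query matrix gives the same scores as blurring the n × n score matrix along the query axis.

```python
import math
import random


def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian weights on offsets -radius..radius."""
    weights = [math.exp(-(d * d) / (2.0 * sigma * sigma))
               for d in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(matrix, kernel):
    """Blur along the row (token) axis: each output row mixes nearby rows."""
    n = len(matrix)
    radius = len(kernel) // 2
    out = []
    for i in range(n):
        row = [0.0] * len(matrix[0])
        for offset, w in zip(range(-radius, radius + 1), kernel):
            j = min(max(i + offset, 0), n - 1)  # clamp index (edge padding)
            for c, value in enumerate(matrix[j]):
                row[c] += w * value
        out.append(row)
    return out


def matmul_transposed(a, b):
    """Return a @ b^T for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(ra, rb)) for rb in b] for ra in a]


random.seed(0)
n_tokens, dim = 6, 4  # toy sizes, chosen only for illustration
Q = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]
K = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]
kernel = gaussian_kernel(sigma=1.5, radius=2)

# Route 1: blur the queries (n x d), then compute the scores.
scores_from_blurred_queries = matmul_transposed(blur_rows(Q, kernel), K)
# Route 2: compute the full scores (n x n), then blur along the query axis.
blurred_scores = blur_rows(matmul_transposed(Q, K), kernel)

# Largest elementwise difference between the two routes.
max_gap = max(abs(x - y)
              for ra, rb in zip(scores_from_blurred_queries, blurred_scores)
              for x, y in zip(ra, rb))
```

Here `max_gap` stays at floating-point rounding level, which is the sense in which the two routes agree. Route 1 filters n·d entries instead of n², which is consistent with the abstract's remark about avoiding quadratic complexity in the number of tokens.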
Related papers
- SeaDAG: Semi-autoregressive Diffusion for Conditional Directed Acyclic Graph Generation [83.52157311471693]
We introduce SeaDAG, a semi-autoregressive diffusion model for conditional generation of Directed Acyclic Graphs (DAGs).
Unlike conventional autoregressive generation that lacks a global graph structure view, our method maintains a complete graph structure at each diffusion step.
We explicitly train the model to learn graph conditioning with a condition loss, which enhances the diffusion model's capacity to generate realistic DAGs.
arXiv Detail & Related papers (2024-10-21T15:47:03Z)
- Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach [11.878350833222711]
We propose a method called GradSamp for sampling gradient updates from a Gaussian distribution.
GradSamp not only streamlines gradient but also enables skipping entire epochs, thereby enhancing overall efficiency.
We rigorously validate our hypothesis across a diverse set of standard and non-standard CNN and transformer-based models.
arXiv Detail & Related papers (2024-06-11T15:01:20Z)
- Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models [73.88009808326387]
We propose a novel spectrum-aware adaptation framework for generative models.
Our method adjusts both singular values and their basis vectors of pretrained weights.
We introduce Spectral Ortho Decomposition Adaptation (SODA), which balances computational efficiency and representation capacity.
arXiv Detail & Related papers (2024-05-31T17:43:35Z)
- Mitigating Over-Smoothing and Over-Squashing using Augmentations of Forman-Ricci Curvature [1.1126342180866644]
We propose a rewiring technique based on Augmented Forman-Ricci curvature (AFRC), a scalable curvature notation.
We prove that AFRC effectively characterizes over-smoothing and over-squashing effects in message-passing GNNs.
arXiv Detail & Related papers (2023-09-17T21:43:18Z)
- Unifying over-smoothing and over-squashing in graph neural networks: A physics informed approach and beyond [45.370565281567984]
Graph Neural Networks (GNNs) have emerged as one of the leading approaches for machine learning on graph-structured data.
Critical computational challenges such as over-smoothing, over-squashing, and limited expressive power continue to impact the performance of GNNs.
We introduce the Multi-Scaled Heat Kernel based GNN (MHKG) by amalgamating diverse filtering functions' effects on node features.
arXiv Detail & Related papers (2023-09-06T06:22:18Z)
- Controlling Text-to-Image Diffusion by Orthogonal Finetuning [74.21549380288631]
We introduce a principled finetuning method -- Orthogonal Finetuning (OFT) for adapting text-to-image diffusion models to downstream tasks.
Unlike existing methods, OFT can provably preserve hyperspherical energy which characterizes the pairwise neuron relationship on the unit hypersphere.
We empirically show that our OFT framework outperforms existing methods in generation quality and convergence speed.
arXiv Detail & Related papers (2023-06-12T17:59:23Z)
- Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet they suffer from long runtimes, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)
- Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs)
GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations.
We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
arXiv Detail & Related papers (2023-04-22T15:32:59Z)
- AEGD: Adaptive Gradient Descent with Energy [0.0]
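We propose AEGD, a new algorithm for first-order gradient-based optimization of non-convex objective functions, based on a dynamically updated energy variable.
We prove energy-dependent convergence rates of AEGD for both non-convex and convex objectives, recovering the desired convergence rates for a suitably small step size.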
arXiv Detail & Related papers (2020-10-10T22:17:27Z)
- Targeted free energy estimation via learned mappings [66.20146549150475]
Free energy perturbation (FEP) was proposed by Zwanzig more than six decades ago as a method to estimate free energy differences.
FEP suffers from a severe limitation: the requirement of sufficient overlap between distributions.
One strategy to mitigate this problem, called Targeted Free Energy Perturbation, uses a high-dimensional mapping in configuration space to increase overlap.
arXiv Detail & Related papers (2020-02-12T11:10:00Z)