Related papers: RectifiedHR: High-Resolution Diffusion via Energy Profiling and Adaptive Guidance Scheduling

URL: http://arxiv.org/abs/2507.09441v1
Date: Sun, 13 Jul 2025 01:21:10 GMT
Title: RectifiedHR: High-Resolution Diffusion via Energy Profiling and Adaptive Guidance Scheduling
Authors: Ankit Sanjyal
Abstract summary: High-resolution image synthesis with diffusion models often suffers from energy instabilities and guidance artifacts that degrade visual quality. We analyze the latent energy landscape during sampling and propose adaptive classifier-free guidance (CFG) schedules that maintain stable energy trajectories. Our approach introduces energy-aware scheduling strategies that modulate guidance strength over time, achieving superior stability scores (0.9998) and consistency metrics (0.9873) compared to fixed-guidance approaches.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: High-resolution image synthesis with diffusion models often suffers from energy instabilities and guidance artifacts that degrade visual quality. We analyze the latent energy landscape during sampling and propose adaptive classifier-free guidance (CFG) schedules that maintain stable energy trajectories. Our approach introduces energy-aware scheduling strategies that modulate guidance strength over time, achieving superior stability scores (0.9998) and consistency metrics (0.9873) compared to fixed-guidance approaches. We demonstrate that DPM++ 2M with linear-decreasing CFG scheduling yields optimal performance, providing sharper, more faithful images while reducing artifacts. Our energy profiling framework serves as a powerful diagnostic tool for understanding and improving diffusion model behavior.
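The abstract names two ingredients: a guidance scale that decreases linearly over the sampling steps, and an energy measurement on the latents. The sketch below illustrates both under stated assumptions. The start and end scales (12.0 and 4.0) are hypothetical values chosen for illustration, the function names are invented here, and "energy" is assumed to mean the mean squared value of the latent; the abstract does not specify any of these. The sampler itself (DPM++ 2M) is not reproduced; random arrays stand in for the model's noise predictions.

```python
import numpy as np


def linear_decreasing_cfg(step, num_steps, w_start=12.0, w_end=4.0):
    """Guidance scale decaying linearly from w_start (first step) to w_end (last step).

    w_start and w_end are hypothetical defaults, not values from the paper.
    """
    if num_steps < 2:
        return float(w_start)
    frac = step / (num_steps - 1)
    return float(w_start + (w_end - w_start) * frac)


def apply_cfg(eps_uncond, eps_cond, scale):
    """Standard classifier-free guidance combination of two noise predictions."""
    return eps_uncond + scale * (eps_cond - eps_uncond)


def latent_energy(latent):
    """Assumed energy definition: mean squared value of the latent tensor."""
    return float(np.mean(np.square(latent)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_steps = 5
    for step in range(num_steps):
        w = linear_decreasing_cfg(step, num_steps)
        # Random stand-ins for the unconditional and conditional predictions.
        eps_u = rng.standard_normal((4, 8, 8))
        eps_c = rng.standard_normal((4, 8, 8))
        guided = apply_cfg(eps_u, eps_c, w)
        print(step, round(w, 2), round(latent_energy(guided), 3))
```

Logging `latent_energy` at every step of a real sampling loop would give an energy trajectory of the kind the abstract describes profiling, which can then be compared between a fixed scale and the decreasing schedule.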

Related papers

Learning from Heterogeneity: Generalizing Dynamic Facial Expression Recognition via Distributionally Robust Optimization [23.328511708942045]
The Heterogeneity-aware Distributional Framework (HDF) is designed to enhance time-frequency modeling and mitigate the imbalance caused by hard samples. A Time-Frequency Distributional Attention Module (DAM) captures both temporal consistency and frequency robustness. An adaptive optimization module, the Distribution-aware Scaling Module (DSM), is introduced to dynamically balance classification and contrastive losses.
arXiv Detail & Related papers (2025-07-21T16:21:47Z)
Dual-Expert Consistency Model for Efficient and High-Quality Video Generation [57.33788820909211]
We propose a parameter-efficient Dual-Expert Consistency Model (DCM), where a semantic expert focuses on learning semantic layout and motion, while a detail expert specializes in fine detail refinement. Our approach achieves state-of-the-art visual quality with significantly reduced sampling steps, demonstrating the effectiveness of expert specialization in video diffusion model distillation.
arXiv Detail & Related papers (2025-06-03T17:55:04Z)
ROCM: RLHF on consistency models [8.905375742101707]
We propose a reward optimization framework for applying RLHF to consistency models. We investigate various $f$-divergences as regularization strategies, striking a balance between reward and model consistency.
arXiv Detail & Related papers (2025-03-08T11:19:48Z)
Efficient Training-Free High-Resolution Synthesis with Energy Rectification in Diffusion Models [29.69501919628436]
Diffusion models have achieved remarkable progress across various visual generation tasks. However, their performance significantly declines when generating content at resolutions higher than those used during training. We propose RectifiedHR, a solution for training-free high-resolution synthesis.
arXiv Detail & Related papers (2025-03-04T12:03:26Z)
Self-Consistent Model-based Adaptation for Visual Reinforcement Learning [27.701421196547674]
Visual reinforcement learning agents face serious performance declines in real-world applications caused by visual distractions. Existing methods rely on fine-tuning the policy's representations with hand-crafted augmentations. We propose Self-Consistent Model-based Adaptation (SCMA), a novel method that fosters robust adaptation without modifying the policy.
arXiv Detail & Related papers (2025-02-14T05:23:56Z)
Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints [51.83081671798784]
Diffusion Transformers (DiT) have emerged as a powerful architecture for image and video generation, offering superior quality and scalability. DiT's practical application suffers from inherent dynamic feature instability, leading to error amplification during cached inference. We propose Skip-DiT, an image and video generative DiT variant enhanced with Long-Skip-Connections (LSCs), the key efficiency component in U-Nets.
arXiv Detail & Related papers (2024-11-26T17:28:10Z)
Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention [0.7770029179741429]
Conditional diffusion models have shown remarkable success in visual content generation. Recent attempts to extend unconditional guidance have relied on techniques, resulting in suboptimal generation quality. We propose Smoothed Energy Guidance (SEG), a novel training- and condition-free approach to enhance image generation.
arXiv Detail & Related papers (2024-08-01T17:59:09Z)
PASTA: Towards Flexible and Efficient HDR Imaging Via Progressively Aggregated Spatio-Temporal Alignment [91.38256332633544]
PASTA is a Progressively Aggregated Spatio-Temporal Alignment framework for HDR deghosting. Our approach achieves effectiveness and efficiency by harnessing hierarchical representation during feature disentanglement. Experimental results showcase PASTA's superiority over current SOTA methods in both visual quality and performance metrics.
arXiv Detail & Related papers (2024-03-15T15:05:29Z)
Trajectory Consistency Distillation: Improved Latent Consistency Distillation by Semi-Linear Consistency Function with Trajectory Mapping [75.72212215739746]
Trajectory Consistency Distillation (TCD) encompasses a trajectory consistency function and strategic sampling. TCD not only significantly enhances image quality at low NFEs but also yields more detailed results compared to the teacher model.
arXiv Detail & Related papers (2024-02-29T13:44:14Z)
Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from time-consuming inference, excessive computational resource consumption, and unstable restoration. We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)
Diffusion Probabilistic Model Made Slim [128.2227518929644]
We introduce a customized design for slim diffusion probabilistic models (DPM) for light-weight image synthesis. We achieve 8-18x computational complexity reduction as compared to the latent diffusion models on a series of conditional and unconditional image generation tasks.
arXiv Detail & Related papers (2022-11-27T16:27:28Z)
Uncovering the Over-smoothing Challenge in Image Super-Resolution: Entropy-based Quantification and Contrastive Optimization [67.99082021804145]
We propose an explicit solution to the COO problem, called Detail Enhanced Contrastive Loss (DECLoss). DECLoss utilizes the clustering property of contrastive learning to directly reduce the variance of the potential high-resolution distribution. We evaluate DECLoss on multiple super-resolution benchmarks and demonstrate that it improves the perceptual quality of PSNR-oriented models.
arXiv Detail & Related papers (2022-01-04T08:30:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.