UniFL: Improve Stable Diffusion via Unified Feedback Learning
- URL: http://arxiv.org/abs/2404.05595v2
- Date: Wed, 22 May 2024 13:52:09 GMT
- Title: UniFL: Improve Stable Diffusion via Unified Feedback Learning
- Authors: Jiacheng Zhang, Jie Wu, Yuxi Ren, Xin Xia, Huafeng Kuang, Pan Xie, Jiashi Li, Xuefeng Xiao, Min Zheng, Lean Fu, Guanbin Li,
- Abstract summary: We present UniFL, a unified framework that leverages feedback learning to enhance diffusion models comprehensively.
UniFL incorporates three key components: perceptual feedback learning, which enhances visual quality; decoupled feedback learning, which improves aesthetic appeal; and adversarial feedback learning, which optimize inference speed.
In-depth experiments and extensive user studies validate the superior performance of our proposed method in enhancing both the quality of generated models and their acceleration.
- Score: 51.18278664629821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have revolutionized the field of image generation, leading to the proliferation of high-quality models and diverse downstream applications. However, despite these significant advancements, the current competitive solutions still suffer from several limitations, including inferior visual quality, a lack of aesthetic appeal, and inefficient inference, without a comprehensive solution in sight. To address these challenges, we present UniFL, a unified framework that leverages feedback learning to enhance diffusion models comprehensively. UniFL stands out as a universal, effective, and generalizable solution applicable to various diffusion models, such as SD1.5 and SDXL. Notably, UniFL incorporates three key components: perceptual feedback learning, which enhances visual quality; decoupled feedback learning, which improves aesthetic appeal; and adversarial feedback learning, which optimizes inference speed. In-depth experiments and extensive user studies validate the superior performance of our proposed method in enhancing both the quality of generated models and their acceleration. For instance, UniFL surpasses ImageReward by 17% user preference in terms of generation quality and outperforms LCM and SDXL Turbo by 57% and 20% in 4-step inference. Moreover, we have verified the efficacy of our approach in downstream tasks, including Lora, ControlNet, and AnimateDiff.
Related papers
- DP-IQA: Utilizing Diffusion Prior for Blind Image Quality Assessment in the Wild [54.139923409101044]
We propose a novel IQA method called diffusion priors-based IQA (DP-IQA)
We use pre-trained stable diffusion as the backbone, extract multi-level features from the denoising U-Net, and decode them to estimate the image quality score.
We distill the knowledge in the above model into a CNN-based student model, significantly reducing the parameter to enhance applicability.
arXiv Detail & Related papers (2024-05-30T12:32:35Z) - Data-Free Federated Class Incremental Learning with Diffusion-Based Generative Memory [27.651921957220004]
We introduce a novel data-free federated class incremental learning framework with diffusion-based generative memory (DFedDGM)
We design a new balanced sampler to help train the diffusion models to alleviate the common non-IID problem in FL.
We also introduce an entropy-based sample filtering technique from an information theory perspective to enhance the quality of generative samples.
arXiv Detail & Related papers (2024-05-22T20:59:18Z) - Navigating Heterogeneity and Privacy in One-Shot Federated Learning with Diffusion Models [6.921070916461661]
Federated learning (FL) enables multiple clients to train models collectively while preserving data privacy.
One-shot federated learning has emerged as a solution by reducing communication rounds, improving efficiency, and providing better security against eavesdropping attacks.
arXiv Detail & Related papers (2024-05-02T17:26:52Z) - Sample Less, Learn More: Efficient Action Recognition via Frame Feature
Restoration [59.6021678234829]
We propose a novel method to restore the intermediate features for two sparsely sampled and adjacent video frames.
With the integration of our method, the efficiency of three commonly used baselines has been improved by over 50%, with a mere 0.5% reduction in recognition accuracy.
arXiv Detail & Related papers (2023-07-27T13:52:42Z) - LLDiffusion: Learning Degradation Representations in Diffusion Models
for Low-Light Image Enhancement [118.83316133601319]
Current deep learning methods for low-light image enhancement (LLIE) typically rely on pixel-wise mapping learned from paired data.
We propose a degradation-aware learning scheme for LLIE using diffusion models, which effectively integrates degradation and image priors into the diffusion process.
arXiv Detail & Related papers (2023-07-27T07:22:51Z) - Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from time-consuming, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z) - Training Diffusion Models with Reinforcement Learning [82.29328477109826]
Diffusion models are trained with an approximation to the log-likelihood objective.
In this paper, we investigate reinforcement learning methods for directly optimizing diffusion models for downstream objectives.
We describe how posing denoising as a multi-step decision-making problem enables a class of policy gradient algorithms.
arXiv Detail & Related papers (2023-05-22T17:57:41Z) - Learning Enhancement From Degradation: A Diffusion Model For Fundus
Image Enhancement [21.91300560770087]
We introduce a novel diffusion model based framework, named Learning Enhancement from Degradation (LED)
LED learns degradation mappings from unpaired high-quality to low-quality images.
LED is able to output enhancement results that maintain clinically important features with better clarity.
arXiv Detail & Related papers (2023-03-08T14:14:49Z) - AnycostFL: Efficient On-Demand Federated Learning over Heterogeneous
Edge Devices [20.52519915112099]
We propose a cost-adjustable FL framework, named AnycostFL, that enables diverse edge devices to efficiently perform local updates.
Experiment results indicate that, our learning framework can reduce up to 1.9 times of the training latency and energy consumption for realizing a reasonable global testing accuracy.
arXiv Detail & Related papers (2023-01-08T15:25:55Z) - Multi-View Attention Transfer for Efficient Speech Enhancement [1.6932706284468382]
We propose multi-view attention transfer (MV-AT), a feature-based distillation, to obtain efficient speech enhancement models in the time domain.
Based on the multi-view features extraction model, MV-AT transfers multi-view knowledge of the teacher network to the student network without additional parameters.
arXiv Detail & Related papers (2022-08-22T14:47:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.