Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
- URL: http://arxiv.org/abs/2404.13686v3
- Date: Mon, 04 Nov 2024 13:24:18 GMT
- Title: Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
- Authors: Yuxi Ren, Xin Xia, Yanzuo Lu, Jiacheng Zhang, Jie Wu, Pan Xie, Xing Wang, Xuefeng Xiao,
- Abstract summary: Hyper-SD is a novel framework that amalgamates the advantages of ODE Trajectory Preservation and Reformulation.
We introduce Trajectory Segmented Consistency Distillation to progressively perform consistent distillation within pre-defined time-step segments.
We incorporate human feedback learning to boost the performance of the model in a low-step regime and mitigate the performance loss incurred by the distillation process.
- Score: 20.2271205957037
- License:
- Abstract: Recently, a series of diffusion-aware distillation algorithms have emerged to alleviate the computational overhead associated with the multi-step inference process of Diffusion Models (DMs). Current distillation techniques often dichotomize into two distinct aspects: i) ODE Trajectory Preservation; and ii) ODE Trajectory Reformulation. However, these approaches suffer from severe performance degradation or domain shifts. To address these limitations, we propose Hyper-SD, a novel framework that synergistically amalgamates the advantages of ODE Trajectory Preservation and Reformulation, while maintaining near-lossless performance during step compression. Firstly, we introduce Trajectory Segmented Consistency Distillation to progressively perform consistent distillation within pre-defined time-step segments, which facilitates the preservation of the original ODE trajectory from a higher-order perspective. Secondly, we incorporate human feedback learning to boost the performance of the model in a low-step regime and mitigate the performance loss incurred by the distillation process. Thirdly, we integrate score distillation to further improve the low-step generation capability of the model and offer the first attempt to leverage a unified LoRA to support the inference process at all steps. Extensive experiments and user studies demonstrate that Hyper-SD achieves SOTA performance from 1 to 8 inference steps for both SDXL and SD1.5. For example, Hyper-SDXL surpasses SDXL-Lightning by +0.68 in CLIP Score and +0.51 in Aes Score in the 1-step inference.
Related papers
- Distillation-Free One-Step Diffusion for Real-World Image Super-Resolution [81.81748032199813]
We propose a Distillation-Free One-Step Diffusion model.
Specifically, we propose a noise-aware discriminator (NAD) to participate in adversarial training.
We improve the perceptual loss with edge-aware DISTS (EA-DISTS) to enhance the model's ability to generate fine details.
arXiv Detail & Related papers (2024-10-05T16:41:36Z) - One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts.
Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation.
We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
arXiv Detail & Related papers (2024-08-14T11:47:22Z) - EM Distillation for One-step Diffusion Models [65.57766773137068]
We propose a maximum likelihood-based approach that distills a diffusion model to a one-step generator model with minimal loss of quality.
We develop a reparametrized sampling scheme and a noise cancellation technique that together stabilizes the distillation process.
arXiv Detail & Related papers (2024-05-27T05:55:22Z) - Distilling Diffusion Models into Conditional GANs [90.76040478677609]
We distill a complex multistep diffusion model into a single-step conditional GAN student model.
For efficient regression loss, we propose E-LatentLPIPS, a perceptual loss operating directly in diffusion model's latent space.
We demonstrate that our one-step generator outperforms cutting-edge one-step diffusion distillation models.
arXiv Detail & Related papers (2024-05-09T17:59:40Z) - Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation [61.03530321578825]
We introduce Score identity Distillation (SiD), an innovative data-free method that distills the generative capabilities of pretrained diffusion models into a single-step generator.
SiD not only facilitates an exponentially fast reduction in Fr'echet inception distance (FID) during distillation but also approaches or even exceeds the FID performance of the original teacher diffusion models.
arXiv Detail & Related papers (2024-04-05T12:30:19Z) - Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation [24.236841051249243]
Distillation methods aim to shift the model from many-shot to single-step inference.
We introduce Latent Adversarial Diffusion Distillation (LADD), a novel distillation approach overcoming the limitations of ADD.
In contrast to pixel-based ADD, LADD utilizes generative features from pretrained latent diffusion models.
arXiv Detail & Related papers (2024-03-18T17:51:43Z) - Adversarial Diffusion Distillation [18.87099764514747]
Adversarial Diffusion Distillation (ADD) is a novel training approach that efficiently samples large-scale foundational image diffusion models in just 1-4 steps.
We use score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal.
Our model clearly outperforms existing few-step methods in a single step and reaches the performance of state-of-the-art diffusion models (SDXL) in only four steps.
arXiv Detail & Related papers (2023-11-28T18:53:24Z) - SinSR: Diffusion-Based Image Super-Resolution in a Single Step [119.18813219518042]
Super-resolution (SR) methods based on diffusion models exhibit promising results.
But their practical application is hindered by the substantial number of required inference steps.
We propose a simple yet effective method for achieving single-step SR generation, named SinSR.
arXiv Detail & Related papers (2023-11-23T16:21:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.