LatentCRF: Continuous CRF for Efficient Latent Diffusion
- URL: http://arxiv.org/abs/2412.18596v1
- Date: Tue, 24 Dec 2024 18:51:11 GMT
- Title: LatentCRF: Continuous CRF for Efficient Latent Diffusion
- Authors: Kanchana Ranasinghe, Sadeep Jayasumana, Andreas Veit, Ayan Chakrabarti, Daniel Glasner, Michael S Ryoo, Srikumar Ramalingam, Sanjiv Kumar
- Abstract summary: We introduce LatentCRF, a continuous Conditional Random Field (CRF) model, implemented as a neural network layer.
LatentCRF models the spatial and semantic relationships among the latent vectors in the Latent Diffusion Models.
We increase inference efficiency by 33% with no loss in image quality or diversity compared to the full LDM.
- Abstract: Latent Diffusion Models (LDMs) produce high-quality, photo-realistic images; however, the latency incurred by multiple costly inference iterations can restrict their applicability. We introduce LatentCRF, a continuous Conditional Random Field (CRF) model, implemented as a neural network layer, that models the spatial and semantic relationships among the latent vectors in the LDM. By replacing some of the computationally intensive LDM inference iterations with our lightweight LatentCRF, we achieve a superior balance between quality, speed and diversity. We increase inference efficiency by 33% with no loss in image quality or diversity compared to the full LDM. LatentCRF is an easy add-on, which does not require modifying the LDM.
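The abstract's core idea, replacing some expensive denoising iterations with a lightweight layer that refines latent vectors, can be sketched as follows. This is a hypothetical illustration only: `LightweightRefiner`, `sample`, and `unet_step` are invented names, and the depthwise convolution stands in for the paper's actual CRF inference, which is not specified here.

```python
import torch
import torch.nn as nn

class LightweightRefiner(nn.Module):
    """Hypothetical stand-in for LatentCRF: a cheap layer that iteratively
    refines latent vectors using local spatial context."""

    def __init__(self, channels: int = 4, iterations: int = 3):
        super().__init__()
        self.iterations = iterations
        # A depthwise conv acts as inexpensive "message passing" among
        # neighbouring latent vectors (a crude proxy for CRF inference).
        self.spatial = nn.Conv2d(channels, channels, kernel_size=3,
                                 padding=1, groups=channels, bias=False)
        self.mix = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        for _ in range(self.iterations):
            z = z + self.mix(self.spatial(z))  # residual refinement step
        return z

def sample(unet_step, z: torch.Tensor, total_steps: int, skipped: int,
           refiner: LightweightRefiner) -> torch.Tensor:
    # Run only the first (total_steps - skipped) expensive UNet steps...
    for t in range(total_steps, skipped, -1):
        z = unet_step(z, t)
    # ...then hand the latent to the cheap refiner instead of the
    # remaining `skipped` diffusion iterations.
    return refiner(z)
```

The key design point matching the abstract is that the refiner is an add-on after an unmodified sampling loop, so the underlying LDM itself does not need to change.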
Related papers
- InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration [106.70903819362402]
Diffusion priors have been used for blind face restoration (BFR) by fine-tuning diffusion models (DMs) on restoration datasets to recover low-quality images.
We propose InterLCM to leverage the latent consistency model (LCM) for its superior semantic consistency and efficiency.
InterLCM outperforms existing approaches in both synthetic and real-world datasets while also achieving faster inference speed.
arXiv Detail & Related papers (2025-02-04T10:51:20Z)
- Boosting Latent Diffusion with Perceptual Objectives [29.107038084215514]
Latent diffusion models (LDMs) power state-of-the-art high-resolution generative image models.
We propose to leverage the internal features of the decoder to define a latent perceptual loss (LPL).
This loss encourages the models to create sharper and more realistic images.
arXiv Detail & Related papers (2024-11-06T16:28:21Z)
- A Flow-based Truncated Denoising Diffusion Model for Super-resolution Magnetic Resonance Spectroscopic Imaging [34.32290273033808]
This work introduces a Flow-based Truncated Denoising Diffusion Model for super-resolution MRSI.
It shortens the diffusion process by truncating the diffusion chain, and the truncated steps are estimated using a normalizing flow-based network.
We demonstrate that FTDDM outperforms existing generative models while speeding up the sampling process by over 9-fold.
arXiv Detail & Related papers (2024-10-25T03:42:35Z)
- RCDM: Enabling Robustness for Conditional Diffusion Model [2.4915590770454035]
Conditional diffusion model (CDM) enhances the standard diffusion model by providing more control.
Inaccurate conditional inputs in the inverse process of CDM can easily lead to generating fixed errors in the neural network.
We propose a lightweight solution, the Robust Conditional Diffusion Model (RCDM).
arXiv Detail & Related papers (2024-08-05T13:12:57Z)
- Distilling Diffusion Models into Conditional GANs [90.76040478677609]
We distill a complex multistep diffusion model into a single-step conditional GAN student model.
For efficient regression loss, we propose E-LatentLPIPS, a perceptual loss operating directly in diffusion model's latent space.
We demonstrate that our one-step generator outperforms cutting-edge one-step diffusion distillation models.
arXiv Detail & Related papers (2024-05-09T17:59:40Z)
- LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights [2.8461446020965435]
We introduce LD-Pruner, a novel performance-preserving structured pruning method for compressing Latent Diffusion Models.
We demonstrate the effectiveness of our approach on three different tasks: text-to-image (T2I) generation, Unconditional Image Generation (UIG) and Unconditional Audio Generation (UAG).
arXiv Detail & Related papers (2024-04-18T06:35:37Z)
- Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference [60.32804641276217]
We propose Latent Consistency Models (LCMs), enabling swift inference with minimal steps on any pre-trained LDMs.
A high-quality 768 x 768 24-step LCM takes only 32 A100 GPU hours for training.
We also introduce Latent Consistency Fine-tuning (LCF), a novel method that is tailored for fine-tuning LCMs on customized image datasets.
arXiv Detail & Related papers (2023-10-06T17:11:58Z)
- Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from long runtimes, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)
- Diffusion Probabilistic Model Made Slim [128.2227518929644]
We introduce a customized design for slim diffusion probabilistic models (DPM) for light-weight image synthesis.
We achieve 8-18x computational complexity reduction as compared to the latent diffusion models on a series of conditional and unconditional image generation tasks.
arXiv Detail & Related papers (2022-11-27T16:27:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.