RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration
- URL: http://arxiv.org/abs/2505.18047v1
- Date: Fri, 23 May 2025 15:52:26 GMT
- Title: RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration
- Authors: Sudarshan Rajagopalan, Kartik Narayan, Vishal M. Patel
- Abstract summary: Latent diffusion models (LDMs) have significantly improved the perceptual quality of All-in-One image Restoration (AiOR) methods. These LDM-based frameworks suffer from slow inference due to their iterative denoising process, rendering them impractical for time-sensitive applications. We propose a novel generative approach for AiOR that significantly outperforms LDM-based models in restoration performance while achieving over $\mathbf{10\times}$ faster inference.
- Score: 27.307331773270676
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of latent diffusion models (LDMs) such as Stable Diffusion has significantly improved the perceptual quality of All-in-One image Restoration (AiOR) methods, while also enhancing their generalization capabilities. However, these LDM-based frameworks suffer from slow inference due to their iterative denoising process, rendering them impractical for time-sensitive applications. To address this, we propose RestoreVAR, a novel generative approach for AiOR that significantly outperforms LDM-based models in restoration performance while achieving over $\mathbf{10\times}$ faster inference. RestoreVAR leverages visual autoregressive modeling (VAR), a recently introduced approach which performs scale-space autoregression for image generation. VAR achieves comparable performance to that of state-of-the-art diffusion transformers with drastically reduced computational costs. To optimally exploit these advantages of VAR for AiOR, we propose architectural modifications and improvements, including intricately designed cross-attention mechanisms and a latent-space refinement module, tailored for the AiOR task. Extensive experiments show that RestoreVAR achieves state-of-the-art performance among generative AiOR methods, while also exhibiting strong generalization capabilities.
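Below is a minimal, hypothetical sketch of the two ideas the abstract highlights: scale-space (next-scale) autoregressive prediction, and conditioning each scale on degraded-image features via cross-attention. It is not the RestoreVAR implementation; all module and parameter names (`ScaleStep`, `next_scale_generation`, the scale schedule) are illustrative assumptions, and it works on continuous latents for simplicity, whereas VAR proper predicts discrete token maps from a multi-scale quantized tokenizer.

```python
# Hypothetical PyTorch sketch of VAR-style next-scale prediction with
# cross-attention conditioning on degraded-image features.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ScaleStep(nn.Module):
    """One next-scale step: refine the upsampled coarse latent, conditioned
    on degraded-image features through cross-attention."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.norm1, self.norm2, self.norm3 = nn.LayerNorm(dim), nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, tokens: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # tokens: (B, H*W, C) latent tokens at the current scale
        # cond:   (B, N, C) features encoded from the degraded input image
        t = self.norm1(tokens)
        tokens = tokens + self.self_attn(t, t, t)[0]
        tokens = tokens + self.cross_attn(self.norm2(tokens), cond, cond)[0]
        tokens = tokens + self.mlp(self.norm3(tokens))
        return tokens


def next_scale_generation(cond: torch.Tensor, scales=(4, 8, 16), dim: int = 256):
    """Coarse-to-fine generation: each scale is predicted in one parallel forward
    pass, then upsampled to seed the next scale (no per-pixel iterative denoising)."""
    step = ScaleStep(dim)
    b = cond.shape[0]
    latent = torch.zeros(b, dim, scales[0], scales[0])  # stand-in for the coarsest start tokens
    for s in scales:
        latent = F.interpolate(latent, size=(s, s), mode="bilinear", align_corners=False)
        tokens = latent.flatten(2).transpose(1, 2)       # (B, s*s, C)
        tokens = step(tokens, cond)                      # one prediction per scale
        latent = tokens.transpose(1, 2).reshape(b, dim, s, s)
    return latent  # finest-scale latent, to be decoded (and optionally refined) into an image


# Toy usage: condition on 64 hypothetical degraded-image feature tokens.
restored_latent = next_scale_generation(torch.randn(2, 64, 256))
print(restored_latent.shape)  # torch.Size([2, 256, 16, 16])
```

The speed advantage claimed in the abstract follows from this structure: the number of autoregressive steps equals the number of scales (a handful), rather than the tens of denoising iterations an LDM requires.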
Related papers
- AEDR: Training-Free AI-Generated Image Attribution via Autoencoder Double-Reconstruction [25.525545133210805]
AEDR (AutoEncoder Double-Reconstruction) is a training-free attribution method designed for generative models with continuous autoencoders. It achieves 25.5% higher attribution accuracy than existing reconstruction-based methods, while requiring only 1% of the computational time.
arXiv Detail & Related papers (2025-07-25T06:34:58Z) - Multi-Step Guided Diffusion for Image Restoration on Edge Devices: Toward Lightweight Perception in Embodied AI [0.0]
We introduce a multi-step optimization strategy within each denoising timestep, significantly enhancing image quality, perceptual accuracy, and generalization. Our experiments on super-resolution and Gaussian deblurring demonstrate that increasing the number of gradient updates per step improves LPIPS and PSNR with minimal latency overhead. Our findings highlight MPGD's potential as a lightweight, plug-and-play restoration module for real-time visual perception in embodied AI agents such as drones and mobile robots. (A simplified sketch of this per-step guidance loop appears after this list.)
arXiv Detail & Related papers (2025-06-08T21:11:25Z) - Enhancing Variational Autoencoders with Smooth Robust Latent Encoding [54.74721202894622]
Variational Autoencoders (VAEs) have played a key role in scaling up diffusion-based generative models. We introduce Smooth Robust Latent VAE (SRL-VAE), a novel adversarial training framework that boosts both generation quality and robustness. Experiments show that SRL-VAE improves both generation quality, in image reconstruction and text-guided image editing, and robustness, against Nightshade attacks and image editing attacks.
arXiv Detail & Related papers (2025-04-24T03:17:57Z) - ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration [75.0053551643052]
We introduce ZipIR, a novel framework that enhances efficiency, scalability, and long-range modeling for high-resolution image restoration. ZipIR employs a highly compressed latent representation that compresses the image by 32x, effectively reducing the number of spatial tokens. ZipIR surpasses existing diffusion-based methods, offering unmatched speed and quality in restoring high-resolution images from severely degraded inputs.
arXiv Detail & Related papers (2025-04-11T14:49:52Z) - One-Step Diffusion Model for Image Motion-Deblurring [85.76149042561507]
We propose a one-step diffusion model for deblurring (OSDD), a novel framework that reduces the denoising process to a single step. To tackle fidelity loss in diffusion models, we introduce an enhanced variational autoencoder (eVAE), which improves structural restoration. Our method achieves strong performance on both full-reference and no-reference metrics.
arXiv Detail & Related papers (2025-03-09T09:39:57Z) - Navigating Image Restoration with VAR's Distribution Alignment Prior [6.0648320320309885]
VAR, a novel image generative paradigm, surpasses diffusion models in generation quality by applying a next-scale prediction approach. We formulate the multi-scale latent representations within VAR as the restoration prior, thus advancing our delicately designed VarFormer framework.
arXiv Detail & Related papers (2024-12-30T16:32:55Z) - RAP-SR: RestorAtion Prior Enhancement in Diffusion Models for Realistic Image Super-Resolution [36.137383171027615]
We introduce RAP-SR, a restoration prior enhancement approach in pretrained diffusion models for Real-SR. First, we develop the High-Fidelity Aesthetic Image Dataset (HFAID), curated through a Quality-Driven Aesthetic Image Selection Pipeline (QDAISP). Second, we propose the Restoration Priors Enhancement Framework, which includes Restoration Priors Refinement (RPR) and Restoration-Oriented Prompt Optimization (ROPO) modules.
arXiv Detail & Related papers (2024-12-10T03:17:38Z) - Effective Diffusion Transformer Architecture for Image Super-Resolution [63.254644431016345]
We design an effective diffusion transformer for image super-resolution (DiT-SR).
In practice, DiT-SR leverages an overall U-shaped architecture, and adopts a uniform isotropic design for all the transformer blocks.
We analyze the limitation of the widely used AdaLN, and present a frequency-adaptive time-step conditioning module.
arXiv Detail & Related papers (2024-09-29T07:14:16Z) - Taming Generative Diffusion Prior for Universal Blind Image Restoration [4.106012295148947]
BIR-D is able to perform multi-guidance blind image restoration.
It can also restore images that have undergone multiple and complicated degradations, demonstrating its practical applicability.
arXiv Detail & Related papers (2024-08-21T02:19:54Z) - Efficient Degradation-aware Any Image Restoration [83.92870105933679]
We propose DaAIR, an efficient All-in-One image restorer employing a Degradation-aware Learner (DaLe) in the low-rank regime.
By dynamically allocating model capacity to input degradations, we realize an efficient restorer integrating holistic and specific learning.
arXiv Detail & Related papers (2024-05-24T11:53:27Z) - Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from time-consuming inference, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)
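For the Multi-Step Guided Diffusion entry above, the core mechanism is taking several guidance gradient updates within each denoising timestep instead of one. The sketch below is a simplified, hypothetical illustration of that loop; the function and argument names (`guided_denoising_step`, `degrade_op`) are assumptions, and a real sampler would combine the guided estimate with the noise schedule to form the next noisy sample rather than returning the clean estimate directly.

```python
# Hypothetical sketch: multiple data-consistency gradient updates per denoising step.
import torch


def guided_denoising_step(x_t, t, denoiser, degraded, degrade_op,
                          num_guidance_steps=3, lr=0.1):
    """One reverse-diffusion step with several measurement-consistency updates."""
    # Predict the clean-image estimate x0 from the noisy sample x_t.
    x0 = denoiser(x_t, t)
    for _ in range(num_guidance_steps):
        x0 = x0.detach().requires_grad_(True)
        # Re-degrade the estimate and compare it to the observed degraded image.
        loss = torch.nn.functional.mse_loss(degrade_op(x0), degraded)
        grad, = torch.autograd.grad(loss, x0)
        x0 = (x0 - lr * grad).detach()
    # A full sampler would now form x_{t-1} from x0, x_t, and the noise schedule.
    return x0
```

The paper's reported trade-off (better LPIPS/PSNR with minimal latency overhead) corresponds here to raising `num_guidance_steps`, since each extra update costs only a gradient through the degradation operator, not another pass through the denoiser.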