RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration
- URL: http://arxiv.org/abs/2505.18047v2
- Date: Sat, 25 Oct 2025 22:06:16 GMT
- Title: RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration
- Authors: Sudarshan Rajagopalan, Kartik Narayan, Vishal M. Patel
- Abstract summary: Latent diffusion models (LDMs) have improved the perceptual quality of All-in-One image Restoration (AiOR) methods. LDMs suffer from slow inference due to their iterative denoising process, rendering them impractical for time-sensitive applications. Visual autoregressive modeling (VAR) performs scale-space autoregression and achieves comparable performance to that of state-of-the-art diffusion transformers.
- Score: 51.77917733024544
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of latent diffusion models (LDMs) such as Stable Diffusion has significantly improved the perceptual quality of All-in-One image Restoration (AiOR) methods, while also enhancing their generalization capabilities. However, these LDM-based frameworks suffer from slow inference due to their iterative denoising process, rendering them impractical for time-sensitive applications. Visual autoregressive modeling (VAR), a recently introduced approach for image generation, performs scale-space autoregression and achieves comparable performance to that of state-of-the-art diffusion transformers with drastically reduced computational costs. Moreover, our analysis reveals that coarse scales in VAR primarily capture degradations while finer scales encode scene detail, simplifying the restoration process. Motivated by this, we propose RestoreVAR, a novel VAR-based generative approach for AiOR that significantly outperforms LDM-based models in restoration performance while achieving over $10\times$ faster inference. To optimally exploit the advantages of VAR for AiOR, we propose architectural modifications and improvements, including intricately designed cross-attention mechanisms and a latent-space refinement module, tailored for the AiOR task. Extensive experiments show that RestoreVAR achieves state-of-the-art performance among generative AiOR methods, while also exhibiting strong generalization capabilities.
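The coarse-to-fine next-scale prediction that the abstract attributes to VAR can be illustrated with a minimal sketch. Note that `predict_scale` and its nearest-neighbour conditioning scheme are hypothetical stand-ins for the VAR transformer and are not the RestoreVAR implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_scale(prev, size):
    # Hypothetical stand-in for the VAR transformer: upsample the coarser
    # token map (nearest-neighbour via a Kronecker product) and add the
    # residual detail predicted for the new scale (random noise here).
    factor = size // prev.shape[0]
    up = np.kron(prev, np.ones((factor, factor)))
    return up + 0.1 * rng.standard_normal((size, size))

def next_scale_generation(scales=(1, 2, 4, 8)):
    # Coarsest scale first; each finer scale is produced in a single
    # forward pass conditioned on the coarser scales. Unlike diffusion,
    # there is no iterative denoising loop, which is the source of the
    # speedup the abstract describes.
    tokens = rng.standard_normal((scales[0], scales[0]))
    for size in scales[1:]:
        tokens = predict_scale(tokens, size)
    return tokens

restored = next_scale_generation()
print(restored.shape)  # (8, 8)
```

Under this view, the paper's observation that coarse scales capture degradations while fine scales encode scene detail suggests restoration can focus its conditioning on the early, cheap steps of the loop.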
Related papers
- DreamVAR: Taming Reinforced Visual Autoregressive Model for High-Fidelity Subject-Driven Image Generation [108.71044040025374]
We present a novel framework for subject-driven image synthesis built upon a Visual Autoregressive model that employs next-scale prediction. We show that DreamVAR achieves superior appearance preservation compared to leading diffusion-based methods.
arXiv Detail & Related papers (2026-01-30T03:32:29Z)
- Scale-Wise VAR is Secretly Discrete Diffusion [48.994983608261286]
Next-scale prediction Visual Autoregressive Generation (VAR) has recently demonstrated remarkable performance, even surpassing diffusion-based models. In this work, we revisit VAR and uncover a theoretical insight: when equipped with a Markovian attention mask, VAR is mathematically equivalent to a discrete diffusion process. We show how one can directly import the advantages of diffusion, such as iterative refinement, into VAR and reduce architectural inefficiencies, yielding faster convergence, lower inference cost, and improved zero-shot reconstruction.
arXiv Detail & Related papers (2025-09-26T17:58:04Z)
- TinySR: Pruning Diffusion for Real-World Image Super-Resolution [35.07163534857897]
We present TinySR, a compact yet effective diffusion model specifically designed for Real-ISR. TinySR significantly reduces computational cost and model size, achieving up to 5.68x speedup and 83% parameter reduction compared to its teacher, TSD-SR.
arXiv Detail & Related papers (2025-08-24T16:17:33Z)
- AEDR: Training-Free AI-Generated Image Attribution via Autoencoder Double-Reconstruction [25.525545133210805]
AEDR (AutoEncoder Double-Reconstruction) is a training-free attribution method designed for generative models with continuous autoencoders. It achieves 25.5% higher attribution accuracy than existing reconstruction-based methods, while requiring only 1% of the computational time.
arXiv Detail & Related papers (2025-07-25T06:34:58Z)
- Multi-Step Guided Diffusion for Image Restoration on Edge Devices: Toward Lightweight Perception in Embodied AI [0.0]
We introduce a multi-step optimization strategy within each denoising timestep, significantly enhancing image quality, perceptual accuracy, and generalization. Our experiments on super-resolution and Gaussian deblurring demonstrate that increasing the number of gradient updates per step improves LPIPS and PSNR with minimal latency overhead. Our findings highlight MPGD's potential as a lightweight, plug-and-play restoration module for real-time visual perception in embodied AI agents such as drones and mobile robots.
arXiv Detail & Related papers (2025-06-08T21:11:25Z)
- Enhancing Variational Autoencoders with Smooth Robust Latent Encoding [54.74721202894622]
Variational Autoencoders (VAEs) have played a key role in scaling up diffusion-based generative models. We introduce Smooth Robust Latent VAE (SRL-VAE), a novel adversarial training framework that boosts both generation quality and robustness. Experiments show that SRL-VAE improves both generation quality, in image reconstruction and text-guided image editing, and robustness, against Nightshade attacks and image-editing attacks.
arXiv Detail & Related papers (2025-04-24T03:17:57Z)
- ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration [75.0053551643052]
We introduce ZipIR, a novel framework that enhances efficiency, scalability, and long-range modeling for high-resolution image restoration. ZipIR employs a highly compressed latent representation that compresses the image 32x, effectively reducing the number of spatial tokens. ZipIR surpasses existing diffusion-based methods, offering unmatched speed and quality in restoring high-resolution images from severely degraded inputs.
arXiv Detail & Related papers (2025-04-11T14:49:52Z)
- One-Step Diffusion Model for Image Motion-Deblurring [85.76149042561507]
We propose a one-step diffusion model for deblurring (OSDD), a novel framework that reduces the denoising process to a single step. To tackle fidelity loss in diffusion models, we introduce an enhanced variational autoencoder (eVAE), which improves structural restoration. Our method achieves strong performance on both full- and no-reference metrics.
arXiv Detail & Related papers (2025-03-09T09:39:57Z)
- Navigating Image Restoration with VAR's Distribution Alignment Prior [6.0648320320309885]
VAR, a novel image generative paradigm, surpasses diffusion models in generation quality by applying a next-scale prediction approach. We formulate the multi-scale latent representations within VAR as the restoration prior, which underpins our carefully designed VarFormer framework.
arXiv Detail & Related papers (2024-12-30T16:32:55Z)
- RAP-SR: RestorAtion Prior Enhancement in Diffusion Models for Realistic Image Super-Resolution [36.137383171027615]
We introduce RAP-SR, a restoration prior enhancement approach in pretrained diffusion models for Real-SR. First, we develop the High-Fidelity Aesthetic Image Dataset (HFAID), curated through a Quality-Driven Aesthetic Image Selection Pipeline (QDAISP). Second, we propose the Restoration Priors Enhancement Framework, which includes Restoration Priors Refinement (RPR) and Restoration-Oriented Prompt Optimization (ROPO) modules.
arXiv Detail & Related papers (2024-12-10T03:17:38Z)
- Effective Diffusion Transformer Architecture for Image Super-Resolution [63.254644431016345]
We design an effective diffusion transformer for image super-resolution (DiT-SR).
In practice, DiT-SR leverages an overall U-shaped architecture, and adopts a uniform isotropic design for all the transformer blocks.
We analyze the limitation of the widely used AdaLN, and present a frequency-adaptive time-step conditioning module.
arXiv Detail & Related papers (2024-09-29T07:14:16Z)
- Taming Generative Diffusion Prior for Universal Blind Image Restoration [4.106012295148947]
BIR-D is able to perform multi-guidance blind image restoration. It can also restore images that undergo multiple, complicated degradations, demonstrating its practical applicability.
arXiv Detail & Related papers (2024-08-21T02:19:54Z)
- Efficient Degradation-aware Any Image Restoration [83.92870105933679]
We propose DaAIR, an efficient All-in-One image restorer employing a Degradation-aware Learner (DaLe) in the low-rank regime.
By dynamically allocating model capacity to input degradations, we realize an efficient restorer integrating holistic and specific learning.
arXiv Detail & Related papers (2024-05-24T11:53:27Z)
- Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from time-consuming inference, excessive computational resource consumption, and unstable restoration quality.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.