UHD Image Dehazing via anDehazeFormer with Atmospheric-aware KV Cache
- URL: http://arxiv.org/abs/2505.14010v1
- Date: Tue, 20 May 2025 07:04:34 GMT
- Title: UHD Image Dehazing via anDehazeFormer with Atmospheric-aware KV Cache
- Authors: Pu Wang, Pengwen Dai, Chen Wu, Yeying Jin, Dianjie Lu, Guijuan Zhang, Youshan Zhang, Zhuoran Zheng
- Abstract summary: We propose an efficient visual transformer framework for ultra-high-definition (UHD) image dehazing. The proposed architecture improves training convergence speed by 5x while reducing memory overhead. Our approach maintains state-of-the-art dehazing quality while significantly improving computational efficiency for 4K/8K image restoration tasks.
- Score: 22.67146255766633
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose an efficient visual transformer framework for ultra-high-definition (UHD) image dehazing that addresses the key challenges of slow training speed and high memory consumption in existing methods. Our approach introduces two key innovations: 1) an \textbf{a}daptive \textbf{n}ormalization mechanism, inspired by the nGPT architecture, that enables ultra-fast and stable training by restricting the network's range of parameter expressions; and 2) an atmospheric scattering-aware KV caching mechanism that dynamically optimizes feature preservation based on the physical haze formation model. The proposed architecture improves training convergence speed by \textbf{5$\times$} while reducing memory overhead, enabling real-time processing of 50 high-resolution images per second on an RTX 4090 GPU. Experimental results show that our approach maintains state-of-the-art dehazing quality while significantly improving computational efficiency for 4K/8K image restoration tasks. Furthermore, we provide a new interpretability method for dehazing based on an integrated gradient attribution map. Our code can be found here: https://anonymous.4open.science/r/anDehazeFormer-632E/README.md.
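The "physical haze formation model" the abstract conditions its KV cache on is the standard atmospheric scattering model, I(x) = J(x)·t(x) + A·(1 − t(x)) with transmission t(x) = exp(−β·d(x)). The paper's actual caching logic is not reproduced here; the sketch below only illustrates this physical model and its inversion, with all function names and parameter values being illustrative assumptions:

```python
# Minimal sketch of the atmospheric scattering haze model (not the paper's
# KV-cache implementation): I = J*t + A*(1 - t), where t = exp(-beta * d).
import numpy as np

def haze(J, d, A=0.9, beta=1.0):
    """Synthesize a hazy image I from clean radiance J and a depth map d."""
    t = np.exp(-beta * d)[..., None]          # transmission, broadcast over RGB
    return J * t + A * (1.0 - t)

def dehaze(I, d, A=0.9, beta=1.0, t_min=0.1):
    """Invert the model: recover J from I given depth and airlight A.
    t is clipped from below to avoid amplifying noise where haze is dense."""
    t = np.clip(np.exp(-beta * d)[..., None], t_min, 1.0)
    return (I - A) / t + A

rng = np.random.default_rng(0)
J = rng.random((4, 4, 3))                     # toy "clean" image in [0, 1]
d = np.full((4, 4), 0.5)                      # constant scene depth
I = haze(J, d)                                # synthetic hazy image
J_hat = dehaze(I, d)                          # exact inversion (t > t_min here)
assert np.allclose(J, J_hat)
```

With known depth and airlight the model is exactly invertible wherever t stays above the clipping floor; real dehazing methods must instead estimate t and A from the hazy image alone.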
Related papers
- Learning Unpaired Image Dehazing with Physics-based Rehazy Generation [50.37414006427923]
Overfitting to synthetic training pairs remains a critical challenge in image dehazing. We propose a novel training strategy for unpaired image dehazing, termed Rehazy, to improve both dehazing performance and training stability.
arXiv Detail & Related papers (2025-06-15T12:12:28Z)
- ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration [75.0053551643052]
We introduce ZipIR, a novel framework that enhances efficiency, scalability, and long-range modeling for high-resolution image restoration. ZipIR employs a highly compressed latent representation that compresses the image 32x, effectively reducing the number of spatial tokens. ZipIR surpasses existing diffusion-based methods, offering unmatched speed and quality in restoring high-resolution images from severely degraded inputs.
arXiv Detail & Related papers (2025-04-11T14:49:52Z)
- Striving for Faster and Better: A One-Layer Architecture with Auto Re-parameterization for Low-Light Image Enhancement [50.93686436282772]
We aim to delve into the limits of image enhancers in terms of both visual quality and computational efficiency. By rethinking the task demands, we build an explicit connection: visual quality and computational efficiency correspond to model learning and structure design, respectively. Ultimately, this achieves efficient low-light image enhancement using only a single convolutional layer, while maintaining excellent visual quality.
arXiv Detail & Related papers (2025-02-27T08:20:03Z)
- Directing Mamba to Complex Textures: An Efficient Texture-Aware State Space Model for Image Restoration [75.51789992466183]
TAMambaIR simultaneously perceives image textures and achieves a trade-off between performance and efficiency. Extensive experiments on benchmarks for image super-resolution, deraining, and low-light image enhancement demonstrate that TAMambaIR achieves state-of-the-art performance with significantly improved efficiency.
arXiv Detail & Related papers (2025-01-27T23:53:49Z)
- VISION-XL: High Definition Video Inverse Problem Solver using Latent Image Diffusion Models [58.464465016269614]
We propose a novel framework for solving high-definition video inverse problems using latent image diffusion models. Our approach delivers HD-resolution reconstructions in under 6 seconds per frame on a single NVIDIA 4090 GPU.
arXiv Detail & Related papers (2024-11-29T08:10:49Z)
- Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient [52.96232442322824]
Collaborative Decoding (CoDe) is a novel efficient decoding strategy tailored for the Visual Auto-Regressive (VAR) framework. CoDe capitalizes on two critical observations: the substantially reduced parameter demands at larger scales and the exclusive generation patterns across different scales. CoDe achieves a 1.7x speedup, slashes memory usage by around 50%, and preserves image quality with only a negligible FID increase from 1.95 to 1.98.
arXiv Detail & Related papers (2024-11-26T15:13:15Z)
- TSFormer: A Robust Framework for Efficient UHD Image Restoration [7.487270862599671]
TSFormer is an all-in-one framework that integrates Trusted learning with Sparsification.
Our model can process a 4K image in real time (40 fps) with 3.38M parameters.
arXiv Detail & Related papers (2024-11-17T03:34:27Z)
- Towards Ultra-High-Definition Image Deraining: A Benchmark and An Efficient Method [42.331058889312466]
This paper contributes the first large-scale UHD image deraining dataset, 4K-Rain13k, that contains 13,000 image pairs at 4K resolution.
We develop an effective and efficient vision-based architecture (UDR-Mixer) to better solve this task.
arXiv Detail & Related papers (2024-05-27T11:45:08Z)
- DehazeDCT: Towards Effective Non-Homogeneous Dehazing via Deformable Convolutional Transformer [43.807338032286346]
We introduce an innovative non-homogeneous dehazing method via a Deformable Convolutional Transformer-like architecture (DehazeDCT).
We first design a transformer-like network based on deformable convolution v4, which offers long-range dependency and adaptive spatial aggregation capabilities.
Furthermore, we leverage a lightweight Retinex-inspired transformer to achieve color correction and structure refinement.
arXiv Detail & Related papers (2024-05-24T10:59:18Z)
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
- Dual-former: Hybrid Self-attention Transformer for Efficient Image Restoration [6.611849560359801]
We present Dual-former, which combines the powerful global modeling ability of self-attention modules and the local modeling ability of convolutions in an overall architecture.
Experiments demonstrate that Dual-former achieves a 1.91dB gain over the state-of-the-art MAXIM method on the Indoor dataset for single image dehazing.
For single image deraining, it exceeds the SOTA method by 0.1 dB PSNR averaged over five datasets, with only 21.5% of the GFLOPs.
arXiv Detail & Related papers (2022-10-03T16:39:21Z)
- Towards Fast and Light-Weight Restoration of Dark Images [26.779714419544085]
We show that we can enhance a full-resolution (2848 x 4256), extremely dark single image in roughly 3 seconds, even on a CPU.
We achieve this with 2-7x fewer model parameters, 2-3x lower memory utilization, and a 5-20x speed-up, while maintaining competitive image reconstruction quality.
arXiv Detail & Related papers (2020-11-28T13:53:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.