Global Modeling Matters: A Fast, Lightweight and Effective Baseline for Efficient Image Restoration
- URL: http://arxiv.org/abs/2507.13663v1
- Date: Fri, 18 Jul 2025 05:15:04 GMT
- Title: Global Modeling Matters: A Fast, Lightweight and Effective Baseline for Efficient Image Restoration
- Authors: Xingyu Jiang, Ning Gao, Hongkun Dou, Xiuhui Zhang, Xiaoqing Zhong, Yue Deng, Hongjue Li,
- Abstract summary: Pyramid Wavelet-Fourier Network (PW-FNet) is an efficient baseline for image restoration. PW-FNet features a multi-input multi-output structure to achieve multi-scale and multi-frequency band decomposition. Experiments on tasks such as image deraining, raindrop removal, image super-resolution, motion deblurring, image dehazing and underwater/low-light enhancement demonstrate that PW-FNet not only surpasses state-of-the-art methods in restoration quality but also achieves superior efficiency.
- Score: 9.2933763571933
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Natural image quality is often degraded by adverse weather conditions, significantly impairing the performance of downstream tasks. Image restoration has emerged as a core solution to this challenge and has been widely discussed in the literature. Although recent transformer-based approaches have made remarkable progress in image restoration, their increasing system complexity poses significant challenges for real-time processing, particularly in real-world deployment scenarios. To address this, most existing methods attempt to simplify the self-attention mechanism, for example via channel self-attention or state space models. However, these methods primarily focus on network architecture while neglecting the inherent characteristics of image restoration itself. In this context, we explore a pyramid Wavelet-Fourier iterative pipeline to demonstrate the potential of Wavelet-Fourier processing for image restoration. Inspired by these findings, we propose a novel and efficient restoration baseline, named Pyramid Wavelet-Fourier Network (PW-FNet). Specifically, PW-FNet follows two key design principles: 1) at the inter-block level, it integrates a pyramid wavelet-based multi-input multi-output structure to achieve multi-scale and multi-frequency band decomposition; and 2) at the intra-block level, it incorporates Fourier transforms as an efficient alternative to self-attention, reducing computational complexity while preserving global modeling capability. Extensive experiments on tasks such as image deraining, raindrop removal, image super-resolution, motion deblurring, image dehazing, image desnowing and underwater/low-light enhancement demonstrate that PW-FNet not only surpasses state-of-the-art methods in restoration quality but also achieves superior efficiency, with significantly reduced parameter size, computational cost and inference time.
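The abstract's two design principles can be illustrated with a short, self-contained sketch. The snippet below is an assumed, minimal PyTorch rendering of the general ideas (a Haar wavelet decomposition for multi-scale/multi-band inputs, and FFT-based global mixing in place of self-attention), not the authors' released code; all module and parameter names are illustrative.

```python
# Minimal sketch of the two ideas described in the abstract, not the authors' code:
# (1) a one-level Haar wavelet decomposition that splits a feature map into
#     low/high-frequency sub-bands for a pyramid-style multi-scale input, and
# (2) an FFT-based block that mixes information globally in the frequency
#     domain instead of using self-attention.
import torch
import torch.fft
import torch.nn as nn


def haar_dwt(x: torch.Tensor):
    """One level of a Haar wavelet transform on an (N, C, H, W) tensor.

    Assumes even H and W. Returns the low-frequency band LL and the stacked
    high-frequency bands (LH, HL, HH), each at half spatial resolution.
    """
    a = x[:, :, 0::2, 0::2]
    b = x[:, :, 0::2, 1::2]
    c = x[:, :, 1::2, 0::2]
    d = x[:, :, 1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (a - b + c - d) / 2
    hl = (a + b - c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, torch.cat([lh, hl, hh], dim=1)


class FourierMixer(nn.Module):
    """Global mixing via a learnable pointwise filter in the Fourier domain.

    A rough stand-in for self-attention: the FFT couples every spatial
    location with every other one at roughly O(HW log HW) cost.
    """

    def __init__(self, channels: int):
        super().__init__()
        # Pointwise convolutions applied to the real/imaginary parts of the spectrum.
        self.filter = nn.Sequential(
            nn.Conv2d(2 * channels, 2 * channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(2 * channels, 2 * channels, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        spec = torch.fft.rfft2(x, norm="ortho")               # (N, C, H, W//2+1), complex
        feat = torch.cat([spec.real, spec.imag], dim=1)       # split into real channels
        feat = self.filter(feat)
        real, imag = feat.chunk(2, dim=1)
        spec = torch.complex(real, imag)
        out = torch.fft.irfft2(spec, s=(h, w), norm="ortho")  # back to the spatial domain
        return x + out                                         # residual connection


if __name__ == "__main__":
    feats = torch.randn(1, 16, 64, 64)
    ll, high = haar_dwt(feats)          # multi-scale / multi-band decomposition
    mixed = FourierMixer(16)(feats)     # global mixing without self-attention
    print(ll.shape, high.shape, mixed.shape)
```

The point of the Fourier branch is that a pointwise operation on the spectrum couples all spatial positions at once, giving a global receptive field at roughly O(HW log HW) cost instead of the O((HW)^2) cost of full self-attention.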
Related papers
- Frequency-Domain Fusion Transformer for Image Inpainting [6.4194162137514725]
This paper proposes a Transformer-based image inpainting method incorporating frequency-domain fusion. Experimental results demonstrate that the proposed method effectively improves the quality of image inpainting by preserving more high-frequency information.
arXiv Detail & Related papers (2025-06-23T09:19:04Z)
- Breaking Complexity Barriers: High-Resolution Image Restoration with Rank Enhanced Linear Attention [54.42902794496325]
Linear attention, a variant of softmax attention, demonstrates promise in global context modeling. We propose Rank Enhanced Linear Attention (RELA), a simple yet effective method that enriches feature representations by integrating a lightweight depthwise convolution. Building upon RELA, we propose an efficient and effective image restoration Transformer, named LAformer.
arXiv Detail & Related papers (2025-05-22T02:57:23Z)
- Image Restoration via Multi-domain Learning [8.909636477353695]
We introduce a novel restoration framework, which integrates multi-domain learning into Transformer. Specifically, in the Token Mixer, we propose a Spatial-Wavelet-Fourier multi-domain structure that facilitates local-region-global multi-receptive field modeling. In the Feed-Forward Network, we incorporate multi-scale learning to fuse multi-domain features at different resolutions.
arXiv Detail & Related papers (2025-05-07T04:14:51Z)
- Joint multi-dimensional dynamic attention and transformer for general image restoration [14.987034136856463]
Outdoor images often suffer from severe degradation due to rain, haze, and noise.
Current image restoration methods struggle to handle complex degradation while maintaining efficiency.
This paper introduces a novel image restoration architecture that combines multi-dimensional dynamic attention and self-attention.
arXiv Detail & Related papers (2024-11-12T15:58:09Z)
- Multi-Scale Representation Learning for Image Restoration with State-Space Model [13.622411683295686]
We propose a novel Multi-Scale State-Space Model-based network (MS-Mamba) for efficient image restoration.
Our proposed method achieves new state-of-the-art performance while maintaining low computational complexity.
arXiv Detail & Related papers (2024-08-19T16:42:58Z)
- WaveletFormerNet: A Transformer-based Wavelet Network for Real-world Non-homogeneous and Dense Fog Removal [11.757602977709517]
This paper proposes a Transformer-based wavelet network (WaveletFormerNet) for real-world foggy image recovery.
We introduce parallel convolution in the Transformer block, which allows for the capture of multi-frequency information in a lightweight mechanism.
Our experiments demonstrate that our WaveletFormerNet performs better than state-of-the-art methods.
arXiv Detail & Related papers (2024-01-09T13:42:21Z)
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Gated Multi-Resolution Transfer Network for Burst Restoration and Enhancement [75.25451566988565]
We propose a novel Gated Multi-Resolution Transfer Network (GMTNet) to reconstruct a spatially precise high-quality image from a burst of low-quality raw images.
Detailed experimental analysis on five datasets validates our approach and sets a state-of-the-art for burst super-resolution, burst denoising, and low-light burst enhancement.
arXiv Detail & Related papers (2023-04-13T17:54:00Z)
- Restormer: Efficient Transformer for High-Resolution Image Restoration [118.9617735769827]
Convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data.
Transformers have shown significant performance gains on natural language and high-level vision tasks.
Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks.
arXiv Detail & Related papers (2021-11-18T18:59:10Z)
- Fourier Space Losses for Efficient Perceptual Image Super-Resolution [131.50099891772598]
We show that it is possible to improve the performance of a recently introduced efficient generator architecture solely with the application of our proposed loss functions.
We show that our losses' direct emphasis on frequencies in Fourier space significantly boosts perceptual image quality (a minimal sketch of such a frequency-domain loss follows this entry). The trained generator achieves results comparable to the state-of-the-art perceptual SR methods RankSRGAN and SRFlow while being 2.4x and 48x faster, respectively.
arXiv Detail & Related papers (2021-06-01T20:34:52Z)
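As a rough illustration of the Fourier-space loss idea in the last entry, the sketch below compares the restored image and the ground truth in the frequency domain and adds that term to a pixel-space L1 loss. The formulation and the weighting factor are assumptions for illustration, not the cited paper's exact losses.

```python
# Illustrative sketch of a Fourier-domain restoration loss: an L1-style distance
# between the complex spectra of prediction and target, added to a pixel-space
# term. An assumed simplification, not the exact losses of the cited paper.
import torch
import torch.fft
import torch.nn.functional as F


def fourier_l1_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Mean magnitude of the complex spectral difference of two (N, C, H, W) images."""
    pred_spec = torch.fft.rfft2(pred, norm="ortho")
    target_spec = torch.fft.rfft2(target, norm="ortho")
    # Penalizing the complex difference constrains both amplitude and phase.
    return (pred_spec - target_spec).abs().mean()


def restoration_loss(pred: torch.Tensor, target: torch.Tensor, freq_weight: float = 0.1) -> torch.Tensor:
    """Pixel-space L1 plus a weighted frequency-space term (weight is an assumption)."""
    return F.l1_loss(pred, target) + freq_weight * fourier_l1_loss(pred, target)
```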