Related papers: WaveHiT-SR: Hierarchical Wavelet Network for Efficient Image Super-Resolution

URL: http://arxiv.org/abs/2508.19927v1
Date: Wed, 27 Aug 2025 14:37:50 GMT
Title: WaveHiT-SR: Hierarchical Wavelet Network for Efficient Image Super-Resolution
Authors: Fayaz Ali, Muhammad Zawish, Steven Davy, Radu Timofte
Abstract summary: We propose a new approach, called WaveHiT-SR, that embeds the wavelet transform within a hierarchical transformer framework. By progressively reconstructing high-resolution images through hierarchical processing, the network reduces computational complexity without sacrificing performance. Our refined versions of SwinIR-Light, SwinIR-NG, and SRFormer-Light deliver cutting-edge SR results, achieving higher efficiency with fewer parameters, lower FLOPs, and faster speeds.
Score: 44.55918322585521
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Transformers have demonstrated promising performance in computer vision tasks, including image super-resolution (SR). The quadratic computational complexity of window self-attention mechanisms in many transformer-based SR methods forces the use of small, fixed windows, limiting the receptive field. In this paper, we propose a new approach, called WaveHiT-SR, that embeds the wavelet transform within a hierarchical transformer framework. First, using adaptive hierarchical windows instead of static small windows allows the model to capture features across different levels and greatly improves its ability to model long-range dependencies. Second, the proposed model utilizes wavelet transforms to decompose images into multiple frequency subbands, allowing the network to focus on both global and local features while preserving structural details. By progressively reconstructing high-resolution images through hierarchical processing, the network reduces computational complexity without sacrificing performance. The multi-level decomposition strategy enables the network to capture fine-grained information in low-frequency components while enhancing high-frequency textures. Through extensive experimentation, we confirm the effectiveness and efficiency of our WaveHiT-SR. Our refined versions of SwinIR-Light, SwinIR-NG, and SRFormer-Light deliver cutting-edge SR results, achieving higher efficiency with fewer parameters, lower FLOPs, and faster speeds.
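
The subband decomposition mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which wavelet WaveHiT-SR uses, so the Haar wavelet here is an assumption chosen for simplicity, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical. This is not the authors' implementation; it only shows how one decomposition level splits an image into a low-frequency approximation and three detail subbands, and that the split is invertible.

```python
def haar_dwt2(image):
    """One level of a 2D orthonormal Haar transform (illustrative, not the paper's code).

    `image` is a list of rows with even height and width.
    Returns four half-resolution subbands: approximation, row-difference detail,
    column-difference detail, and diagonal detail.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of one 2x2 block.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((a + b + d + e) / 2)  # low-frequency content
            r_row.append((a + b - d - e) / 2)  # top row minus bottom row
            c_row.append((a - b + d - e) / 2)  # left column minus right column
            d_row.append((a - b - d + e) / 2)  # diagonal difference
        approx.append(a_row)
        row_diff.append(r_row)
        col_diff.append(c_row)
        diag.append(d_row)
    return approx, row_diff, col_diff, diag


def haar_idwt2(approx, row_diff, col_diff, diag):
    """Invert haar_dwt2, rebuilding the full-resolution image."""
    image = []
    for i in range(len(approx)):
        top, bottom = [], []
        for j in range(len(approx[0])):
            ll, rd = approx[i][j], row_diff[i][j]
            cd, dd = col_diff[i][j], diag[i][j]
            top.extend([(ll + rd + cd + dd) / 2, (ll + rd - cd - dd) / 2])
            bottom.extend([(ll - rd + cd - dd) / 2, (ll - rd - cd + dd) / 2])
        image.append(top)
        image.append(bottom)
    return image


if __name__ == "__main__":
    img = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]
    subbands = haar_dwt2(img)
    restored = haar_idwt2(*subbands)
    print(restored == [[float(v) for v in row] for row in img])
```

A multi-level decomposition, as the abstract describes, would apply `haar_dwt2` again to the approximation subband. Each level halves the spatial resolution, which is one plausible reason such a scheme can reduce the cost of attention over the low-frequency branch.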

Related papers

FADPNet: Frequency-Aware Dual-Path Network for Face Super-Resolution [70.61549422952193]
Face super-resolution (FSR) under limited computational costs remains an open problem. Existing approaches typically treat all facial pixels equally, resulting in suboptimal allocation of computational resources. We propose FADPNet, a Frequency-Aware Dual-Path Network that decomposes facial features into low- and high-frequency components.
arXiv Detail & Related papers (2025-06-17T02:33:42Z)
Dual-domain Modulation Network for Lightweight Image Super-Resolution [26.992373105057684]
Lightweight image super-resolution (SR) aims to reconstruct high-resolution images from low-resolution images under limited computational costs. Existing frequency-based SR methods cannot balance the reconstruction of overall structures and high-frequency parts. We show that introducing both wavelet and Fourier information allows our model to consider both high-frequency features and overall SR structure reconstruction while reducing costs.
arXiv Detail & Related papers (2025-03-13T04:59:46Z)
HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution [70.52256118833583]
We present a strategy to convert transformer-based SR networks to hierarchical transformers (HiT-SR). Specifically, we first replace the commonly used fixed small windows with expanding hierarchical windows to aggregate features at different scales. Considering the intensive computation required for large windows, we further design a spatial-channel correlation method with linear complexity to window sizes.
arXiv Detail & Related papers (2024-07-08T12:42:10Z)
CFAT: Unleashing TriangularWindows for Image Super-resolution [5.130320840059732]
Transformer-based models have revolutionized the field of image super-resolution (SR). We propose a non-overlapping triangular window technique that works synchronously with the rectangular one to mitigate boundary-level distortion. Our proposed model shows a significant 0.7 dB performance improvement over other state-of-the-art SR architectures.
arXiv Detail & Related papers (2024-03-24T13:31:31Z)
WaveMixSR: A Resource-efficient Neural Network for Image Super-resolution [2.0477182014909205]
We propose a new neural network -- WaveMixSR -- for image super-resolution based on WaveMix architecture. WaveMixSR achieves competitive performance in all datasets and reaches state-of-the-art performance in the BSD100 dataset on multiple super-resolution tasks.
arXiv Detail & Related papers (2023-07-01T21:25:03Z)
Hybrid Pixel-Unshuffled Network for Lightweight Image Super-Resolution [64.54162195322246]
Convolutional neural networks (CNNs) have achieved great success in image super-resolution (SR). Most deep CNN-based SR models require massive computation to obtain high performance. We propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by introducing an efficient and effective downsampling module into the SR task.
arXiv Detail & Related papers (2022-03-16T20:10:41Z)
Asymmetric CNN for image super-resolution [102.96131810686231]
Deep convolutional neural networks (CNNs) have been widely applied for low-level vision over the past five years. We propose an asymmetric CNN (ACNet) comprising an asymmetric block (AB), a memory enhancement block (MEB) and a high-frequency feature enhancement block (HFFEB) for image super-resolution. Our ACNet can effectively address single image super-resolution (SISR), blind SISR and blind SISR of blind noise problems.
arXiv Detail & Related papers (2021-03-25T07:10:46Z)
Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain. In practice, the high-frequency part will be processed using expensive operations and the lower-frequency part is assigned with cheap operations to relieve the computation burden. Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.