GRFormer: Grouped Residual Self-Attention for Lightweight Single Image Super-Resolution
- URL: http://arxiv.org/abs/2408.07484v1
- Date: Wed, 14 Aug 2024 11:56:35 GMT
- Title: GRFormer: Grouped Residual Self-Attention for Lightweight Single Image Super-Resolution
- Authors: Yuzhen Li, Zehang Deng, Yuxin Cao, Lihua Liu
- Abstract summary: Grouped Residual Self-Attention (GRSA) is specifically oriented towards two fundamental components.
ES-RPB is a substitute for the original relative position bias to improve the ability to represent position information.
Experiments demonstrate GRFormer outperforms state-of-the-art transformer-based methods for ×2, ×3 and ×4 SISR tasks.
- Score: 2.312414367096445
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous works have shown that reducing parameter overhead and computations for transformer-based single image super-resolution (SISR) models (e.g., SwinIR) usually leads to a reduction of performance. In this paper, we present GRFormer, an efficient and lightweight method, which not only reduces the parameter overhead and computations, but also greatly improves performance. The core of GRFormer is Grouped Residual Self-Attention (GRSA), which is specifically oriented towards two fundamental components. Firstly, it introduces a novel grouped residual layer (GRL) to replace the Query, Key, Value (QKV) linear layer in self-attention, aimed at efficiently reducing parameter overhead, computations, and performance loss at the same time. Secondly, it integrates a compact Exponential-Space Relative Position Bias (ES-RPB) as a substitute for the original relative position bias to improve the ability to represent position information while further minimizing the parameter count. Extensive experimental results demonstrate that GRFormer outperforms state-of-the-art transformer-based methods for ×2, ×3 and ×4 SISR tasks, notably outperforming SOTA by a maximum PSNR of 0.23dB when trained on the DIV2K dataset, while reducing the number of parameters and MACs in the self-attention module alone by about 60% and 49%, respectively. We hope that our simple and effective method, which can be easily applied to SR models based on window-division self-attention, can serve as a useful tool for further research in image super-resolution. The code is available at https://github.com/sisrformer/GRFormer.
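The abstract describes the GRL as a grouped replacement for the dense QKV linear layer, combined with a residual connection to limit performance loss. The paper's exact layer is not reproduced here; the following is a minimal sketch of that idea, in which the group count, the identity residual, and the use of a grouped 1x1 convolution are all assumptions for illustration. With 4 groups, the projection's weight count drops to a quarter of the dense layer's, roughly in line with the ~60% parameter reduction the abstract reports for the whole self-attention module.

```python
import torch
import torch.nn as nn


class GroupedResidualLinear(nn.Module):
    """Hypothetical sketch of a grouped residual layer (GRL)-style projection.

    Replaces a dense dim x dim linear with a grouped projection plus an
    identity residual. Group count and residual form are illustrative
    assumptions, not taken from the GRFormer paper.
    """

    def __init__(self, dim: int, groups: int = 4):
        super().__init__()
        assert dim % groups == 0, "dim must be divisible by groups"
        # A grouped 1x1 convolution acts as per-group linear projections:
        # weight count is dim*dim/groups instead of dim*dim.
        self.proj = nn.Conv1d(dim, dim, kernel_size=1, groups=groups)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim) -> Conv1d expects (batch, dim, tokens)
        y = self.proj(x.transpose(1, 2)).transpose(1, 2)
        # The residual lets the layer start near identity, which is one
        # plausible way to recover the performance a smaller projection loses.
        return x + y


x = torch.randn(2, 64, 96)          # batch of 2, 64 window tokens, 96 channels
layer = GroupedResidualLinear(96, groups=4)
print(layer(x).shape)               # torch.Size([2, 64, 96])
```

In a window self-attention block, three such layers (or one shared layer) would stand in for the usual Q, K and V projections.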
Related papers
- Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution [10.074968164380314]
Implicit Neural Representation (INR) has been successfully employed for Arbitrary-scale Super-Resolution (ASR)
It is computationally expensive to query numerous times to render each pixel.
Recently, Gaussian Splatting (GS) has shown its advantages over INR in both visual quality and rendering speed in 3D tasks.
arXiv Detail & Related papers (2025-01-12T15:14:58Z) - Adaptive Principal Components Allocation with the $\ell_{2,g}$-regularized Gaussian Graphical Model for Efficient Fine-Tuning Large Models [7.6656660956453635]
We propose a novel Parameter-Efficient Fine-Tuning (PEFT) approach based on Gaussian Graphical Models (GGMs)
We demonstrate the effectiveness of the proposed approach, achieving competitive performance with significantly fewer trainable parameters.
arXiv Detail & Related papers (2024-12-11T18:11:21Z) - ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by Kronecker product to Aggregate Low Rank Experts.
Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z) - Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR)
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z) - Towards an Effective and Efficient Transformer for Rain-by-snow Weather Removal [23.224536745724077]
Rain-by-snow weather removal is a specialized task in weather-degraded image restoration aiming to eliminate coexisting rain streaks and snow particles.
We propose RSFormer, an efficient and effective Transformer that addresses this challenge.
RSFormer achieves the best trade-off between performance and time-consumption compared to other restoration methods.
arXiv Detail & Related papers (2023-04-06T04:39:23Z) - SRFormerV2: Taking a Closer Look at Permuted Self-Attention for Image Super-Resolution [74.48610723198514]
We present SRFormer, a simple but novel method that can enjoy the benefit of large window self-attention.
Our SRFormer achieves a 33.86dB PSNR score on the Urban100 dataset, which is 0.46dB higher than that of SwinIR.
Experiments show that our scaled model, named SRFormerV2, can further improve the results and achieves state-of-the-art.
arXiv Detail & Related papers (2023-03-17T02:38:44Z) - ClusTR: Exploring Efficient Self-attention via Clustering for Vision Transformers [70.76313507550684]
We propose a content-based sparse attention method, as an alternative to dense self-attention.
Specifically, we cluster and then aggregate key and value tokens, as a content-based method of reducing the total token count.
The resulting clustered-token sequence retains the semantic diversity of the original signal, but can be processed at a lower computational cost.
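The ClusTR summary above describes clustering and aggregating key and value tokens so that attention runs over far fewer tokens. The paper's actual method is not reproduced here; the sketch below illustrates the general cluster-then-attend pattern with plain k-means, where the cluster count, iteration count, and mean-pooling of values are all illustrative assumptions. Attention cost falls from O(n_q · n_k) to O(n_q · n_clusters).

```python
import numpy as np


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def clustered_attention(q, k, v, n_clusters=8, iters=5, seed=0):
    """Sketch of content-based sparse attention in the spirit of ClusTR.

    k-means the key tokens, mean-pool keys/values within each cluster,
    then attend the queries over only the n_clusters centroids.
    All hyperparameters here are illustrative, not from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = k.shape
    centroids = k[rng.choice(n, size=n_clusters, replace=False)]
    for _ in range(iters):
        # assign each key token to its nearest centroid
        dists = ((k[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        for c in range(n_clusters):
            mask = assign == c
            if mask.any():
                centroids[c] = k[mask].mean(axis=0)
    # aggregate value tokens with the same cluster assignment
    v_c = np.stack([v[assign == c].mean(axis=0) if (assign == c).any()
                    else np.zeros(d) for c in range(n_clusters)])
    # dense attention, but only over the cluster centroids
    attn = softmax(q @ centroids.T / np.sqrt(d))  # (n_q, n_clusters)
    return attn @ v_c                             # (n_q, d)


rng = np.random.default_rng(1)
q = rng.normal(size=(16, 32))
k = rng.normal(size=(64, 32))
v = rng.normal(size=(64, 32))
print(clustered_attention(q, k, v).shape)  # (16, 32)
```

The key design point is that clustering is content-based: nearby tokens in feature space share a centroid, so the reduced key/value set keeps the semantic diversity of the original sequence rather than discarding tokens by position.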
arXiv Detail & Related papers (2022-08-28T04:18:27Z) - LAPAR: Linearly-Assembled Pixel-Adaptive Regression Network for Single Image Super-Resolution and Beyond [75.37541439447314]
Single image super-resolution (SISR) deals with a fundamental problem of upsampling a low-resolution (LR) image to its high-resolution (HR) version.
This paper proposes a linearly-assembled pixel-adaptive regression network (LAPAR) to strike a sweet spot of deep model complexity and resulting SISR quality.
arXiv Detail & Related papers (2021-05-21T15:47:18Z) - Self Sparse Generative Adversarial Networks [73.590634413751]
Generative Adversarial Networks (GANs) are an unsupervised generative model that learns data distribution through adversarial training.
We propose a Self Sparse Generative Adversarial Network (Self-Sparse GAN) that reduces the parameter space and alleviates the zero gradient problem.
arXiv Detail & Related papers (2021-01-26T04:49:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.