GRFormer: Grouped Residual Self-Attention for Lightweight Single Image Super-Resolution
- URL: http://arxiv.org/abs/2408.07484v1
- Date: Wed, 14 Aug 2024 11:56:35 GMT
- Title: GRFormer: Grouped Residual Self-Attention for Lightweight Single Image Super-Resolution
- Authors: Yuzhen Li, Zehang Deng, Yuxin Cao, Lihua Liu
- Abstract summary: Grouped Residual Self-Attention (GRSA) redesigns two fundamental components of self-attention: the QKV linear layer and the relative position bias.
A compact Exponential-Space Relative Position Bias (ES-RPB) replaces the original relative position bias to better represent position information while reducing the parameter count.
Experiments demonstrate GRFormer outperforms state-of-the-art transformer-based methods for $\times$2, $\times$3 and $\times$4 SISR tasks.
- Score: 2.312414367096445
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous works have shown that reducing parameter overhead and computations for transformer-based single image super-resolution (SISR) models (e.g., SwinIR) usually leads to a reduction of performance. In this paper, we present GRFormer, an efficient and lightweight method, which not only reduces the parameter overhead and computations, but also greatly improves performance. The core of GRFormer is Grouped Residual Self-Attention (GRSA), which is specifically oriented towards two fundamental components. Firstly, it introduces a novel grouped residual layer (GRL) to replace the Query, Key, Value (QKV) linear layer in self-attention, aimed at efficiently reducing parameter overhead, computations, and performance loss at the same time. Secondly, it integrates a compact Exponential-Space Relative Position Bias (ES-RPB) as a substitute for the original relative position bias to improve the ability to represent position information while further minimizing the parameter count. Extensive experimental results demonstrate that GRFormer outperforms state-of-the-art transformer-based methods for $\times$2, $\times$3 and $\times$4 SISR tasks, notably outperforming SOTA by a maximum PSNR of 0.23dB when trained on the DIV2K dataset, while reducing the number of parameters and MACs by about \textbf{60\%} and \textbf{49\%} respectively in the self-attention module alone. We hope that our simple and effective method, which can easily be applied to SR models based on window-division self-attention, can serve as a useful tool for further research in image super-resolution. The code is available at \url{https://github.com/sisrformer/GRFormer}.
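To illustrate why grouping the QKV projection reduces parameters, here is a minimal NumPy sketch of a grouped linear layer with a residual connection. The function name, group count, and dimensions are illustrative assumptions; the paper's exact GRL design (and the reported ~60% savings) come from the full GRSA module, not this simplification.

```python
import numpy as np

def grouped_residual_linear(x, weights):
    """Apply one small weight matrix per channel group, then add a residual.

    x:       (tokens, C) input features
    weights: list of g matrices, each (C // g, C // g)
    """
    g = len(weights)
    chunks = np.split(x, g, axis=-1)  # g chunks of shape (tokens, C // g)
    out = np.concatenate([c @ w for c, w in zip(chunks, weights)], axis=-1)
    return out + x                    # residual connection

C, g = 64, 4
rng = np.random.default_rng(0)
weights = [rng.standard_normal((C // g, C // g)) for _ in range(g)]
x = rng.standard_normal((10, C))
y = grouped_residual_linear(x, weights)

full_params = C * C             # dense QKV-style projection: 4096 weights
grouped_params = g * (C // g) ** 2  # grouped projection: 1024 weights
```

With 4 groups, the projection here uses 75% fewer weights than a dense layer of the same width; the residual path compensates for the reduced expressiveness of the block-diagonal projection.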
Related papers
- All You Need is an Improving Column: Enhancing Column Generation for Parallel Machine Scheduling via Transformers [0.0]
We present a neural network-enhanced column generation (CG) approach for a parallel machine scheduling problem.
By training the neural network offline and using it in inference mode to predict negative reduced costs columns, we achieve significant computational time savings.
For large-sized instances, the proposed approach achieves an 80% improvement in the objective value in under 500 seconds.
arXiv Detail & Related papers (2024-10-21T02:53:37Z) - Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR)
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z) - In defense of parameter sharing for model-compression [38.80110838121722]
Randomized parameter-sharing (RPS) methods have gained traction for model compression at the start of training.
RPS consistently outperforms or matches smaller models and all moderately informed pruning strategies.
This paper argues in favor of a paradigm shift towards RPS-based models.
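For intuition, randomized parameter sharing can be sketched as a fixed random mapping from a small trainable pool to a larger virtual weight matrix (in the spirit of HashedNets); this is a generic illustration, not necessarily the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(42)

pool = rng.standard_normal(256)                       # 256 trainable parameters
mapping = rng.integers(0, pool.size, size=(64, 64))   # fixed random index map

W = pool[mapping]   # virtual 64x64 weight matrix: 4096 entries, <=256 unique values
x = rng.standard_normal(64)
y = W @ x           # used like an ordinary dense layer

# During training, gradients w.r.t. the pool accumulate over all positions
# sharing an index, so only the 256 underlying values are ever updated.
```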
arXiv Detail & Related papers (2023-10-17T22:08:01Z) - Towards an Effective and Efficient Transformer for Rain-by-snow Weather Removal [23.224536745724077]
Rain-by-snow weather removal is a specialized task in weather-degraded image restoration aiming to eliminate coexisting rain streaks and snow particles.
We propose RSFormer, an efficient and effective Transformer that addresses this challenge.
RSFormer achieves the best trade-off between performance and time-consumption compared to other restoration methods.
arXiv Detail & Related papers (2023-04-06T04:39:23Z) - SRFormerV2: Taking a Closer Look at Permuted Self-Attention for Image Super-Resolution [74.48610723198514]
We present SRFormer, a simple but novel method that can enjoy the benefit of large window self-attention.
Our SRFormer achieves a 33.86dB PSNR score on the Urban100 dataset, which is 0.46dB higher than that of SwinIR.
Experiments show that our scaled model, named SRFormerV2, can further improve the results and achieves state-of-the-art.
arXiv Detail & Related papers (2023-03-17T02:38:44Z) - ClusTR: Exploring Efficient Self-attention via Clustering for Vision Transformers [70.76313507550684]
We propose a content-based sparse attention method, as an alternative to dense self-attention.
Specifically, we cluster and then aggregate key and value tokens, as a content-based method of reducing the total token count.
The resulting clustered-token sequence retains the semantic diversity of the original signal, but can be processed at a lower computational cost.
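A minimal sketch of content-based token clustering for attention, assuming a plain k-means over keys with mean-aggregated values (the paper's actual clustering and aggregation may differ):

```python
import numpy as np

def clustered_attention(q, k, v, n_clusters, iters=5, seed=0):
    """Attend from queries to k-means centroids of (key, value) tokens."""
    rng = np.random.default_rng(seed)
    centers = k[rng.choice(len(k), n_clusters, replace=False)]
    for _ in range(iters):                              # plain k-means on keys
        d = ((k[:, None, :] - centers[None]) ** 2).sum(-1)
        assign = d.argmin(1)
        for c in range(n_clusters):
            members = k[assign == c]
            if len(members):
                centers[c] = members.mean(0)
    # aggregate values per cluster to match the clustered keys
    vc = np.stack([v[assign == c].mean(0) if (assign == c).any()
                   else np.zeros(v.shape[1]) for c in range(n_clusters)])
    scores = q @ centers.T / np.sqrt(q.shape[-1])       # (queries, clusters)
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)                       # softmax over clusters
    return w @ vc

rng = np.random.default_rng(1)
q = rng.standard_normal((8, 16))
k = rng.standard_normal((64, 16))
v = rng.standard_normal((64, 16))
out = clustered_attention(q, k, v, n_clusters=4)
```

The attention matrix shrinks from queries x tokens to queries x clusters, which is where the computational saving comes from.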
arXiv Detail & Related papers (2022-08-28T04:18:27Z) - Pruning Self-attentions into Convolutional Layers in Single Path [89.55361659622305]
Vision Transformers (ViTs) have achieved impressive performance over various computer vision tasks.
We propose Single-Path Vision Transformer pruning (SPViT) to efficiently and automatically compress the pre-trained ViTs.
Our SPViT can trim 52.0% FLOPs for DeiT-B and get an impressive 0.6% top-1 accuracy gain simultaneously.
arXiv Detail & Related papers (2021-11-23T11:35:54Z) - LAPAR: Linearly-Assembled Pixel-Adaptive Regression Network for Single Image Super-Resolution and Beyond [75.37541439447314]
Single image super-resolution (SISR) deals with a fundamental problem of upsampling a low-resolution (LR) image to its high-resolution (HR) version.
This paper proposes a linearly-assembled pixel-adaptive regression network (LAPAR) to strike a sweet spot of deep model complexity and resulting SISR quality.
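The assembly step of such a pixel-adaptive regression can be sketched as mixing a stack of pre-filtered images with per-pixel coefficients; in LAPAR the coefficients are predicted by a network, which this stand-in replaces with a uniform mix.

```python
import numpy as np

def lapar_assemble(filtered_stack, coeffs):
    """Linearly assemble per-pixel filter responses.

    filtered_stack: (K, H, W) upsampled image filtered by K fixed filter bases
    coeffs:         (K, H, W) per-pixel mixing coefficients (network-predicted)
    """
    return (coeffs * filtered_stack).sum(axis=0)  # (H, W) SR output

K, H, W = 3, 4, 4
rng = np.random.default_rng(0)
stack = rng.standard_normal((K, H, W))
coeffs = np.full((K, H, W), 1.0 / K)   # uniform mixing as a stand-in
sr = lapar_assemble(stack, coeffs)
```

Because the heavy lifting is a linear combination of fixed filter outputs, the learned part stays small, which is how the method trades model complexity against SISR quality.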
arXiv Detail & Related papers (2021-05-21T15:47:18Z) - Self Sparse Generative Adversarial Networks [73.590634413751]
Generative Adversarial Networks (GANs) are an unsupervised generative model that learns data distribution through adversarial training.
We propose a Self Sparse Generative Adversarial Network (Self-Sparse GAN) that reduces the parameter space and alleviates the zero gradient problem.
arXiv Detail & Related papers (2021-01-26T04:49:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.