Self-Calibrated Efficient Transformer for Lightweight Super-Resolution
- URL: http://arxiv.org/abs/2204.08913v1
- Date: Tue, 19 Apr 2022 14:20:32 GMT
- Title: Self-Calibrated Efficient Transformer for Lightweight Super-Resolution
- Authors: Wenbin Zou, Tian Ye, Weixin Zheng, Yunchen Zhang, Liang Chen and Yi Wu
- Abstract summary: We present a lightweight Self-Calibrated Efficient Transformer (SCET) network to solve this problem.
The architecture of SCET mainly consists of the self-calibrated module and efficient transformer block.
We provide comprehensive results on different settings of the overall network.
- Score: 21.63691922827879
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, deep learning has been successfully applied to the single-image
super-resolution (SISR) with remarkable performance. However, most existing
methods focus on building a more complex network with a large number of layers,
which entail heavy computational costs and memory usage. To address this
problem, we present a lightweight Self-Calibrated Efficient Transformer (SCET)
network. The architecture of SCET mainly consists of the
self-calibrated module and efficient transformer block, where the
self-calibrated module adopts the pixel attention mechanism to extract image
features effectively. To further exploit the contextual information from
features, we employ an efficient transformer to help the network obtain similar
features over long distances and thus recover sufficient texture details. We
provide comprehensive results on different settings of the overall network. Our
proposed method achieves more remarkable performance than baseline methods. The
source code and pre-trained models are available at
https://github.com/AlexZou14/SCET.
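The pixel attention mechanism mentioned in the abstract can be sketched as a per-pixel, per-channel gate: a 1x1 convolution followed by a sigmoid produces an attention map that reweights the input features element-wise. The shapes and weights below are illustrative assumptions, not SCET's actual implementation.

```python
import numpy as np

def pixel_attention(features, weight, bias):
    """Pixel attention sketch (hypothetical shapes, not SCET's exact module).

    features: (C, H, W) feature map
    weight:   (C, C) kernel of a 1x1 convolution over channels
    bias:     (C,)
    """
    C, H, W = features.shape
    # A 1x1 convolution is a matrix multiply over the channel dimension.
    flat = features.reshape(C, H * W)          # (C, H*W)
    logits = weight @ flat + bias[:, None]     # (C, H*W)
    attn = 1.0 / (1.0 + np.exp(-logits))       # sigmoid gate in (0, 1)
    return (flat * attn).reshape(C, H, W)      # element-wise reweighting

# Toy usage: 4 channels on an 8x8 spatial grid.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
w = rng.standard_normal((4, 4)) * 0.1
b = np.zeros(4)
y = pixel_attention(x, w, b)
print(y.shape)  # (4, 8, 8)
```

Because the gate lies strictly in (0, 1), the module can only attenuate features, never amplify them, which keeps the reweighting stable.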
Related papers
- LKFormer: Large Kernel Transformer for Infrared Image Super-Resolution [5.478440050117844]
We propose a potent Transformer model, termed Large Kernel Transformer (LKFormer), for infrared image super-resolution.
It mainly employs depth-wise convolution with large kernels to perform non-local feature modeling.
We have devised a novel feed-forward network structure called Gated-Pixel Feed-Forward Network (GPFN) to augment the LKFormer's capacity to manage the information flow within the network.
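The depth-wise large-kernel convolution at the core of this design can be sketched as follows: each channel is filtered by its own kernel, with no mixing across channels, so a large kernel enlarges the receptive field at low cost. The shapes and the naive loop implementation below are illustrative assumptions, not LKFormer's actual code.

```python
import numpy as np

def depthwise_conv(x, kernels):
    """Depth-wise convolution sketch (naive, for illustration only).

    x:       (C, H, W) feature map
    kernels: (C, K, K), one KxK kernel per channel
    """
    C, H, W = x.shape
    K = kernels.shape[1]
    pad = K // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))  # same-size output
    out = np.zeros_like(x)
    for c in range(C):  # each channel is handled independently
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[c, i:i + K, j:j + K] * kernels[c])
    return out

# Sanity check: a centered delta kernel reproduces the input.
x = np.ones((2, 5, 5))
k = np.zeros((2, 7, 7))
k[:, 3, 3] = 1.0
print(np.allclose(depthwise_conv(x, k), x))  # True
```

A depth-wise KxK layer costs C*K*K parameters instead of C*C*K*K for a full convolution, which is why large kernels become affordable in lightweight models.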
arXiv Detail & Related papers (2024-01-22T11:28:24Z)
- Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR)
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution [90.16462805389943]
We develop a spatially-adaptive feature modulation (SAFM) mechanism upon a vision transformer (ViT)-like block.
The proposed method is $3\times$ smaller than state-of-the-art efficient SR methods.
arXiv Detail & Related papers (2023-02-27T14:19:31Z)
- DLGSANet: Lightweight Dynamic Local and Global Self-Attention Networks for Image Super-Resolution [83.47467223117361]
We propose an effective lightweight dynamic local and global self-attention network (DLGSANet) to solve image super-resolution.
Motivated by the network designs of Transformers, we develop a simple yet effective multi-head dynamic local self-attention (MHDLSA) module to extract local features efficiently.
To also model global relations, we develop a sparse global self-attention (SparseGSA) module to select the most useful similarity values.
arXiv Detail & Related papers (2023-01-05T12:06:47Z)
- Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
However, the need for Transformers to incorporate contextual information when extracting features dynamically is often neglected.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z)
- ShuffleMixer: An Efficient ConvNet for Image Super-Resolution [88.86376017828773]
We propose ShuffleMixer, for lightweight image super-resolution that explores large convolution and channel split-shuffle operation.
Specifically, we develop a large depth-wise convolution and two projection layers based on channel splitting and shuffling as the basic component to mix features efficiently.
Experimental results demonstrate that the proposed ShuffleMixer is about 6x smaller than the state-of-the-art methods in terms of model parameters and FLOPs.
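The channel split-shuffle operation that ShuffleMixer builds on can be sketched as follows: channels are split into groups and then interleaved, so that information from different splits mixes in the next layer. The group count and shapes below are illustrative, not the paper's exact configuration.

```python
import numpy as np

def channel_split_shuffle(x, groups=2):
    """Channel split-shuffle sketch (illustrative, not ShuffleMixer's code).

    x: (C, H, W) feature map with C divisible by `groups`.
    """
    C, H, W = x.shape
    assert C % groups == 0
    # Reshape to (groups, C//groups, H, W), swap the two channel axes,
    # then flatten back: channels [0..C) come out interleaved across groups.
    return x.reshape(groups, C // groups, H, W).transpose(1, 0, 2, 3).reshape(C, H, W)

# Tag each channel with its index to make the permutation visible.
x = np.arange(6)[:, None, None] * np.ones((6, 1, 1))
y = channel_split_shuffle(x, groups=2)
print(y[:, 0, 0])  # [0. 3. 1. 4. 2. 5.]
```

The operation is a fixed, parameter-free permutation, so it adds no FLOPs or weights while letting grouped layers exchange information.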
arXiv Detail & Related papers (2022-05-30T15:26:52Z)
- Lightweight Bimodal Network for Single-Image Super-Resolution via Symmetric CNN and Recursive Transformer [27.51790638626891]
Single-image super-resolution (SISR) has achieved significant breakthroughs with the development of deep learning, but heavy models hinder practical deployment.
To balance performance and efficiency, we propose a Lightweight Bimodal Network (LBNet) for SISR.
Specifically, an effective Symmetric CNN is designed for local feature extraction and coarse image reconstruction.
arXiv Detail & Related papers (2022-04-28T04:43:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.